r - Tidy method to split multiple columns using tidyr::separate

Reposted. Author: 行者123. Updated: 2023-12-04 10:12:40

I have a data frame like this:

df <- structure(list(A = c("3 of 5", "1 of 2", "1 of 3", "1 of 3", 
"3 of 4", "2 of 7"), B = c("2 of 2", "2 of 4", "0 of 1", "0 of 0",
"0 of 0", "0 of 0"), C = c("10 of 21", "3 of 14", "11 of 34",
"10 of 35", "16 of 53", "17 of 62"), D = c("0 of 0", "0 of 0",
"0 of 0", "0 of 0", "0 of 0", "0 of 0"), E = c("8 of 16", "3 of 15",
"10 of 32", "6 of 28", "13 of 49", "9 of 48")), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -6L))

df

|A |B |C |D |E |
|:------|:------|:--------|:------|:--------|
|3 of 5 |2 of 2 |10 of 21 |0 of 0 |8 of 16 |
|1 of 2 |2 of 4 |3 of 14 |0 of 0 |3 of 15 |
|1 of 3 |0 of 1 |11 of 34 |0 of 0 |10 of 32 |
|1 of 3 |0 of 0 |10 of 35 |0 of 0 |6 of 28 |
|3 of 4 |0 of 0 |16 of 53 |0 of 0 |13 of 49 |
|2 of 7 |0 of 0 |17 of 62 |0 of 0 |9 of 48 |


I would like to split each column into two, leaving me with something like this:

|A_attempted |A_landed |B_attempted |B_landed |C_attempted |C_landed |D_attempted |D_landed |E_attempted |E_landed |
|:-----------|:--------|:-----------|:--------|:-----------|:--------|:-----------|:--------|:-----------|:--------|
|3 |5 |2 |2 |10 |21 |0 |0 |8 |16 |
|1 |2 |2 |4 |3 |14 |0 |0 |3 |15 |
|1 |3 |0 |1 |11 |34 |0 |0 |10 |32 |
|1 |3 |0 |0 |10 |35 |0 |0 |6 |28 |
|3 |4 |0 |0 |16 |53 |0 |0 |13 |49 |
|2 |7 |0 |0 |17 |62 |0 |0 |9 |48 |


The method I am using so far is:

df %>%
  separate(A, sep = " of ", remove = TRUE, into = c("A_attempted", "A_landed")) %>%
  separate(B, sep = " of ", remove = TRUE, into = c("B_attempted", "B_landed")) %>%
  separate(C, sep = " of ", remove = TRUE, into = c("C_attempted", "C_landed")) %>%
  separate(D, sep = " of ", remove = TRUE, into = c("D_attempted", "D_landed")) %>%
  separate(E, sep = " of ", remove = TRUE, into = c("E_attempted", "E_landed"))


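This is not great considering I have 15 variables. I would prefer a solution using map.

There is an answer here: Apply tidyr::separate over multiple columns, but it uses deprecated functions.

Best answer

You can try: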

library(tidyverse)

names(df) %>%
  map(function(x) {
    df %>%
      # all_of() makes it explicit that x holds a column name
      select(all_of(x)) %>%
      separate(x,
               into = paste0(x, c("_attempted", "_landed")),
               sep = " of ")
  }) %>%
  bind_cols()


Output:

# A tibble: 6 x 10
  A_attempted A_landed B_attempted B_landed C_attempted C_landed D_attempted D_landed E_attempted E_landed
  <chr>       <chr>    <chr>       <chr>    <chr>       <chr>    <chr>       <chr>    <chr>       <chr>
1 3           5        2           2        10          21       0           0        8           16
2 1           2        2           4        3           14       0           0        3           15
3 1           3        0           1        11          34       0           0        10          32
4 1           3        0           0        10          35       0           0        6           28
5 3           4        0           0        16          53       0           0        13          49
6 2           7        0           0        17          62       0           0        9           48
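
Note that every column in the output above is character (`<chr>`). If integers are wanted, one option is the `convert = TRUE` argument of `separate()`, which runs `type.convert()` on the new columns. The following is only a minimal sketch: `df_small` and `converted` are hypothetical names, and the two-column tibble stands in for the data frame from the question.

```r
library(dplyr)
library(purrr)
library(tidyr)

# Hypothetical two-column stand-in for the question's data frame
df_small <- tibble(
  A = c("3 of 5", "1 of 2"),
  B = c("2 of 2", "0 of 0")
)

converted <- names(df_small) %>%
  map(function(x) {
    df_small %>%
      select(all_of(x)) %>%
      separate(
        x,
        into = paste0(x, c("_attempted", "_landed")),
        sep = " of ",
        convert = TRUE  # run type.convert() on the new columns
      )
  }) %>%
  bind_cols()

# Inspect the resulting column types
str(converted)
```

With `convert = TRUE` the split pieces should come back as integer columns rather than character, so they can be used in arithmetic without a further conversion step.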


As the OP suggested, we can indeed avoid the last step by using map_dfc:

names(df) %>%
  map_dfc(~ df %>%
            select(all_of(.x)) %>%
            separate(.x,
                     into = paste0(.x, c("_attempted", "_landed")),
                     sep = " of "))
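
In more recent package versions, `map_dfc()` is superseded (purrr 1.0.0) and so is `separate()` (tidyr 1.3.0). Assuming tidyr 1.3.0 or later is installed, `separate_wider_delim()` can split several columns in a single call, with no mapping step. This is a sketch of that alternative rather than part of the original answer; `df_small` and `wide` are hypothetical names, and the tibble is a two-column stand-in for the question's data.

```r
library(tidyr)

# Hypothetical two-column stand-in for the question's data frame
df_small <- tibble(
  A = c("3 of 5", "1 of 2"),
  B = c("2 of 2", "0 of 0")
)

wide <- separate_wider_delim(
  df_small,
  cols = everything(),              # split every column
  delim = " of ",
  names = c("attempted", "landed"), # suffixes for the two pieces
  names_sep = "_"                   # output names: <column>_<suffix>
)

print(wide)
```

Because `names_sep` prefixes each new name with its source column, the result has columns `A_attempted`, `A_landed`, `B_attempted`, `B_landed`. The pieces are still character here, so a conversion step is needed if numbers are required.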

Regarding r - Tidy method to split multiple columns using tidyr::separate, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55277610/
