r - tidyr: multiple unnesting with varying NA counts

Reposted. Author: 行者123. Updated: 2023-12-01 04:56:29

I'm puzzled by some tidyr behavior. I can unnest a single response column like this:

library(tidyr)

resp1 <- c("A", "B; A", "B", NA, "B")
resp2 <- c("C; D; F", NA, "C; F", "D", "E")
resp3 <- c(NA, NA, "G; H; I", "H; I", "I")
data <- data.frame(resp1, resp2, resp3, stringsAsFactors = FALSE)

tidy <- data %>%
  transform(resp1 = strsplit(resp1, "; ")) %>%
  unnest()

# Source: local data frame [6 x 3]
#
# resp2 resp3 resp1
# (chr) (chr) (chr)
# 1 C; D; F NA A
# 2 NA NA B
# 3 NA NA A
# 4 C; F G; H; I B
# 5 D H; I NA
# 6 E I B

But I need to unnest multiple columns in my dataset, and those columns have different numbers of NAs. I tried this, and it throws an error:

data %>%
  transform(resp1 = strsplit(resp1, "; "),
            resp2 = strsplit(resp2, "; "),
            resp3 = strsplit(resp3, "; ")) %>%
  unnest()
# Error: All nested columns must have the same number of elements.

I expected the code above to give me the same output as the following:

# unnesting multiple response (desired output / is there a better way?)
data %>%
  transform(resp1 = strsplit(resp1, "; ")) %>%
  unnest() %>%
  transform(resp2 = strsplit(resp2, "; ")) %>%
  unnest() %>%
  transform(resp3 = strsplit(resp3, "; ")) %>%
  unnest()

# resp1 resp2 resp3
# (chr) (chr) (chr)
# 1 A C NA
# 2 A D NA
# 3 A F NA
# 4 B NA NA
# 5 A NA NA
# 6 B C G
# 7 B C H
# 8 B C I
# 9 B F G
# 10 B F H
# 11 B F I
# 12 NA D H
# 13 NA D I
# 14 B E I

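I'm new to R, but this feels clumsy and makes me wonder whether I'm misusing something I shouldn't be. Why does the attempt to unnest several columns at once fail?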

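Best answer

Check this link, which shows different cases of unnesting multiple columns. Judging from the documentation and that link, unless there is some clever way to do it, the function is probably only well defined for a single column at a time, to avoid ambiguity about how rows of different lengths should be combined.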
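To make the ambiguity concrete, here is a minimal base R sketch (not part of the original answer) that counts how many elements each cell holds after splitting. It uses only the data from the question; the names counts and same_length are introduced here for illustration.

```r
# Same data as in the question
resp1 <- c("A", "B; A", "B", NA, "B")
resp2 <- c("C; D; F", NA, "C; F", "D", "E")
resp3 <- c(NA, NA, "G; H; I", "H; I", "I")
data <- data.frame(resp1, resp2, resp3, stringsAsFactors = FALSE)

# Number of elements per cell after splitting on "; "
# (an NA cell splits into a single NA, so it counts as 1)
counts <- sapply(data, function(col) lengths(strsplit(col, "; ")))
counts

# Rows in which all three columns hold the same number of elements
same_length <- apply(counts, 1, function(r) length(unique(r)) == 1)
same_length
```

Only the last row has matching counts in all three columns, so a single unnest over all of them has no unambiguous way to line the elements up.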

So you will probably have to unnest the columns one by one. The code below may still be tedious, but it is somewhat simpler.

> resp1 <- c("A", "B; A", "B", NA, "B")
> resp2 <- c("C; D; F", NA, "C; F", "D", "E")
> resp3 <- c(NA, NA, "G; H; I", "H; I", "I")
> data <- data.frame(resp1, resp2, resp3, stringsAsFactors = FALSE)
> data
resp1 resp2 resp3
1 A C; D; F <NA>
2 B; A <NA> <NA>
3 B C; F G; H; I
4 <NA> D H; I
5 B E I
> library(tidyr)
> library(dplyr)
> data %>%
+   transform(resp1 = strsplit(resp1, "; "),
+             resp2 = strsplit(resp2, "; "),
+             resp3 = strsplit(resp3, "; ")) %>%
+   unnest(resp1) %>% unnest(resp2) %>% unnest(resp3)
resp1 resp2 resp3
1 A C <NA>
2 A D <NA>
3 A F <NA>
4 B <NA> <NA>
5 A <NA> <NA>
6 B C G
7 B C H
8 B C I
9 B F G
10 B F H
11 B F I
12 <NA> D H
13 <NA> D I
14 B E I
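
As an alternative sketch that is not part of the original answer: tidyr also provides separate_rows(), which splits a delimited column and expands the rows in one step. Assuming an installed tidyr version that includes separate_rows() and the same data as above, calling it once per column should give the same 14 rows. Passing all three columns in one call would be expected to hit the same length mismatch.

```r
library(tidyr)

# Same data as in the question
resp1 <- c("A", "B; A", "B", NA, "B")
resp2 <- c("C; D; F", NA, "C; F", "D", "E")
resp3 <- c(NA, NA, "G; H; I", "H; I", "I")
data <- data.frame(resp1, resp2, resp3, stringsAsFactors = FALSE)

# Split and expand one column at a time; NA cells stay as single NA rows
tidy <- data %>%
  separate_rows(resp1, sep = "; ") %>%
  separate_rows(resp2, sep = "; ") %>%
  separate_rows(resp3, sep = "; ")

nrow(tidy)
```

This avoids the intermediate list columns created by transform() and strsplit(), at the cost of still naming each column separately.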

Regarding r - tidyr: multiple unnesting with varying NA counts, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/36816426/
