
r - Assign a value to a group based on a subgroup

Reposted. Author: 行者123. Updated: 2023-12-02 02:16:19

In R, I have a df that looks a bit like this:

structure(
list(
`Family ID` = c("1", "1", "1", "2", "2", "2","3", "3", "3", "3", "4", "4", "4", "4"),
`Subject ID` = c("1","2", "4", "1", "2", "4", "1", "2", "4", "5", "1", "2", "4", "5"),
X = c("1", "2", "1", "1", "2", "2", "2", "1", "2", "1", "1","2", "2", "2"),
Y = c("1", "2", "2", "1", "2", "2", "1", "1","2", "2", "2", "1", "2", "2")
), row.names = 2:15, class = "data.frame"
)

#>    Family ID Subject ID X Y
#> 2          1          1 1 1
#> 3          1          2 2 2
#> 4          1          4 1 2
#> 5          2          1 1 1
#> 6          2          2 2 2
#> 7          2          4 2 2
#> 8          3          1 2 1
#> 9          3          2 1 1
#> 10         3          4 2 2
#> 11         3          5 1 2
#> 12         4          1 1 2
#> 13         4          2 2 1
#> 14         4          4 2 2
#> 15         4          5 2 2

Created on 2021-04-15 by the reprex package (v0.3.0)

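My goal is to create a new column that contains the value 1 for everyone with the same Family ID if, and only if, Subject ID 4 or 5 has the value 1 in column X or column Y. The result for this example would therefore look like this: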

#>    Family ID Subject ID X Y Z
#> 2          1          1 1 1 1
#> 3          1          2 2 2 1
#> 4          1          4 1 2 1
#> 5          2          1 1 1 0
#> 6          2          2 2 2 0
#> 7          2          4 2 2 0
#> 8          3          1 2 1 1
#> 9          3          2 1 1 1
#> 10         3          4 2 2 1
#> 11         3          5 1 2 1
#> 12         4          1 1 2 0
#> 13         4          2 2 1 0
#> 14         4          4 2 2 0
#> 15         4          5 2 2 0

Created on 2021-04-15 by the reprex package (v0.3.0)

Any help is appreciated. Apologies in advance, as I am new to this.

Best answer

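After grouping by 'FamilyID', subset the 'X' and 'Y' columns to the rows where SubjectID is 4 or 5, check whether any of those values equals 1, and join the two logical expressions with the OR (|) operator.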

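library(dplyr)

df1 %>%
  group_by(FamilyID) %>%
  # within each family, look only at subjects 4 and 5;
  # the unary + converts the logical result to 0/1
  mutate(Z = +(any(X[SubjectID %in% 4:5] == 1) |
               any(Y[SubjectID %in% 4:5] == 1))) %>%
  ungroup()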

-output

# A tibble: 13 x 5
#    FamilyID SubjectID     X     Y     Z
#       <int>     <int> <int> <int> <int>
#  1        1         1     1     1     1
#  2        1         2     2     2     1
#  3        1         4     1     2     1
#  4        2         1     1     1     0
#  5        2         2     2     2     0
#  6        3         1     2     1     1
#  7        3         2     1     1     1
#  8        3         4     2     2     1
#  9        3         5     1     2     1
# 10        4         1     2     2     0
# 11        4         2     2     2     0
# 12        4         4     2     2     0
# 13        4         5     2     2     0
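
The answer works on `df1`, whose columns are integers with names that contain no spaces. The following is a minimal sketch of how the same idea might be applied to the data frame exactly as posted in the question, where the column names contain spaces and every value is a character string. The names `df` and `res` are assumptions made for this illustration and do not appear in the original post. Names with spaces are wrapped in backticks, and the comparisons use strings.

```r
library(dplyr)

# The data frame from the question: character columns, names containing spaces.
# `df` is a name chosen for this sketch.
df <- structure(
  list(
    `Family ID`  = c("1", "1", "1", "2", "2", "2", "3", "3", "3", "3", "4", "4", "4", "4"),
    `Subject ID` = c("1", "2", "4", "1", "2", "4", "1", "2", "4", "5", "1", "2", "4", "5"),
    X = c("1", "2", "1", "1", "2", "2", "2", "1", "2", "1", "1", "2", "2", "2"),
    Y = c("1", "2", "2", "1", "2", "2", "1", "1", "2", "2", "2", "1", "2", "2")
  ), row.names = 2:15, class = "data.frame"
)

res <- df %>%
  group_by(`Family ID`) %>%
  # TRUE for the whole family if any row of subject 4 or 5 has "1" in X or Y;
  # any() over a family without such rows is FALSE, so that family gets 0
  mutate(Z = as.integer(any(`Subject ID` %in% c("4", "5") & (X == "1" | Y == "1")))) %>%
  ungroup()

res$Z
#> [1] 1 1 1 0 0 0 1 1 1 1 0 0 0 0
```

The result matches the Z column the question asks for. `as.integer()` is used here in place of the unary `+` purely for readability; both turn TRUE/FALSE into 1/0.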

Or using base R

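# drop = FALSE keeps the subset a matrix even if only one row matches,
# so rowSums() still works
df1$Z <- with(df1, +(FamilyID %in% FamilyID[SubjectID %in% 4:5][
  rowSums(cbind(X, Y)[SubjectID %in% 4:5, , drop = FALSE] == 1) > 0]))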
df1$Z
#[1] 1 1 1 0 0 1 1 1 1 0 0 0 0
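
The nested indexing above can be hard to read. As an alternative, here is a sketch that splits the base R approach into two steps using `ave()`. This is not the answer author's code, and the intermediate name `hit` is introduced only for this illustration. The first step flags the individual rows that satisfy the condition; the second takes the maximum of that flag within each family, which copies a 1 to every member of a family that has at least one flagged row.

```r
# Same data as the answer's df1
df1 <- structure(list(
  FamilyID  = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L),
  SubjectID = c(1L, 2L, 4L, 1L, 2L, 1L, 2L, 4L, 5L, 1L, 2L, 4L, 5L),
  X = c(1L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L),
  Y = c(1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L)
), class = "data.frame", row.names = c(NA, -13L))

# Step 1: row-level flag, TRUE where subject 4 or 5 has a 1 in X or Y
hit <- with(df1, SubjectID %in% 4:5 & (X == 1 | Y == 1))

# Step 2: per-family maximum of the 0/1 flag, returned for every row
df1$Z <- ave(as.integer(hit), df1$FamilyID, FUN = max)

df1$Z
#[1] 1 1 1 0 0 1 1 1 1 0 0 0 0
```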

data

df1 <- structure(list(FamilyID = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L, 
4L, 4L, 4L, 4L), SubjectID = c(1L, 2L, 4L, 1L, 2L, 1L, 2L, 4L,
5L, 1L, 2L, 4L, 5L), X = c(1L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 1L,
2L, 2L, 2L, 2L), Y = c(1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L)), class = "data.frame", row.names = c(NA, -13L))

Regarding "r - Assign a value to a group based on a subgroup", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/67114018/
