
r - Filter and extract rows based on multiple conditions

Reposted. Author: 行者123. Updated: 2023-12-01 22:55:17

I have a large time-series dataset of patients with different diagnoses. A snapshot of the dataset looks like this:

library(dplyr)  # provides tibble() and the %>% pipe used below

time <- rep(1:3, times = 5)
ID <- c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5)
Dx1 <- c("CBS", "CBS", "CBS", "OtherDx", "OtherDx", "OtherDx", "ACC", "ACC", "ACC", "OtherDx", "OtherDx", "CBS", "OtherDx", "OtherDx", "OtherDx")
Dx2 <- c("OtherDx", "OtherDx", "OtherDx", "OtherDx", "OtherDx", "OtherDx", "CBS", "CBS", "CBS", "OtherDx", "OtherDx", "OtherDx", "OtherDx", "OtherDx", "OtherDx")
# Columns are listed in the same order as the printed output below
df <- tibble(ID, time, Dx1, Dx2)

# A tibble: 15 x 4
      ID  time Dx1     Dx2
   <dbl> <int> <chr>   <chr>
 1     1     1 CBS     OtherDx
 2     1     2 CBS     OtherDx
 3     1     3 CBS     OtherDx
 4     2     1 OtherDx OtherDx
 5     2     2 OtherDx OtherDx
 6     2     3 OtherDx OtherDx
 7     3     1 ACC     CBS
 8     3     2 ACC     CBS
 9     3     3 ACC     CBS
10     4     1 OtherDx OtherDx
11     4     2 OtherDx OtherDx
12     4     3 CBS     OtherDx
13     5     1 OtherDx OtherDx
14     5     2 OtherDx OtherDx
15     5     3 OtherDx OtherDx

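Here, I want to filter and keep only those IDs that have "OtherDx" in both Dx1 and Dx2 at all three time points. In this snapshot, that means keeping only IDs 2 and 5 (not ID 4, because it has a non-"OtherDx" value at time 3).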

My dplyr code is:

df2 <- df %>%
  group_by(ID, time) %>%
  filter(
    time == c(1:3) & Dx1 == "OtherDx" & Dx2 == "OtherDx"
  )

But my code does not seem to do the job: it still includes ID 4. What is the best way to extract this data?

Best answer

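You can use if_all(). The condition if_all(Dx1:Dx2, `==`, "OtherDx") is equivalent to Dx1 == "OtherDx" & Dx2 == "OtherDx", and it is more concise when there are more Dx columns to check. Grouping by ID alone and wrapping the condition in all() keeps a patient only if every one of that patient's rows passes.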

library(dplyr)

df %>%
  group_by(ID) %>%
  filter(all(if_all(Dx1:Dx2, `==`, "OtherDx"))) %>%
  ungroup()

# A tibble: 6 × 4
     ID  time Dx1     Dx2
  <dbl> <int> <chr>   <chr>
1     2     1 OtherDx OtherDx
2     2     2 OtherDx OtherDx
3     2     3 OtherDx OtherDx
4     5     1 OtherDx OtherDx
5     5     2 OtherDx OtherDx
6     5     3 OtherDx OtherDx
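
To see why ID 4 slipped through in the question's attempt, it may help to compare the row-level check with the per-patient check. The sketch below is not part of the original answer. It rebuilds the question's data, counts how many rows per ID pass, and then repeats the filter with a lambda instead of passing "OtherDx" through `...` (recent dplyr releases discourage extra arguments via `...` in across(), and as far as I know the same applies to if_all()). The names `row_ok`, `per_id`, and `res` are illustrative only, and the `n_distinct(time) == 3` guard is an assumption about what "all three time observations" should mean when a patient has missing visits.

```r
library(dplyr)

# Same data as in the question
df <- tibble(
  ID   = rep(c(1, 2, 3, 4, 5), each = 3),
  time = rep(1:3, times = 5),
  Dx1  = c("CBS", "CBS", "CBS", "OtherDx", "OtherDx", "OtherDx",
           "ACC", "ACC", "ACC", "OtherDx", "OtherDx", "CBS",
           "OtherDx", "OtherDx", "OtherDx"),
  Dx2  = c("OtherDx", "OtherDx", "OtherDx", "OtherDx", "OtherDx", "OtherDx",
           "CBS", "CBS", "CBS", "OtherDx", "OtherDx", "OtherDx",
           "OtherDx", "OtherDx", "OtherDx")
)

# Row-level flag: TRUE when both diagnosis columns are "OtherDx"
row_ok <- df %>%
  mutate(both_other = Dx1 == "OtherDx" & Dx2 == "OtherDx")

# Per-patient summary: a row-wise filter keeps every row counted in n_ok,
# whereas the per-patient rule keeps an ID only when keep is TRUE
per_id <- row_ok %>%
  group_by(ID) %>%
  summarise(
    n_ok   = sum(both_other),
    n_rows = n(),
    keep   = all(both_other)
  )
print(per_id)

# Lambda form of the answer's filter, plus an optional guard (assumption)
# requiring that the patient actually has three distinct time points
res <- df %>%
  group_by(ID) %>%
  filter(
    n_distinct(time) == 3,
    all(if_all(Dx1:Dx2, ~ .x == "OtherDx"))
  ) %>%
  ungroup()
print(res)
```

In `per_id`, ID 4 has two passing rows out of three, which is what a row-by-row condition lets through; grouping by ID alone and wrapping the condition in all() drops the patient entirely.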

Regarding "r - Filter and extract rows based on multiple conditions", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/73400912/
