
r - How to filter in R only when a specific condition holds?

Reposted. Author: 行者123. Updated: 2023-12-05 01:50:30

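I have a dataset with groups ("A", "B", "C", and "A & B") at two time points ("Before" and "After"). I only want to include "A & B" if the sample size for either A or B at either time point is under 10 people. Otherwise, I want to drop the "A & B" group. How do I tell R to drop that group only when the other condition is met?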

Here are two sample datasets: one that should have the "A & B" group filtered out, and one that should keep it:

library(dplyr)

# This should not filter out anything

should_not_drop_group <- tibble(group = rep(c("A", "B", "C", "A & B"), 2),
                                time = c(rep(c("Before"), 4), rep(c("After"), 4)),
                                sample_size = c(5, 100, 132, 105, 250, 50, 224, 300))


# This dataset should drop group A & B

should_drop_group <- tibble(group = rep(c("A", "B", "C", "A & B"), 2),
                            time = c(rep(c("Before"), 4), rep(c("After"), 4)),
                            sample_size = c(500, 100, 132, 600, 250, 50, 224, 300))

Here is what I tried, to no avail:

library(dplyr)

should_drop_group %>%
filter_if(~any(sample_size[group %in% c("A", "B")] < 10), group != "A & B" )

Best answer

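Perhaps the condition in filter can be built like this: subset group where sample_size is less than 10, check whether any of 'A', 'B' appear in that subset, and negate (!) the result. Then create a second expression that is TRUE where group is "A & B", join the two with &, and negate (!) the whole expression so that those rows are filtered out.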

library(dplyr)
should_not_drop_group %>%
  filter(!(!any(c("A", "B") %in% group[sample_size < 10]) & group == "A & B"))
# or can be written as
# filter(!(!any(group %in% c("A", "B") & sample_size < 10) & group == "A & B"))

-output

# A tibble: 8 × 3
  group time   sample_size
  <chr> <chr>        <dbl>
1 A     Before           5
2 B     Before         100
3 C     Before         132
4 A & B Before         105
5 A     After          250
6 B     After           50
7 C     After          224
8 A & B After          300
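
To see why nothing is dropped in this case, it can help to evaluate the pieces of the condition one at a time. Below is a minimal sketch in base R, with the two relevant columns copied out as plain vectors. The intermediate names (`small_groups`, `a_or_b_small`, `drop_row`, `keep_row`) are made up for illustration and are not part of the answer.

```r
# Columns of should_not_drop_group from the question, as plain vectors
group <- rep(c("A", "B", "C", "A & B"), 2)
sample_size <- c(5, 100, 132, 105, 250, 50, 224, 300)

# Step 1: the groups that have at least one row with fewer than 10 people
small_groups <- group[sample_size < 10]

# Step 2: is "A" or "B" among them? This is a single TRUE/FALSE value
a_or_b_small <- any(c("A", "B") %in% small_groups)

# Step 3: a row is dropped only if neither A nor B is small
# AND the row belongs to "A & B" (the scalar is recycled over all rows)
drop_row <- !a_or_b_small & group == "A & B"

# Step 4: filter() keeps the rows where the negation is TRUE
keep_row <- !drop_row
```

Because group A has a row with sample_size 5, `a_or_b_small` is TRUE, so `drop_row` is FALSE everywhere and every row is kept.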

Second case

should_drop_group %>%
  filter(!(!any(c("A", "B") %in% group[sample_size < 10]) & group == "A & B"))
# A tibble: 6 × 3
  group time   sample_size
  <chr> <chr>        <dbl>
1 A     Before         500
2 B     Before         100
3 C     Before         132
4 A     After          250
5 B     After           50
6 C     After          224
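
The double negation can be unwound with De Morgan's law: `!(!X & Y)` is the same as `X | !Y`. A sketch of that equivalent form follows, assuming the same column names as in the question. The helper `small_a_or_b` is a hypothetical name introduced here, not something from the answer.

```r
library(dplyr)

# Same data as should_drop_group in the question
should_drop_group <- tibble(
  group = rep(c("A", "B", "C", "A & B"), 2),
  time = c(rep("Before", 4), rep("After", 4)),
  sample_size = c(500, 100, 132, 600, 250, 50, 224, 300)
)

# Hypothetical helper: a single TRUE/FALSE that is TRUE when any
# A or B row has fewer than 10 people
small_a_or_b <- function(group, sample_size) {
  any(group %in% c("A", "B") & sample_size < 10)
}

# Keep a row if it is not "A & B", or if A or B is small enough
# that "A & B" is allowed to stay
result <- should_drop_group %>%
  filter(group != "A & B" | small_a_or_b(group, sample_size))
```

Read this way, the rule is "keep everything that is not A & B, and keep A & B only when A or B is small", which some readers may find easier to follow than the nested negations.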

If we want to reuse this on multiple datasets, we can wrap the condition in a function and call it from filter

> f1 <- function(x, sample_size) 
    !(!any(c("A", "B") %in% x[sample_size < 10]) & x == "A & B")
> should_not_drop_group %>%
    filter(if_any(group, f1, sample_size = sample_size))
# A tibble: 8 × 3
  group time   sample_size
  <chr> <chr>        <dbl>
1 A     Before           5
2 B     Before         100
3 C     Before         132
4 A & B Before         105
5 A     After          250
6 B     After           50
7 C     After          224
8 A & B After          300
> should_drop_group %>%
    filter(if_any(group, f1, sample_size = sample_size))
# A tibble: 6 × 3
  group time   sample_size
  <chr> <chr>        <dbl>
1 A     Before         500
2 B     Before         100
3 C     Before         132
4 A     After          250
5 B     After           50
6 C     After          224
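
When there are several datasets, one option is to wrap the whole filter step in a function that takes a data frame and apply it over a list. This is only a sketch under the assumption that every dataset has `group` and `sample_size` columns. The wrapper name `drop_ab_unless_small` and its `threshold` argument are hypothetical and do not come from the answer.

```r
library(dplyr)

# The two sample datasets from the question
should_not_drop_group <- tibble(
  group = rep(c("A", "B", "C", "A & B"), 2),
  time = c(rep("Before", 4), rep("After", 4)),
  sample_size = c(5, 100, 132, 105, 250, 50, 224, 300)
)
should_drop_group <- tibble(
  group = rep(c("A", "B", "C", "A & B"), 2),
  time = c(rep("Before", 4), rep("After", 4)),
  sample_size = c(500, 100, 132, 600, 250, 50, 224, 300)
)

# Hypothetical wrapper: same condition as the answer, with the
# cutoff of 10 exposed as an argument (an assumption, not in the source)
drop_ab_unless_small <- function(data, threshold = 10) {
  data %>%
    filter(!(!any(c("A", "B") %in% group[sample_size < threshold]) &
               group == "A & B"))
}

# Apply the same rule to every dataset in a named list
datasets <- list(keep = should_not_drop_group, drop = should_drop_group)
filtered <- lapply(datasets, drop_ab_unless_small)

# Row counts per dataset after filtering
row_counts <- sapply(filtered, nrow)
```

With the sample data, `row_counts` should be 8 for `keep` and 6 for `drop`, matching the outputs shown above.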

Regarding "r - How to filter in R only when a specific condition holds?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/73014636/
