
How to filter out rows in R that match multiple, differing conditions

Reposted. Author: bug小助手. Updated: 2023-10-24 23:00:38



I would like to filter OUT the rows that match two (or more) conditions at once, and I was wondering how to do that with filter().



In the example below I want to filter out the rows where gender is 'M' and state is 'CA', as well as the rows where gender is 'F' and state is 'PH'.



The code below doesn't do that. How can I do this? Thanks!



library(dplyr)

df <- data.frame(
  id = c(10,11,12,13,14,15,16,17),
  name = c('sai','ram','deepika','sahithi','kumar','scott','Don','Lin'),
  gender = c('M','M',NA,'F','M','M','M','F'),
  dob = as.Date(c('1990-10-02','1981-3-24','1987-6-14','1985-8-16',
                  '1995-03-02','1991-6-21','1986-3-24','1990-8-26')),
  state = c('CA','NY',NA,NA,'DC','DW','AZ','PH'),
  row.names=c('r1','r2','r3','r4','r5','r6','r7','r8')
)

# My attempt, which does not return the rows I want:
df %>%
  filter(gender != 'M' & state != 'CA',
         gender != 'F' & state != 'PH')

Comments

Like this? df %>% filter(!(gender == 'M' & state == 'CA'), !(gender == 'F' & state == 'PH'))
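
To see why the attempt in the question differs from the negated form suggested in this comment, here is a minimal base R sketch. The two short vectors are made up for illustration and contain no NAs; it only shows that `gender != 'M' & state != 'CA'` is not the negation of `gender == 'M' & state == 'CA'`.

```r
# Illustrative vectors (not the question's data), without missing values.
gender <- c("M", "M", "F", "F")
state  <- c("CA", "NY", "CA", "PH")

# The attempt from the question: keeps only rows that are neither M nor CA.
attempt <- gender != "M" & state != "CA"

# The negated pair: drops only the rows that are both M and CA.
negated <- !(gender == "M" & state == "CA")

# De Morgan's law: !(A & B) is the same as !A | !B.
de_morgan <- gender != "M" | state != "CA"

print(attempt)                        # FALSE FALSE FALSE  TRUE
print(negated)                        # FALSE  TRUE  TRUE  TRUE
print(identical(negated, de_morgan))  # TRUE
```

In other words, the original attempt throws away every M row and every CA row, while the negated form removes only the M-and-CA combination.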


Recommended answer


Because of the NAs, we need to use %in% instead of ==:



df %>%
  filter(
    !(gender %in% 'M' & state %in% 'CA'),
    !(gender %in% 'F' & state %in% 'PH')
  )
# id name gender dob state
# r2 11 ram M 1981-03-24 NY
# r3 12 deepika <NA> 1987-06-14 <NA>
# r4 13 sahithi F 1985-08-16 <NA>
# r5 14 kumar M 1995-03-02 DC
# r6 15 scott M 1991-06-21 DW
# r7 16 Don M 1986-03-24 AZ

(Otherwise, the rows with NAs are removed as well.)
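
As a rough check of that claim, the sketch below rebuilds the two conditions in base R on the question's gender and state columns. It relies on the fact that filter() keeps only rows whose condition is TRUE; `which()` is used here as a stand-in because it also ignores NA. The variable names `keep_eq` and `keep_in` are made up for this example.

```r
# The gender and state columns from the question's data frame.
gender <- c("M", "M", NA, "F", "M", "M", "M", "F")
state  <- c("CA", "NY", NA, NA, "DC", "DW", "AZ", "PH")

# Conditions written with ==, which propagates NA.
keep_eq <- !(gender == "M" & state == "CA") &
           !(gender == "F" & state == "PH")

# Conditions written with %in%, which never returns NA.
keep_in <- !(gender %in% "M" & state %in% "CA") &
           !(gender %in% "F" & state %in% "PH")

# which() returns the positions that are TRUE and skips NA,
# mirroring how filter() treats an NA condition as "drop".
print(which(keep_eq))  # 2 5 6 7
print(which(keep_in))  # 2 3 4 5 6 7
```

With `==`, rows 3 and 4 (deepika and sahithi) evaluate to NA and would be dropped along with the two rows that were meant to go.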



The reason this works is that == will return NA if either side is NA, but %in% does not:



c("F", "M", NA) == "M"
# [1] FALSE TRUE NA
c("F", "M", NA) %in% "M"
# [1] FALSE TRUE FALSE
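
If you prefer to keep `==`, one possible alternative (a sketch, not something the answer above prescribes) is to guard the comparison with `is.na()`. This works because of R's three-valued logic: `FALSE & NA` is FALSE, while `TRUE & NA` stays NA.

```r
x <- c("F", "M", NA)

# %in% treats a missing value as "no match".
in_style <- x %in% "M"

# The same result spelled out with ==: the is.na() guard is FALSE for the
# missing element, and FALSE & NA evaluates to FALSE.
explicit <- !is.na(x) & x == "M"

print(identical(in_style, explicit))  # TRUE

# Three-valued logic in isolation.
print(FALSE & NA)  # FALSE
print(TRUE & NA)   # NA
```

This also explains why a row with a known gender but a missing state can still end up as NA in a `==` condition: the known side is TRUE, so the NA on the other side decides the result.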

Another option is anti_join(). I find this overkill for this specific scenario, but if you're keeping a table of paired conditions to remove, it might be simpler: maintain a data frame of the rows to remove, then anti-join against it:



remove <- tibble(gender=c("M", "F"), state=c("CA","PH"))
df %>%
  anti_join(remove, by = c("gender", "state"))
# id name gender dob state
# r2 11 ram M 1981-03-24 NY
# r3 12 deepika <NA> 1987-06-14 <NA>
# r4 13 sahithi F 1985-08-16 <NA>
# r5 14 kumar M 1995-03-02 DC
# r6 15 scott M 1991-06-21 DW
# r7 16 Don M 1986-03-24 AZ
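
For readers without dplyr, the same table-driven idea can be approximated in base R by building a combined key per row. This is only a sketch under stated assumptions: `make_key` and `remove_pairs` are hypothetical names, the data frame is a cut-down version of the question's, and the approach assumes no real value is the literal string "NA" or contains the "|" separator.

```r
# Cut-down version of the question's data (gender and state only).
df <- data.frame(
  gender = c("M", "M", NA, "F", "F"),
  state  = c("CA", "NY", NA, NA, "PH"),
  stringsAsFactors = FALSE
)

# Table of gender/state pairs to drop (hypothetical name).
remove_pairs <- data.frame(
  gender = c("M", "F"),
  state  = c("CA", "PH"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one key per row. paste() turns NA into the string
# "NA", so rows with missing values never equal a key in remove_pairs.
make_key <- function(d) paste(d$gender, d$state, sep = "|")

kept <- df[!(make_key(df) %in% make_key(remove_pairs)), ]
print(kept)
```

Adding another pair to drop then only means adding a row to `remove_pairs`, which is the same maintenance benefit the anti-join version offers.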
