I would like to filter OUT rows that match two (or more) conditions at the same time, and I was wondering how to do that with filter.
In the example below I want to filter out the rows where gender is 'M' and state is 'CA', as well as the rows where gender is 'F' and state is 'PH'.
The code below doesn't do that...how can I do this? Thanks!
library(dplyr)

df <- data.frame(
  id = c(10, 11, 12, 13, 14, 15, 16, 17),
  name = c('sai', 'ram', 'deepika', 'sahithi', 'kumar', 'scott', 'Don', 'Lin'),
  gender = c('M', 'M', NA, 'F', 'M', 'M', 'M', 'F'),
  dob = as.Date(c('1990-10-02', '1981-3-24', '1987-6-14', '1985-8-16',
                  '1995-03-02', '1991-6-21', '1986-3-24', '1990-8-26')),
  state = c('CA', 'NY', NA, NA, 'DC', 'DW', 'AZ', 'PH'),
  row.names = c('r1', 'r2', 'r3', 'r4', 'r5', 'r6', 'r7', 'r8')
)

# My attempt, which does not give the result I want:
df %>%
  filter(gender != 'M' & state != 'CA',
         gender != 'F' & state != 'PH')
More answers
like this? df %>% filter(!(gender == 'M' & state == 'CA'), !(gender == 'F' & state == 'PH'))
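To spell out why negating the whole pair differs from the code in the question, here is a minimal base R sketch on made-up vectors (the values are illustrative and contain no NAs). Writing `gender != 'M' & state != 'CA'` keeps only rows that are neither 'M' nor 'CA', whereas `!(gender == 'M' & state == 'CA')` drops only the rows where both hold. By De Morgan's law the latter is the same as joining the two inequalities with `|`.

```r
# Toy vectors, made up for illustration (no missing values here)
gender <- c("M", "M", "F", "F")
state  <- c("CA", "NY", "CA", "PH")

# Condition from the question: TRUE only when the row is neither M nor CA
keep_question <- gender != "M" & state != "CA"
print(keep_question)
# [1] FALSE FALSE FALSE  TRUE

# Negating the pair as a whole: FALSE only for the M/CA row
keep_negated <- !(gender == "M" & state == "CA")
print(keep_negated)
# [1] FALSE  TRUE  TRUE  TRUE

# De Morgan: !(A & B) is the same as !A | !B
keep_or <- gender != "M" | state != "CA"
print(identical(keep_negated, keep_or))
# [1] TRUE
```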
Recommended answer
Because of the NAs, we need to use %in% instead of ==:
df %>%
  filter(
    !(gender %in% 'M' & state %in% 'CA'),
    !(gender %in% 'F' & state %in% 'PH')
  )
#    id    name gender        dob state
# r2 11     ram      M 1981-03-24    NY
# r3 12 deepika   <NA> 1987-06-14  <NA>
# r4 13 sahithi      F 1985-08-16  <NA>
# r5 14   kumar      M 1995-03-02    DC
# r6 15   scott      M 1991-06-21    DW
# r7 16     Don      M 1986-03-24    AZ
(Otherwise, the rows with NAs are removed as well.)
The reason this works is that == will return NA if either side is NA, but %in% does not:
c("F", "M", NA) == "M"
# [1] FALSE TRUE NA
c("F", "M", NA) %in% "M"
# [1] FALSE TRUE FALSE
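As a small illustration of how that NA carries through a compound condition, here is a base R sketch on made-up vectors (one complete pair, one all-NA row, one partly-NA row; the values are not taken from the data above). Since filter() keeps only rows where the condition is TRUE, any NA in the result means the row is dropped.

```r
# Toy vectors, made up for illustration
gender <- c("F", NA, "F")
state  <- c("PH", NA, NA)

# With ==, the missing values leak into the combined condition:
# TRUE & NA is NA, and !NA is still NA
keep_eq <- !(gender == "F" & state == "PH")
print(keep_eq)
# [1] FALSE    NA    NA

# With %in%, every comparison is TRUE or FALSE, so nothing turns into NA
keep_in <- !(gender %in% "F" & state %in% "PH")
print(keep_in)
# [1] FALSE  TRUE  TRUE
```

With the == version the second and third rows would be dropped by filter(); with %in% they are kept.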
Another option is to anti_join them. I find this overkill for this specific scenario, but if you're keeping a table of paired conditions to remove, this might be simpler: maintain a frame of the rows to remove, then anti-join against it:
remove <- tibble(gender = c("M", "F"), state = c("CA", "PH"))
df %>%
  anti_join(remove, by = c("gender", "state"))
#    id    name gender        dob state
# r2 11     ram      M 1981-03-24    NY
# r3 12 deepika   <NA> 1987-06-14  <NA>
# r4 13 sahithi      F 1985-08-16  <NA>
# r5 14   kumar      M 1995-03-02    DC
# r6 15   scott      M 1991-06-21    DW
# r7 16     Don      M 1986-03-24    AZ
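If dplyr is not available, the same paired-removal idea can be roughly sketched in base R by pasting the columns into one key per row and using %in% on the keys. This is only a sketch under assumptions: `df_small` and `remove_pairs` are hypothetical stand-ins with made-up values, and because paste() turns NA into the string "NA", it assumes no column holds a literal "NA" string or the "|" separator.

```r
# Hypothetical stand-ins for the data and the table of pairs to remove
df_small <- data.frame(
  gender = c("M", "M", NA, "F", "F"),
  state  = c("CA", "NY", NA, NA, "PH"),
  stringsAsFactors = FALSE
)
remove_pairs <- data.frame(
  gender = c("M", "F"),
  state  = c("CA", "PH"),
  stringsAsFactors = FALSE
)

# One key per row; NA becomes the string "NA" inside paste()
key_df <- paste(df_small$gender, df_small$state, sep = "|")
key_rm <- paste(remove_pairs$gender, remove_pairs$state, sep = "|")

# Keep the rows whose key is not among the pairs to remove
kept <- df_small[!(key_df %in% key_rm), ]
print(kept)
```

Here the M/CA and F/PH rows are dropped, while the rows with missing values are kept because their keys do not match any pair in `remove_pairs`.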