
Retrieve all rows for a group when any row contains a specific text, after grouping by application and user ID

Reposted. Author: 行者123. Updated: 2023-12-04 12:17:09

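When a user completes a step digitally, the column is_digitally_signed changes to YES. What I want to do: if any step was completed digitally, I want to retrieve all rows with the same application_id and user_id. Please see my desired output.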

R code to reproduce my dataset

library(data.table)
library(dplyr)

# rep() makes the recycling of the 3-element vectors explicit
df <- data.table(application_id = c(1,1,1,2,2,2,3,3,3),
                 user_id = c(123,123,123,456,456,456,789,789,789),
                 application_status = rep(c("incomplete", "details_verified", "complete"), times = 3),
                 date = rep(c("01/01/2018", "02/01/2018", "03/01/2018"), times = 3),
                 is_digitally_signed = c("NULL", "NULL", "YES", "NULL", "NULL", "NULL", "NULL", "YES", "NULL")) %>%
  mutate(date = as.Date(date, "%d/%m/%Y"))

This gives:

df
application_id  user_id  application_status  date        is_digitally_signed
1               123      incomplete          2018-01-01  NULL
1               123      details_verified    2018-01-02  NULL
1               123      complete            2018-01-03  YES
2               456      incomplete          2018-01-01  NULL
2               456      details_verified    2018-01-02  NULL
2               456      complete            2018-01-03  NULL
3               789      incomplete          2018-01-01  NULL
3               789      details_verified    2018-01-02  YES
3               789      complete            2018-01-03  NULL

My (unsuccessful) attempt

df %>% group_by(application_id,user_id) %>% filter_all(all.vars(. == "YES"))

Desired result

application_id  user_id  application_status  date        is_digitally_signed
1               123      incomplete          2018-01-01  NULL
1               123      details_verified    2018-01-02  NULL
1               123      complete            2018-01-03  YES
3               789      incomplete          2018-01-01  NULL
3               789      details_verified    2018-01-02  YES
3               789      complete            2018-01-03  NULL

Best answer

dplyr

We can use filter together with any, which checks whether a given group has at least one record with is_digitally_signed == 'YES':

library(dplyr)

df %>%
group_by(application_id, user_id) %>%
filter(any(is_digitally_signed == "YES"))

Or use the all function to keep the groups in which not every is_digitally_signed == "NULL":

df %>% 
group_by(application_id, user_id) %>%
filter(!all(is_digitally_signed == "NULL"))
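
The two conditions above select the same groups only while the column contains nothing but "YES" and "NULL". The following base R sketch illustrates the difference on three hypothetical groups; the value "NO" and the helper names signed_any and signed_not_all are assumptions made for illustration and do not appear in the original data.

```r
# Hypothetical per-group values, for illustration only
only_null  <- c("NULL", "NULL", "NULL")
with_yes   <- c("NULL", "YES",  "NULL")
with_other <- c("NULL", "NO",   "NULL")  # "NO" is an assumed third value

# The two group-level tests used in the dplyr answers
signed_any     <- function(x) any(x == "YES")
signed_not_all <- function(x) !all(x == "NULL")

groups <- list(only_null = only_null, with_yes = with_yes, with_other = with_other)

# One row per hypothetical group, one column per test
comparison <- data.frame(
  group   = names(groups),
  any_yes = vapply(groups, signed_any, logical(1)),
  not_all = vapply(groups, signed_not_all, logical(1)),
  row.names = NULL
)
print(comparison)
```

The two tests agree for only_null and with_yes but disagree for with_other, so the any(... == "YES") form is probably the safer choice if other values might appear in the column.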

data.table

We can also use data.table, since you have already loaded the data as a DT:

library(data.table)
dt = setDT(df)
dt[dt[,.I[any(is_digitally_signed == "YES")], by=.(application_id, user_id)]$V1,]

Or using .SD:

dt[,.SD[any(is_digitally_signed == "YES")], by=.(application_id, user_id)]

Output:

# A tibble: 6 x 5
# Groups:   application_id, user_id [2]
  application_id user_id application_status date       is_digitally_signed
           <dbl>   <dbl> <fct>              <date>     <fct>
1              1     123 incomplete         2018-01-01 NULL
2              1     123 details_verified   2018-01-02 NULL
3              1     123 complete           2018-01-03 YES
4              3     789 incomplete         2018-01-01 NULL
5              3     789 details_verified   2018-01-02 YES
6              3     789 complete           2018-01-03 NULL
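
For readers without dplyr or data.table installed, the same group-level test can be sketched in base R with ave(). This is not part of the original answer; it is a minimal sketch that rebuilds a reduced copy of the example data (the date column is omitted) and assumes the variable names has_yes and signed_rows.

```r
# Reduced copy of the example data (date column omitted for brevity)
df <- data.frame(
  application_id      = rep(c(1, 2, 3), each = 3),
  user_id             = rep(c(123, 456, 789), each = 3),
  application_status  = rep(c("incomplete", "details_verified", "complete"), times = 3),
  is_digitally_signed = c("NULL", "NULL", "YES", "NULL", "NULL", "NULL", "NULL", "YES", "NULL"),
  stringsAsFactors    = FALSE
)

# ave() applies any() within each (application_id, user_id) group and
# writes the group-level result back to every row of that group
has_yes <- ave(df$is_digitally_signed == "YES",
               df$application_id, df$user_id,
               FUN = any)

# Keep every row that belongs to a group with at least one "YES"
signed_rows <- df[has_yes, ]
print(signed_rows)
```

The logical vector has_yes plays the same role as the condition inside filter() in the dplyr version: it is TRUE for all rows of a group as soon as one row in that group is signed.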

This question and its answer are based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/53416759/
