
Ranking functions in dplyr

Reposted. Author: 行者123. Updated: 2023-12-02 19:11:50

I have a sample df, shown below:

d<-structure(list(ReviewType= c("Review","Review","Review","Correction","Correction","Review","Review","Review","Review","Review","Correction","Correction","Deficiency","Correction","Correction", 
"Deficiency", "Deficiency", "Deficiency", "Correction","Correction","Deficiency","Correction"),
Submissiondate= c("2020-08-29 04:32:00","2020-08-28 04:31:00","2020-08-26 04:31:00","2020-08-25 04:31:00","2020-08-24 04:31:00","2020-08-23 04:31:00","2020-08-22 04:31:00","2020-08-21 04:31:00","2020-08-20 04:31:00","2020-08-19 04:31:00",
"2020-09-27 04:31:00","2020-09-27 03:52:59","2020-09-28 17:30:00","2020-09-29 14:01:00",
"2020-09-05 03:00:00","2020-09-05 03:51:00", "2020-09-03 23:59:49",
"2020-09-02 00:03:54","2020-09-01 00:04:48","2020-10-01 04:31:00","2020-10-11 04:31:00","2020-10-21 04:31:00"),
CaseNo= c("124","123","125","121","121","125","123","123","123","123","123","123","123","125","123","123","123","124","123","127","127","127")), class = "data.frame", row.names = c(NA, -22L))

I know I can use a ranking function to get the latest details for each case, like this:

library(dplyr)

# Rank rows within each case, newest first, and keep only the newest one
d <- d %>%
  group_by(CaseNo) %>%
  arrange(desc(Submissiondate)) %>%
  dplyr::mutate(rank = row_number()) %>%
  arrange(CaseNo, rank) %>%
  filter(rank == 1)

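However, can this ranking approach be used to get the latest details as of a specific date? For example, in my df, what if I want to know the latest review type for each case as of 29 August 2020? How should I go about this?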

Best answer

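You may want to consider comparing against the maximum instead, which takes less computation and less typing. Note that, unlike row_number(), this keeps every row tied for the latest timestamp within a case: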

d %>% 
group_by(CaseNo) %>%
filter(Submissiondate==max(Submissiondate))

Or, for your specific question (the latest review type per case as of 29 August 2020):

d %>% 
filter(as.Date(Submissiondate)<"2020-08-30") %>%
group_by(CaseNo) %>%
filter(Submissiondate==max(Submissiondate))
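
The filter-then-max idea above can also be sketched with the cutoff made explicit. This is only an illustration under a few assumptions: the timestamps are parsed to POSIXct in UTC (the answer compares them as character strings, which works because they are ISO-formatted), dplyr 1.0.0 or later is installed for slice_max(), and both the helper name latest_as_of and the small toy data frame are hypothetical, not part of the original post.

```r
library(dplyr)

# Hypothetical toy data in the same shape as `d` (not the original data).
toy <- data.frame(
  ReviewType     = c("Review", "Correction", "Deficiency", "Review"),
  Submissiondate = c("2020-08-20 04:31:00", "2020-08-28 04:31:00",
                     "2020-09-05 03:51:00", "2020-08-29 04:32:00"),
  CaseNo         = c("123", "123", "123", "124"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the latest row per case on or before `cutoff`,
# where `cutoff` is a date string such as "2020-08-29".
latest_as_of <- function(df, cutoff) {
  df %>%
    # Parse once so comparisons are on real timestamps, not strings.
    mutate(Submissiondate = as.POSIXct(Submissiondate, tz = "UTC")) %>%
    # Drop everything submitted after the cutoff date.
    filter(as.Date(Submissiondate) <= as.Date(cutoff)) %>%
    group_by(CaseNo) %>%
    # Keep exactly one row per case, even if timestamps tie.
    slice_max(Submissiondate, n = 1, with_ties = FALSE) %>%
    ungroup()
}

res <- latest_as_of(toy, "2020-08-29")
```

In this toy data, case 123 has a Deficiency submitted in September, so it is ignored and the latest row as of the cutoff is the Correction from 28 August. Setting with_ties = FALSE is a design choice: it mimics the row_number() approach from the question rather than the max() comparison, which would return all tied rows.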

If you are interested in short code and speed, here is a data.table version as well:

library(data.table)

D <- data.table(d)

D[,.SD[Submissiondate==max(Submissiondate)], keyby=CaseNo]

D[Submissiondate<"2020-08-30", .SD[Submissiondate==max(Submissiondate)], keyby=CaseNo]
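
One thing worth checking with the max() comparison is how ties behave. The sketch below uses a hypothetical toy table (not the original d) in which case 123 has two rows sharing the latest timestamp before the cutoff, and contrasts the answer's expression with an alternative that sorts and keeps one row per case. The setorder() plus unique() variant is an assumption about what one might want, not something the original answer proposes.

```r
library(data.table)

# Hypothetical toy data; case "123" has a tie on its latest pre-cutoff timestamp.
toy <- data.table(
  ReviewType     = c("Review", "Correction", "Deficiency", "Review"),
  Submissiondate = c("2020-08-28 04:31:00", "2020-08-28 04:31:00",
                     "2020-09-05 03:51:00", "2020-08-29 04:32:00"),
  CaseNo         = c("123", "123", "123", "124")
)

cutoff <- "2020-08-30"  # exclusive upper bound, compared as an ISO-formatted string

# Same pattern as the answer: keeps every row tied for the maximum.
with_ties <- toy[Submissiondate < cutoff,
                 .SD[Submissiondate == max(Submissiondate)],
                 keyby = CaseNo]

# Alternative: filter, sort newest-first within each case,
# then keep only the first row per case.
before <- toy[Submissiondate < cutoff]
setorder(before, CaseNo, -Submissiondate)
one_per_case <- unique(before, by = "CaseNo")
```

Here with_ties contains both tied rows for case 123 plus the single row for case 124, while one_per_case contains exactly one row per case. Which behavior is right depends on whether duplicate timestamps within a case are meaningful in the real data.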

Regarding ranking functions in dplyr, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/64178039/
