gpt4 book ai didi

r - 通过保持连续三天的行日期来过滤数据帧,最好使用 dplyr

转载 作者:行者123 更新时间:2023-12-05 00:10:33 25 4
gpt4 key购买 nike

我想根据日期列过滤数据框。我想保留至少连续 3 天的行。我想尽可能有效和快速地做到这一点,所以如果有人接近矢量化,那就太好了。

我试图从以下链接中激发自己的灵感,但效果并不理想,因为这是一个不同的问题:

How to filter rows based on difference in dates between rows in R?

我试图用 for 循环来做,我设法在不连续的日期上放置了一个指示器,但它没有给我想要的结果,因为它保留了所有连续的日期,即使它们更少连续超过 3 个。

tf is my dataframe

for(i in 2:(nrow(tf)-1)){

if(tf$Date[i] != tf$Date[i+1] %m+% days(-1)){
if(tf$Date[i] != tf$Date[i-1] %m+% days(1)){
tf$Date[i] = as.Date(0)
}
}

}


我的数据框的前 22 行看起来像这样:
         Date RR.x RR.y    Y
1 1984-10-20 1 10.8 1984
2 1984-11-04 1 12.5 1984
3 1984-11-05 1 7.0 1984
4 1984-11-09 1 22.9 1984
5 1984-11-10 1 24.4 1984
6 1984-11-11 1 19.0 1984
7 1984-11-13 1 5.9 1984
8 1986-10-15 1 10.3 1986
9 1986-10-16 1 18.1 1986
10 1986-10-17 1 11.3 1986
11 1986-11-17 1 14.1 1986
12 2003-10-17 1 7.8 2003
13 2003-10-25 1 7.6 2003
14 2003-10-26 1 5.0 2003
15 2003-10-27 1 6.6 2003
16 2003-11-15 1 26.4 2003
17 2003-11-20 1 10.0 2003
18 2011-10-29 1 10.0 2011
19 2011-11-04 1 11.4 2011
20 2011-11-21 1 9.8 2011
21 2011-11-22 1 5.6 2011
22 2011-11-23 1 20.4 2011

结果应该是:
         Date RR.x RR.y    Y
4 1984-11-09 1 22.9 1984
5 1984-11-10 1 24.4 1984
6 1984-11-11 1 19.0 1984
8 1986-10-15 1 10.3 1986
9 1986-10-16 1 18.1 1986
10 1986-10-17 1 11.3 1986
13 2003-10-25 1 7.6 2003
14 2003-10-26 1 5.0 2003
15 2003-10-27 1 6.6 2003
20 2011-11-21 1 9.8 2011
21 2011-11-22 1 5.6 2011
22 2011-11-23 1 20.4 2011

最佳答案

一种可能是:

df %>%
mutate(Date = as.Date(Date, format = "%Y-%m-%d"),
diff = c(0, diff(Date))) %>%
group_by(grp = cumsum(diff > 1 & lead(diff, default = last(diff)) == 1)) %>%
filter(if_else(diff > 1 & lead(diff, default = last(diff)) == 1, 1, diff) == 1) %>%
filter(n() >= 3) %>%
ungroup() %>%
select(-diff, -grp)

Date RR.x RR.y Y
<date> <int> <dbl> <int>
1 1984-11-09 1 22.9 1984
2 1984-11-10 1 24.4 1984
3 1984-11-11 1 19 1984
4 1986-10-15 1 10.3 1986
5 1986-10-16 1 18.1 1986
6 1986-10-17 1 11.3 1986
7 2003-10-25 1 7.6 2003
8 2003-10-26 1 5 2003
9 2003-10-27 1 6.6 2003
10 2011-11-21 1 9.8 2011
11 2011-11-22 1 5.6 2011
12 2011-11-23 1 20.4 2011

关于r - 通过保持连续三天的行日期来过滤数据帧,最好使用 dplyr,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/57491292/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com