r - How to calculate the time difference between consecutive rows

Reposted. Author: 行者123. Updated: 2023-12-03 23:37:20

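The raw data looks like this. I want to sort it by visitor and time, calculate the time difference between consecutive rows, and then save the result to a new file.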

  visitor         v_time payment items
1    Jack 1/2/2018 16:07      35     3
2    Jack 1/2/2018 16:09     160     1
3   David 1/2/2018 16:12      25     2
4    Kate 1/2/2018 16:16       3     3
5   David 1/2/2018 16:21      25     5
6    Jack 1/2/2018 16:32      85     5
7    Kate 1/2/2018 16:33     639     3
8    Jack 1/2/2018 16:55       6     2

Grouping and sorting work fine, but the time difference calculation fails, and so does saving the file.

visitor <- c("Jack", "Jack", "David", "Kate", "David", "Jack", "Kate", "Jack")
v_time <- c("1/2/2018 16:07","1/2/2018 16:09","1/2/2018 16:12","1/2/2018 16:16","1/2/2018 16:21","1/2/2018 16:32","1/2/2018 16:33", "1/2/2018 16:55")
payment <- c(35,160,25,3,25,85,639,6)
items <- c(3,1,2,3,5,5,3,2)
df <- data.frame(visitor, v_time, payment, items)

library(dplyr)

df %>%
  arrange(visitor, v_time) %>%
  group_by(visitor) %>%
  mutate(diff = strptime(v_time, "%d/%m/%Y %H:%M") - lag(strptime(v_time, "%d/%m/%Y %H:%M")),
         diff_secs = as.numeric(diff, units = 'secs'))

write.csv(df,"C:/output.csv", row.names = F)

What is my mistake, and what is the correct way to do this? The pipeline currently prints:

# A tibble: 8 x 6
# Groups: visitor [3]
  visitor v_time         payment items diff   diff_secs
  <fct>   <fct>            <dbl> <dbl> <time>     <dbl>
1 David   1/2/2018 16:12   25.0   2.00 NA            NA
2 David   1/2/2018 16:21   25.0   5.00 NA            NA
3 Jack    1/2/2018 16:07   35.0   3.00 NA            NA
4 Jack    1/2/2018 16:09  160     1.00 NA            NA
5 Jack    1/2/2018 16:32   85.0   5.00 NA            NA
6 Jack    1/2/2018 16:55    6.00  2.00 NA            NA
7 Kate    1/2/2018 16:16    3.00  3.00 NA            NA
8 Kate    1/2/2018 16:33  639     3.00 NA            NA

Best answer

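If you simply add default = strptime(v_time, "%d/%m/%Y %H:%M")[1] to the lag part, the first row of each group gets a difference of 0 instead of NA. Also assign the result back to df: in the question the pipeline's result was only printed, so write.csv wrote the original df without the new columns.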

df <- df %>%
  arrange(visitor, v_time) %>%
  group_by(visitor) %>%
  mutate(diff = strptime(v_time, "%d/%m/%Y %H:%M") - lag(strptime(v_time, "%d/%m/%Y %H:%M"), default = strptime(v_time, "%d/%m/%Y %H:%M")[1]),
         diff_secs = as.numeric(diff, units = 'secs'))

you get the result you expect:

> df
# A tibble: 8 x 6
# Groups: visitor [3]
  visitor v_time         payment items diff   diff_secs
  <fct>   <fct>            <dbl> <dbl> <time>     <dbl>
1 David   1/2/2018 16:12     25.    2. 0             0.
2 David   1/2/2018 16:21     25.    5. 540         540.
3 Jack    1/2/2018 16:07     35.    3. 0             0.
4 Jack    1/2/2018 16:09    160.    1. 120         120.
5 Jack    1/2/2018 16:32     85.    5. 1380       1380.
6 Jack    1/2/2018 16:55      6.    2. 1380       1380.
7 Kate    1/2/2018 16:16      3.    3. 0             0.
8 Kate    1/2/2018 16:33    639.    3. 1020       1020.
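
A variation on the same idea, offered as a sketch rather than as the answer author's method: parse v_time once into a POSIXct column and lag that column, instead of calling strptime three times. The column name t, the result name res, and tz = "UTC" are assumptions made for illustration. Sorting on the parsed column also orders the rows chronologically, whereas sorting on the original text (or factor) column orders them as text.

```r
library(dplyr)

df <- data.frame(
  visitor = c("Jack", "Jack", "David", "Kate", "David", "Jack", "Kate", "Jack"),
  v_time  = c("1/2/2018 16:07", "1/2/2018 16:09", "1/2/2018 16:12", "1/2/2018 16:16",
              "1/2/2018 16:21", "1/2/2018 16:32", "1/2/2018 16:33", "1/2/2018 16:55"),
  stringsAsFactors = FALSE
)

res <- df %>%
  # Parse the text once; the time zone is an assumption for this sketch.
  mutate(t = as.POSIXct(v_time, format = "%d/%m/%Y %H:%M", tz = "UTC")) %>%
  # Sort by the parsed time, not by the text representation.
  arrange(visitor, t) %>%
  group_by(visitor) %>%
  # The first row of each group is compared with itself, giving 0 instead of NA.
  mutate(diff_secs = as.numeric(difftime(t, lag(t, default = first(t)), units = "secs"))) %>%
  ungroup()

print(res)
```

With real data that spans several days or months, sorting on the parsed column matters, because text such as "10/2/2018" sorts before "2/2/2018".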

Another option is to use difftime:

df <- df %>%
  arrange(visitor, v_time) %>%
  group_by(visitor) %>%
  mutate(diff = difftime(strptime(v_time, "%d/%m/%Y %H:%M"), lag(strptime(v_time, "%d/%m/%Y %H:%M"), default = strptime(v_time, "%d/%m/%Y %H:%M")[1]), units = 'mins'),
         diff_secs = as.numeric(diff, units = 'secs'))

Now the diff column is in minutes and the diff_secs column is in seconds:

> df
# A tibble: 8 x 6
# Groups: visitor [3]
  visitor v_time         payment items diff   diff_secs
  <fct>   <fct>            <dbl> <dbl> <time>     <dbl>
1 David   1/2/2018 16:12     25.    2. 0             0.
2 David   1/2/2018 16:21     25.    5. 9           540.
3 Jack    1/2/2018 16:07     35.    3. 0             0.
4 Jack    1/2/2018 16:09    160.    1. 2           120.
5 Jack    1/2/2018 16:32     85.    5. 23         1380.
6 Jack    1/2/2018 16:55      6.    2. 23         1380.
7 Kate    1/2/2018 16:16      3.    3. 0             0.
8 Kate    1/2/2018 16:33    639.    3. 17         1020.
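
To make the unit handling concrete, here is a small base R sketch using the two David timestamps from the data. The names t1, t2 and d and the choice of tz = "UTC" are illustrative assumptions. It shows that subtracting two date-times lets R pick the unit automatically, while difftime and as.numeric let you state the unit explicitly.

```r
# Two timestamps nine minutes apart (time zone assumed for the sketch).
t1 <- as.POSIXct("2018-02-01 16:12", tz = "UTC")
t2 <- as.POSIXct("2018-02-01 16:21", tz = "UTC")

# Plain subtraction chooses the unit automatically from the size of the gap.
auto <- t2 - t1
print(units(auto))

# difftime() fixes the unit up front.
d <- difftime(t2, t1, units = "mins")
mins <- as.numeric(d)                  # value in the object's own unit (minutes)
secs <- as.numeric(d, units = "secs")  # same gap converted to seconds

# The unit of an existing difftime object can also be changed in place.
units(d) <- "hours"
hours <- as.numeric(d)

print(c(mins = mins, secs = secs, hours = hours))
```

Because the automatic unit depends on the values being compared, stating units explicitly is the safer choice when the numbers are stored in a file or compared across groups.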

You can now save the result again with write.csv(df, "C:/output.csv", row.names = FALSE).
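
A quick way to confirm that the new columns actually reach the file is to write the data and read it back. The sketch below is an assumption-laden illustration, not part of the original answer: it uses a two-row stand-in data frame named out and writes to a temporary file via tempfile() rather than C:/output.csv, since the root of a drive may not be writable for every user.

```r
# Stand-in for the computed result; only the columns needed for the check.
out <- data.frame(
  visitor   = c("David", "David"),
  v_time    = c("1/2/2018 16:12", "1/2/2018 16:21"),
  diff_secs = c(0, 540),
  stringsAsFactors = FALSE
)

# Write to a temporary location (hypothetical replacement for C:/output.csv).
path <- tempfile(fileext = ".csv")
write.csv(out, path, row.names = FALSE)

# Read the file back and inspect what was saved.
back <- read.csv(path, stringsAsFactors = FALSE)
print(back)
```

If diff_secs is missing from the file, the usual cause is that the pipeline result was never assigned to the object passed to write.csv.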

Regarding r - How to calculate the time difference between consecutive rows, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/49003378/
