gpt4 book ai didi

r - 通过 data.table 按名称分组的上次销售日期和下次销售日期

转载 作者:行者123 更新时间:2023-12-02 09:30:22 26 4
gpt4 key购买 nike

这是我的数据框的样子:

df <- read.table(text='

Name ActivityType ActivityDate LastSaleDate NextSaleDate
John Email 1/1/2014 NA 2/1/2014
John Sale 2/1/2014 NA 3/1/2014
John Sale 3/1/2014 2/1/2014 NA
John Seminar 4/1/2014 3/1/2014 NA
John Webinar 5/1/2014 3/1/2014 NA
Tom Email 1/1/2014 NA 2/1/2015
Tom Sale 2/1/2015 NA 3/1/2015
Tom Sale 3/1/2015 2/1/2015 NA
Tom Seminar 4/1/2015 3/1/2015 NA
Tom Webinar 5/1/2015 3/1/2015 NA
', header=T)

我正在尝试通过 data.table 导出最右边的两列。我正在查看 ActivityType=Sale 的位置,并查找该销售事件类型的上一个和下一个对应事件日期。相关的 dplyr 解决方案是

library(dplyr)
require(zoo)

df %>%
group_by(Name) %>%
mutate(LastSaleDate=na.locf(lag(ifelse(ActivityType=="Sale",ActivityDate,NA)),na.rm=FALSE))

非常感谢您的帮助。

最佳答案

我将使用library(data.table) 。首先,让我们删除最右边的两列并转换 ActivityDateDate类。

dt <- as.data.table(read.table(text='

Name ActivityType ActivityDate LastSaleDate NextSaleDate
John Email 1/1/2014 NA 2/1/2014
John Sale 2/1/2014 NA 3/1/2014
John Sale 3/1/2014 2/1/2014 NA
John Seminar 4/1/2014 3/1/2014 NA
John Webinar 5/1/2014 3/1/2014 NA
Tom Email 1/1/2014 NA 2/1/2015
Tom Sale 2/1/2015 NA 3/1/2015
Tom Sale 3/1/2015 2/1/2015 NA
Tom Seminar 4/1/2015 3/1/2015 NA
Tom Webinar 5/1/2015 3/1/2015 NA
', header=T))

dt[, c('ActivityDate', 'LastSaleDate', 'NextSaleDate') := list(as.Date(ActivityDate, format = '%d/%m/%Y'), NULL, NULL)]

接下来合并销售数据的数据,以获得所有可能的组合并计算任何事件和销售事件之间的天数差异:

setkeyv(dt, 'Name')
dt2 <- dt[dt[ActivityType == 'Sale'], allow.cartesian = TRUE]
dt2[, DateDiff := as.numeric(ActivityDate - i.ActivityDate)]

获取:

    Name ActivityType ActivityDate i.ActivityType i.ActivityDate DateDiff
1: John Email 2014-01-01 Sale 2014-01-02 -1
2: John Sale 2014-01-02 Sale 2014-01-02 0
3: John Sale 2014-01-03 Sale 2014-01-02 1
4: John Seminar 2014-01-04 Sale 2014-01-02 2
5: John Webinar 2014-01-05 Sale 2014-01-02 3
6: John Email 2014-01-01 Sale 2014-01-03 -2
7: John Sale 2014-01-02 Sale 2014-01-03 -1
8: John Sale 2014-01-03 Sale 2014-01-03 0
9: John Seminar 2014-01-04 Sale 2014-01-03 1
10: John Webinar 2014-01-05 Sale 2014-01-03 2
11: Tom Email 2014-01-01 Sale 2015-01-02 -366
12: Tom Sale 2015-01-02 Sale 2015-01-02 0
13: Tom Sale 2015-01-03 Sale 2015-01-02 1
14: Tom Seminar 2015-01-04 Sale 2015-01-02 2
15: Tom Webinar 2015-01-05 Sale 2015-01-02 3
16: Tom Email 2014-01-01 Sale 2015-01-03 -367
17: Tom Sale 2015-01-02 Sale 2015-01-03 -1
18: Tom Sale 2015-01-03 Sale 2015-01-03 0
19: Tom Seminar 2015-01-04 Sale 2015-01-03 1
20: Tom Webinar 2015-01-05 Sale 2015-01-03 2

现在当你排序dt2 <- dt2[order(Name, ActivityDate, DateDiff)]时您可以通过以下方式获取最后一个和下一个销售日期:

dt2[, list(ActivityType = ActivityType[1],
LastSaleDate = head(i.ActivityDate[DateDiff > 0], 1),
NextSaleDate = tail(i.ActivityDate[DateDiff < 0], 1)),
by = list(Name, ActivityDate)]

关于r - 通过 data.table 按名称分组的上次销售日期和下次销售日期,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/33815773/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com