
r - Determine by group whether a date in one column occurs after a date in another column

Reposted. Author: 行者123. Updated: 2023-12-01 11:21:04

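Suppose I bill customers after the service date, and I stop providing service if they don't pay the bill. However, the lag between the service date and the bill date makes this hard to enforce when a customer requests additional service. To determine whether a customer is delinquent, I need to know whether the newly requested service date falls after an outstanding bill was sent (the bill may go out well after the service date).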

Example data

df <- structure(list(id = structure(c(1L, 2L, 3L, 4L, 1L, 1L, 2L, 3L, 2L, 2L), .Label = c("A", "B", "C", "D"), class = "factor"), service.date = structure(c(1L, 3L, 5L, 6L, 2L, 9L, 4L, 7L, 8L, 10L), .Label = c("2011-01-01", "2011-01-03", "2011-02-01", "2011-03-01", "2011-03-02", "2011-04-02", "2011-05-09", "2011-08-19", "2011-09-02", "2011-09-10"), class = "factor"), bill.date = structure(c(4L, 5L, 2L, 6L, 9L, 1L, 8L, 10L, 3L, 7L), .Label = c("2011-08-09", "2011-08-10", "2011-08-11", "2011-08-12", "2011-08-13", "2011-08-14", "2011-08-15", "2011-08-16", "2011-08-17", "2011-08-19"), class = "factor")), .Names = c("id", "service.date", "bill.date"), class = "data.frame", row.names = c(NA, -10L))

# df
# id service.date bill.date
# A 2011-01-01 2011-08-12
# B 2011-02-01 2011-08-13
# C 2011-03-02 2011-08-10
# D 2011-04-02 2011-08-14
# A 2011-01-03 2011-08-17
# A 2011-09-02 2011-08-09
# B 2011-03-01 2011-08-16
# C 2011-05-09 2011-08-19
# B 2011-08-19 2011-08-11
# B 2011-09-10 2011-08-15

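So if a customer requests additional service before the bill for their initial service has been sent, they are not considered delinquent. But if they request additional service after a bill has been sent and is still unpaid, they are delinquent.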
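To make the rule concrete, here is a minimal sketch for customer A alone, using the three rows for A from the example data. It assumes that "outstanding bill" means the earliest bill sent to that customer and that the bill is still unpaid; the variable names are illustrative only.

```r
# Customer A's rows from the example data, as Date vectors
service <- as.Date(c("2011-01-01", "2011-01-03", "2011-09-02"))
bill    <- as.Date(c("2011-08-12", "2011-08-17", "2011-08-09"))

# Assumption: the earliest bill sent to the customer is the outstanding one
first_bill <- min(bill)

# A service request is delinquent if it falls after that first bill
late <- service > first_bill
late
# FALSE FALSE TRUE
```

Only the third request (2011-09-02) comes after the first bill went out on 2011-08-09, so only that row is flagged.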

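Steps so far: my idea is to use a grouping function, perhaps something like by(), to find the first "bill.date" associated with each level of the factor variable "id", then determine for each "service.date" associated with that "id" level whether it occurs after that level's outstanding "bill.date", and finally create a logical variable. Here is an example of what I hope to end up with: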

Desired result

df$delinquent <- c(FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE, TRUE)

#df

# id service.date bill.date delinquent
# A 2011-01-01 2011-08-12 FALSE
# B 2011-02-01 2011-08-13 FALSE
# C 2011-03-02 2011-08-10 FALSE
# D 2011-04-02 2011-08-14 FALSE
# A 2011-01-03 2011-08-17 FALSE
# A 2011-09-02 2011-08-09 TRUE
# B 2011-03-01 2011-08-16 FALSE
# C 2011-05-09 2011-08-19 FALSE
# B 2011-08-19 2011-08-11 TRUE
# B 2011-09-10 2011-08-15 TRUE

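So in the example data there are four "customers" (named A, B, C and D), two of whom (A and B) would be flagged as delinquent because they received service while they had an outstanding bill.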

Best answer

# Load dplyr (part of the tidyverse)
library(dplyr)

# Convert the factor date columns to actual dates
# (going through character makes the conversion explicit)
df <- df %>% mutate(service.date = as.Date(as.character(service.date)),
                    bill.date = as.Date(as.character(bill.date)))

# Within each id, flag a row as delinquent when its service date is later
# than the earliest bill.date in that group
df %>% group_by(id) %>% mutate(delinquent = service.date > min(bill.date)) %>% ungroup()
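For readers who prefer not to depend on dplyr, the same rule can be sketched in base R with ave(). This is not part of the original answer. It assumes the same values as the example data, rebuilt here with character columns so the block runs on its own, and first.bill is just an illustrative name.

```r
# Rebuild the example data (assumption: same values as the question's df,
# stored as character instead of factor)
df <- data.frame(
  id = c("A", "B", "C", "D", "A", "A", "B", "C", "B", "B"),
  service.date = c("2011-01-01", "2011-02-01", "2011-03-02", "2011-04-02",
                   "2011-01-03", "2011-09-02", "2011-03-01", "2011-05-09",
                   "2011-08-19", "2011-09-10"),
  bill.date = c("2011-08-12", "2011-08-13", "2011-08-10", "2011-08-14",
                "2011-08-17", "2011-08-09", "2011-08-16", "2011-08-19",
                "2011-08-11", "2011-08-15"),
  stringsAsFactors = FALSE
)

df$service.date <- as.Date(df$service.date)
df$bill.date <- as.Date(df$bill.date)

# Earliest bill date per customer, repeated on every row of that customer.
# The dates are converted to day counts so ave() works on a plain numeric vector.
first.bill <- ave(as.numeric(df$bill.date), df$id, FUN = min)

# Delinquent when the service date falls after the customer's first bill
df$delinquent <- as.numeric(df$service.date) > first.bill

# Compare against the desired result given in the question
expected <- c(FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE, TRUE)
stopifnot(identical(df$delinquent, expected))
```

The final stopifnot() call checks the computed column against the desired delinquent vector from the question, so the script stops with an error if the two ever disagree.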

Regarding r - determining by group whether a date in one column occurs after a date in another column, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/42987069/
