r - Inner_join with two conditions and an interval-within-interval condition

Reposted. Author: 行者123. Updated: 2023-12-01 01:34:58

I am trying to join 2 data frames on multiple key columns plus a time interval condition, as in the following example:

# two sample dataframes with time intervals
library(dplyr)      # %>%, mutate, select, inner_join
library(lubridate)  # interval, %within%

df1 <- data.frame(key1 = c("a", "b", "c", "d", "e"),
                  key2 = c(1:5),
                  time1 = as.POSIXct(hms::as_hms(c("00:00:15", "00:15:15", "00:30:15", "00:40:15", "01:10:15"))),
                  time2 = as.POSIXct(hms::as_hms(c("00:05:15", "00:20:15", "00:35:15", "00:45:15", "01:15:15")))) %>%
  mutate(t1 = interval(time1, time2)) %>%
  select(key1, key2, t1)

df2 <- data.frame(key1 = c("b", "c", "a", "e", "d"),
                  key2 = c(2, 6, 1, 8, 5),
                  sam1 = as.POSIXct(hms::as_hms(c("00:21:15", "00:31:15", "00:03:15", "01:20:15", "00:43:15"))),
                  sam2 = as.POSIXct(hms::as_hms(c("00:23:15", "00:34:15", "00:04:15", "01:25:15", "00:44:15")))) %>%
  mutate(t2 = interval(sam1, sam2)) %>%
  select(key1, key2, t2)

First, the key1 and key2 columns have to match, which can be done as follows (this produces an error):

df <- inner_join(df1, df2, by = c("key1", "key2"))

But there is one more condition to check when joining: whether the time interval t2 lies within t1. I can do this check manually like this:

 df$t2 %within% df$t1

I guess the error comes from joining data frames that contain interval columns, which is probably not the right approach.

# desired dataframe
df <- data.frame(key1 = c("a", "b"), key2 = c(1,2), time_condition = c(TRUE, FALSE))

If t1 runs from "00:00:15" to "00:05:15", then the corresponding t2 from "00:03:15" to "00:04:15" lies within t1. When t2 is within t1, the time_condition column should be TRUE, otherwise FALSE.
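The interval check described above can be tried on its own for the two rows that end up in the desired result. This is only a minimal sketch: the `to_time` helper and the dummy date `1970-01-01` are assumptions introduced for illustration and are not part of the original question.

```r
library(lubridate)

# Hypothetical helper: parse "HH:MM:SS" on a fixed dummy date (assumption)
to_time <- function(x) as.POSIXct(paste("1970-01-01", x), tz = "UTC")

# t1 and t2 for the rows with keys (a, 1) and (b, 2)
t1 <- interval(to_time(c("00:00:15", "00:15:15")),
               to_time(c("00:05:15", "00:20:15")))
t2 <- interval(to_time(c("00:03:15", "00:21:15")),
               to_time(c("00:04:15", "00:23:15")))

# Element-wise check: is each t2 contained in the matching t1?
time_condition <- t2 %within% t1
print(time_condition)
```

The first pair is contained and the second is not, which matches the TRUE and FALSE values in the desired data frame.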

Best answer

With data.table, you can evaluate an expression while joining. Here is an example:

library(data.table)
df2[df1,                                               # left join
    .(time_condition = sam1 > time1 & sam2 < time2),   # condition while joining
    on = .(key1, key2),                                # keys
    by = .EACHI,                                       # check condition per join
    nomatch = 0L]                                      # make it an inner join

# key1 key2 time_condition
# 1: a 1 TRUE
# 2: b 2 FALSE

# your data, generated using data.table (run this before the join above)

df1 <- data.table(key1 = c("a", "b", "c", "d", "e"),
                  key2 = c(1:5),
                  time1 = as.ITime(c("00:00:15", "00:15:15", "00:30:15", "00:40:15", "01:10:15")),
                  time2 = as.ITime(c("00:05:15", "00:20:15", "00:35:15", "00:45:15", "01:15:15")))
df2 <- data.table(key1 = c("b", "c", "a", "e", "d"),
                  key2 = c(2, 6, 1, 8, 5),
                  sam1 = as.ITime(c("00:21:15", "00:31:15", "00:03:15", "01:20:15", "00:43:15")),
                  sam2 = as.ITime(c("00:23:15", "00:34:15", "00:04:15", "01:25:15", "00:44:15")))
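For readers who would rather stay with dplyr and lubridate, one possible variant is to keep the start and end times as plain POSIXct columns, join on the two keys, and build the intervals only after the join. This is a sketch under stated assumptions, not the method from the answer: the `to_time` helper, the dummy date, and the `res` name are made up for illustration, and `%within%` includes the interval endpoints, whereas the data.table code uses strict `>` and `<`.

```r
library(dplyr)
library(lubridate)

# Hypothetical helper: parse "HH:MM:SS" on a fixed dummy date (assumption)
to_time <- function(x) as.POSIXct(paste("1970-01-01", x), tz = "UTC")

df1 <- data.frame(key1 = c("a", "b", "c", "d", "e"),
                  key2 = as.numeric(1:5),
                  time1 = to_time(c("00:00:15", "00:15:15", "00:30:15", "00:40:15", "01:10:15")),
                  time2 = to_time(c("00:05:15", "00:20:15", "00:35:15", "00:45:15", "01:15:15")),
                  stringsAsFactors = FALSE)

df2 <- data.frame(key1 = c("b", "c", "a", "e", "d"),
                  key2 = c(2, 6, 1, 8, 5),
                  sam1 = to_time(c("00:21:15", "00:31:15", "00:03:15", "01:20:15", "00:43:15")),
                  sam2 = to_time(c("00:23:15", "00:34:15", "00:04:15", "01:25:15", "00:44:15")),
                  stringsAsFactors = FALSE)

# Join on the keys first, then compare the intervals row by row
res <- inner_join(df1, df2, by = c("key1", "key2")) %>%
  mutate(time_condition = interval(sam1, sam2) %within% interval(time1, time2)) %>%
  select(key1, key2, time_condition)

print(res)
```

Because no interval column exists at join time, the join itself only ever sees ordinary character, numeric, and date-time columns.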

Regarding "r - Inner_join with two conditions and an interval-within-interval condition", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50213099/
