
R: Filter data using a reference time table in R

Reposted. Author: 行者123. Updated: 2023-12-02 01:02:40

I have a data frame (example below) that contains a datetime column and two other variables:

data<- data.frame(structure(list(datetime = c("7/17/2017 8:16:53", "7/17/2017 8:16:55", 
"7/17/2017 8:16:57", "7/17/2017 8:16:59", "7/17/2017 8:17:01",
"7/17/2017 8:17:02", "7/17/2017 8:17:04", "7/17/2017 8:17:06",
"7/17/2017 8:17:08", "7/17/2017 8:17:10", "7/17/2017 8:17:12",
"7/17/2017 8:17:13", "7/17/2017 8:17:15", "7/17/2017 8:17:17",
"7/17/2017 8:17:19", "7/17/2017 8:17:21", "7/17/2017 8:17:22",
"7/17/2017 8:17:27", "7/17/2017 8:17:29", NA, NA), var1 = c(252.234873,
254.0436836, 252.5279108, 252.4802478, 252.6377229, 253.8766496,
249.8086397, 249.5646219, 249.1815691, 253.9509387, 251.7245156,
251.8415925, 254.2059507, 253.9145112, 251.8415925, 254.2059507,
253.9145112, 252.4802478, 252.6377229, NA, NA), var2 = c(582.5766695,
583.0972735, 582.7872586, 582.312636, 579.6445667, 579.7995196,
578.9574528, 576.5341483, 575.8460797, 574.2353493, 574.8998519,
574.1717159, 573.8133058, 574.6849578, 574.1717159, 573.8133058,
574.6849578, 582.312636, 579.6445667, NA, NA)), .Names = c("datetime",
"var1", "var2"), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-21L), spec = structure(list(cols = structure(list(datetime = structure(list(), class = c("collector_character",
"collector")), var1 = structure(list(), class = c("collector_double",
"collector")), var2 = structure(list(), class = c("collector_double",
"collector"))), .Names = c("datetime", "var1", "var2")), default = structure(list(), class = c("collector_guess",
"collector"))), .Names = c("cols", "default"), class = "col_spec")))

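I want to filter my data down to several distinct periods based on the datetime variable. The example below lists those periods, each running from From to To: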

tab_filt <- data.frame(structure(list(From = c("7/17/2017 8:16:53", "7/17/2017 8:17:04", 
"7/17/2017 8:17:19"), To = c("7/17/2017 8:16:59", "7/17/2017 8:17:10",
"7/17/2017 8:17:27")), .Names = c("From", "To"), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -3L), spec = structure(list(
cols = structure(list(From = structure(list(), class = c("collector_character",
"collector")), To = structure(list(), class = c("collector_character",
"collector"))), .Names = c("From", "To")), default = structure(list(), class = c("collector_guess",
"collector"))), .Names = c("cols", "default"), class = "col_spec")))

To make it easier to help, I have also converted the times in the sample data to POSIXct:

data$datetime <- as.POSIXct(strptime(data$datetime, format="%m/%d/%Y %H:%M:%S"))
tab_filt$From <- as.POSIXct(strptime(tab_filt$From, format="%m/%d/%Y %H:%M:%S"))
tab_filt$To <- as.POSIXct(strptime(tab_filt$To, format="%m/%d/%Y %H:%M:%S"))
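As a side note on the conversion above, here is a minimal sketch of the same parsing step done with as.POSIXct() alone and an explicit time zone. The vector `x` is a made-up subset of the sample values, and `tz = "UTC"` is an assumption; the question does not say which time zone the data were recorded in.

```r
# Hypothetical subset of the question's timestamps, including a missing value
x <- c("7/17/2017 8:16:53", "7/17/2017 8:17:04", NA)

# as.POSIXct() accepts a format string directly, so the strptime() step is optional.
# tz = "UTC" is an assumption; use the zone the data were actually recorded in.
parsed <- as.POSIXct(x, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

class(parsed)   # a POSIXct vector
is.na(parsed)   # the NA input stays NA after parsing
```

Fixing the time zone in both tables keeps the comparisons between datetime, From and To independent of the machine's locale.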

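I would like to know how to keep only the rows whose datetime falls within one of the periods in the second table. Please help.

If you need any other details, let me know :)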

Best answer

Here is a neat way to do it using the lubridate package:

library(lubridate)
library(dplyr)

# create intervals using %--%
ints <- tab_filt$From %--% tab_filt$To

# check for each row if datetime lies in any of the intervals using %within%
data %>%
  rowwise() %>%
  mutate(In = any(datetime %within% ints))

This results in:

# A tibble: 21 x 4
   datetime             var1  var2 In
   <dttm>              <dbl> <dbl> <lgl>
 1 2017-07-17 08:16:53  252.  583. TRUE
 2 2017-07-17 08:16:55  254.  583. TRUE
 3 2017-07-17 08:16:57  253.  583. TRUE
 4 2017-07-17 08:16:59  252.  582. TRUE
 5 2017-07-17 08:17:01  253.  580. FALSE
 6 2017-07-17 08:17:02  254.  580. FALSE
 7 2017-07-17 08:17:04  250.  579. TRUE
 8 2017-07-17 08:17:06  250.  577. TRUE
 9 2017-07-17 08:17:08  249.  576. TRUE
10 2017-07-17 08:17:10  254.  574. TRUE
# ... with 11 more rows

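Rows where In = FALSE fall outside every interval and should be removed. To do that, simply append %>% filter(In) to the pipeline above. Rows with a missing datetime get In = NA, and filter() drops those as well.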
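Put together, the complete pipeline might look like the sketch below. It runs on a small toy stand-in for `data` and `tab_filt` rather than the full sample, `tz = "UTC"` is an assumption, and the `ungroup()`, `select(-In)` and base R cross-check steps are additions for illustration, not part of the original answer.

```r
library(lubridate)
library(dplyr)

# Toy stand-ins for the question's `data` and `tab_filt` (tz = "UTC" is assumed)
data <- data.frame(
  datetime = as.POSIXct(c("2017-07-17 08:16:53", "2017-07-17 08:17:01",
                          "2017-07-17 08:17:04", NA), tz = "UTC"),
  var1 = c(252.2, 252.6, 249.8, NA)
)
tab_filt <- data.frame(
  From = as.POSIXct(c("2017-07-17 08:16:53", "2017-07-17 08:17:04"), tz = "UTC"),
  To   = as.POSIXct(c("2017-07-17 08:16:59", "2017-07-17 08:17:10"), tz = "UTC")
)

# One interval per row of the reference table
ints <- tab_filt$From %--% tab_filt$To

filtered <- data %>%
  rowwise() %>%                                   # evaluate one datetime at a time
  mutate(In = any(datetime %within% ints)) %>%    # TRUE if inside any interval
  ungroup() %>%                                   # leave row-wise mode
  filter(In) %>%                                  # drops FALSE and NA rows
  select(-In)                                     # helper column no longer needed

# Base R cross-check of the same rule, using closed intervals [From, To]
keep <- vapply(
  seq_len(nrow(data)),
  function(i) isTRUE(any(data$datetime[i] >= tab_filt$From &
                         data$datetime[i] <= tab_filt$To)),
  logical(1)
)
```

Both approaches treat the interval endpoints as inclusive, so a row stamped exactly at a From or To time is kept.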

Regarding "Filter data using a reference time table in R", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/49519077/
