r - How to code a new variable based on where most of the time was spent?

Reposted. Author: 行者123. Updated: 2023-12-02 00:50:56
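I have a DateTime data frame of sleep data. For sleep episodes that cross a cutoff (8 am or 8 pm), I want to recode the "mixed" value as "day" sleep or "night" sleep, depending on which side of the cutoff most of the sleep falls.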

#Current database
id<-c("m1","m1","m1","m2","m2","m2","m3","m4","m4")
x<-c("2020-01-03 10:00:00","2020-01-03 16:00:00","2020-01-03 19:20:00","2020-01-05 10:00:00","2020-01-05 15:20:00","2020-01-05 20:50:00","2020-01-06 06:30:00","2020-01-08 06:30:00","2020-01-08 07:50:00")
start<-strptime(x,"%Y-%m-%d %H:%M:%S")
y<-c("2020-01-03 16:00:00","2020-01-03 19:20:00","2020-01-03 20:50:00","2020-01-05 15:20:00","2020-01-05 20:50:00","2020-01-05 22:00:00","2020-01-06 07:40:00","2020-01-08 07:50:00","2020-01-08 08:55:00")
end<-strptime(y,"%Y-%m-%d %H:%M:%S")
mydata<-data.frame(id,start,end)

#Current output
library(dplyr)
library(lubridate)

df1 <- mydata %>%
  mutate_at(vars(start, end), ymd_hms) %>%
  mutate(start_hour = hour(start),
         end_hour = hour(end),
         day.night = case_when(
           start_hour >= 8 & end_hour >= 8 & end_hour < 20 ~ "day",
           start_hour >= 20 & (end_hour < 8 | end_hour <= 23) |
             (start_hour < 8 & end_hour < 8) ~ "night",
           TRUE ~ "mixed"))



  id               start                 end start_hour end_hour day.night
1 m1 2020-01-03 10:00:00 2020-01-03 16:00:00         10       16       day
2 m1 2020-01-03 16:00:00 2020-01-03 19:20:00         16       19       day
3 m1 2020-01-03 19:20:00 2020-01-03 20:50:00         19       20     mixed
4 m2 2020-01-05 10:00:00 2020-01-05 15:20:00         10       15       day
5 m2 2020-01-05 15:20:00 2020-01-05 20:50:00         15       20     mixed
6 m2 2020-01-05 20:50:00 2020-01-05 22:00:00         20       22     night
7 m3 2020-01-06 06:30:00 2020-01-06 07:40:00          6        7     night
8 m4 2020-01-08 06:30:00 2020-01-08 07:50:00          6        7     night
9 m4 2020-01-08 07:50:00 2020-01-08 08:55:00          7        8     mixed

Currently, when a sleep episode crosses a cutoff, the new variable is set to "mixed".

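Edit: I would like the "mixed" sleep episodes to be classified by where most of the time was spent. For example, in row 3, 40 minutes are day and 50 minutes are night, so it should now be "night"; in row 5, 4 hours 40 minutes are day and 50 minutes are night, so it should now be "day".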
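
The arithmetic for row 3 can be checked directly. This is only a minimal sketch in base R, assuming the episode runs from 19:20 to 20:50 on the same day (as in the data above) with the cutoff at 8 pm; the names row3_start, row3_end and cutoff are illustrative and not part of the original code.

```r
# Row 3: sleep from 19:20 to 20:50, with the day/night cutoff at 20:00
row3_start <- as.POSIXct("2020-01-03 19:20:00", tz = "UTC")
row3_end   <- as.POSIXct("2020-01-03 20:50:00", tz = "UTC")
cutoff     <- as.POSIXct("2020-01-03 20:00:00", tz = "UTC")

# Minutes spent on each side of the cutoff
day_min   <- as.numeric(difftime(cutoff, row3_start, units = "mins"))
night_min <- as.numeric(difftime(row3_end, cutoff, units = "mins"))

# The side with more minutes decides the label
label <- if (day_min > night_min) "day" else "night"
```

Here day_min is 40 and night_min is 50, so the episode is labelled "night", which matches the expectation stated in the edit.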

Best answer

Using lubridate and dplyr, you can classify day and night with nested ifelse calls. I also adjusted the data based on the comments.

library(lubridate)
library(dplyr)

#data
id <- c("m1", "m1", "m1", "m2", "m2", "m2", "m3", "m4", "m4")
x <- c("2020-01-03 10:00:00", "2020-01-03 16:00:00", "2020-01-03 19:20:00", "2020-01-05 10:00:00", "2020-01-05 15:20:00", "2020-01-05 20:50:00", "2020-01-06 06:30:00", "2020-01-08 06:30:00", "2020-01-08 07:50:00")
start <- strptime(x, "%Y-%m-%d %H:%M:%S")
y <- c("2020-01-03 16:00:00", "2020-01-03 19:20:00", "2020-01-03 00:50:00", "2020-01-05 15:20:00", "2020-01-05 20:50:00", "2020-01-05 22:00:00", "2020-01-06 07:40:00", "2020-01-08 07:50:00", "2020-01-08 08:55:00")
end <- strptime(y, "%Y-%m-%d %H:%M:%S")
mydata <- data.frame(id, start, end)

#start and end times
daystart <- as.POSIXct('08:00:00', format = "%T")
nightstart <- as.POSIXct('20:00:00', format = "%T")
dayend <- as.POSIXct('19:59:00', format = "%T")
nightend <- as.POSIXct('07:59:00', format = "%T")

# daystart, dayend, nightstart and nightend carry today's date, so every
# timestamp is moved onto today's date and only the time of day is compared
df1 <- mydata %>%
  mutate(
    start1 = as.POSIXct(sub("\\d+-\\d+-\\d+", Sys.Date(), start)),
    end1 = as.POSIXct(sub("\\d+-\\d+-\\d+", Sys.Date(), end)),
    # portion of the episode inside the day window
    day = ifelse(start1 > daystart & start1 < dayend & end1 < dayend & end1 > daystart, as.interval(start1, end1),
          ifelse(start1 > daystart & start1 < dayend & end1 < dayend & end1 < daystart, as.interval(start1, dayend),
          ifelse(start1 > daystart & start1 < dayend & end1 > dayend, as.interval(start1, dayend),
          ifelse(end1 > daystart & end1 < dayend, as.interval(daystart, end1), 0)))),
    # portion of the episode inside the night window
    night = ifelse(end1 > nightstart, as.interval(nightstart, end1),
            ifelse(start1 < nightend & end1 > nightend, as.interval(start1, nightend),
            ifelse(start1 < nightend & end1 < nightend, as.interval(start1, end1),
            ifelse(start1 > nightstart & end1 < nightend, as.interval(start1, end1),
            ifelse(start1 < nightstart & end1 < daystart, as.interval(nightstart, end1), 0))))),
    # label each episode by whichever portion is longer
    day.night = ifelse(abs(day) > abs(night), "day", "night"))

df1 %>% select(names(mydata), day.night)

# id start end day.night
#1 m1 2020-01-03 10:00:00 2020-01-03 16:00:00 day
#2 m1 2020-01-03 16:00:00 2020-01-03 19:20:00 day
#3 m1 2020-01-03 19:20:00 2020-01-03 00:50:00 night
#4 m2 2020-01-05 10:00:00 2020-01-05 15:20:00 day
#5 m2 2020-01-05 15:20:00 2020-01-05 20:50:00 day
#6 m2 2020-01-05 20:50:00 2020-01-05 22:00:00 night
#7 m3 2020-01-06 06:30:00 2020-01-06 07:40:00 night
#8 m4 2020-01-08 06:30:00 2020-01-08 07:50:00 night
#9 m4 2020-01-08 07:50:00 2020-01-08 08:55:00 day
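
As an alternative to the nested ifelse calls, the same majority rule can be expressed as an interval overlap. The following is only a sketch, not the answer's method: it uses base R, the question's original data (row 3 ending at 20:50), a day window of 08:00 to 20:00, and assumes each episode is shorter than 24 hours. The helper names day_minutes and classify_sleep are hypothetical, and ties are labelled "night" by assumption.

```r
# Minutes of each episode [start, end) that fall inside the daily
# window [day_from:00, day_to:00). Assumes episodes shorter than 24 h.
day_minutes <- function(start, end, day_from = 8, day_to = 20) {
  s <- as.numeric(start)   # seconds since the epoch
  e <- as.numeric(end)
  midnight <- as.numeric(as.POSIXct(trunc(start, units = "days")))
  total <- numeric(length(s))
  # Check the day window on the start day and on the following day
  for (offset in c(0, 86400)) {
    w_from <- midnight + offset + day_from * 3600
    w_to   <- midnight + offset + day_to * 3600
    total  <- total + pmax(0, pmin(e, w_to) - pmax(s, w_from))
  }
  total / 60
}

# "day" if more minutes fall in the day window than outside it, else "night"
classify_sleep <- function(start, end) {
  total_min <- as.numeric(difftime(end, start, units = "mins"))
  day_min   <- day_minutes(start, end)
  ifelse(day_min > total_min - day_min, "day", "night")
}

start <- as.POSIXct(c("2020-01-03 10:00:00", "2020-01-03 16:00:00", "2020-01-03 19:20:00",
                      "2020-01-05 10:00:00", "2020-01-05 15:20:00", "2020-01-05 20:50:00",
                      "2020-01-06 06:30:00", "2020-01-08 06:30:00", "2020-01-08 07:50:00"),
                    tz = "UTC")
end <- as.POSIXct(c("2020-01-03 16:00:00", "2020-01-03 19:20:00", "2020-01-03 20:50:00",
                    "2020-01-05 15:20:00", "2020-01-05 20:50:00", "2020-01-05 22:00:00",
                    "2020-01-06 07:40:00", "2020-01-08 07:50:00", "2020-01-08 08:55:00"),
                  tz = "UTC")

sleep_type <- classify_sleep(start, end)
```

Because the overlap is computed on full timestamps rather than on the time of day alone, an episode that crosses midnight keeps its real duration. UTC is used here to avoid daylight-saving shifts; a local time zone would need 23-hour and 25-hour days handled separately.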

Regarding "r - How to code a new variable based on where most of the time was spent?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57729680/
