
r - Sum values between start date and end date, for categories, for every day in a period of time in R

Reposted. Author: 行者123. Updated: 2023-12-01 11:17:34

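I have a set of tasks that each have a start date and an end date. Each task also has a category. I'd like to specify a date range and, for each category, sum the values of all tasks that fall within that range on each day. I'm fine with results in either wide format (results1) or long format (results2). If one of them makes this easier, that's fine by me.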

I've tried to make the example below reproducible.

library(lubridate)
library(dplyr)
library(ggplot2)

# one row per day in the reporting window (as_data_frame() is deprecated, so build the tibble directly)
dates <- tibble(Date = seq(from = ymd("2018-01-01"), to = ymd("2018-01-31"), by = "day"))


tasks <- data.frame(
task = c("task 1", "task 2", "task 3", "task 4"),
category = c("cat1", "cat1", "cat2", "cat2"),
start.date = c(ymd("2018-01-01"), ymd("2018-01-15"), ymd("2018-01-18"), ymd("2018-01-25")),
end.date = c(ymd("2018-01-07"), ymd("2018-01-27"), ymd("2018-02-15"), ymd("2018-01-31")),
value = c(1,3,5,7)
)

# desired results example 1: sums in wide format
results1 <- bind_cols(
dates,
cat1 = c(1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 0),
cat2 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 5, 5, 5, 5, 5, 5, 12, 12, 12, 12, 12, 12, 12)
)


# desired results example 2: sums in long format
results2 <- bind_cols(
bind_rows(dates, dates),
category = c("cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat1", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2", "cat2"),
value = c(1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 5, 5, 5, 5, 5, 5, 12, 12, 12, 12, 12, 12, 12)
)

#graph the results
ggplot(results2, aes(Date, value, color = category)) + geom_line()

Best answer

# For each category, and for each day, sum the values of the tasks active on that day
DF1 = do.call(rbind, lapply(split(tasks, tasks$category), function(df1){
    do.call(rbind, lapply(dates$Date, function(d){
        data.frame(Date = d,
                   category = df1$category[1],
                   value = sum(df1$value[d >= df1$start.date & d <= df1$end.date]),
                   stringsAsFactors = FALSE)
    }))
}))
head(DF1)
#             Date category value
#cat1.1 2018-01-01     cat1     1
#cat1.2 2018-01-02     cat1     1
#cat1.3 2018-01-03     cat1     1
#cat1.4 2018-01-04     cat1     1
#cat1.5 2018-01-05     cat1     1
#cat1.6 2018-01-06     cat1     1
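
The nested lapply above can also be written as a cross join followed by an aggregation. The following is only a sketch in base R, not part of the original answer: it rebuilds the question's data with as.Date() so it runs on its own, and the names grid, contrib and long are made up for illustration.

```r
# Rebuild the question's data with base R only (no packages needed).
dates <- seq(as.Date("2018-01-01"), as.Date("2018-01-31"), by = "day")
tasks <- data.frame(
  category   = c("cat1", "cat1", "cat2", "cat2"),
  start.date = as.Date(c("2018-01-01", "2018-01-15", "2018-01-18", "2018-01-25")),
  end.date   = as.Date(c("2018-01-07", "2018-01-27", "2018-02-15", "2018-01-31")),
  value      = c(1, 3, 5, 7),
  stringsAsFactors = FALSE
)

# Cross join: one row per (day, task) pair, 31 * 4 = 124 rows.
grid <- merge(data.frame(Date = dates), tasks, by = NULL)

# A task contributes its value on days inside [start.date, end.date], else 0.
grid$contrib <- ifelse(grid$Date >= grid$start.date & grid$Date <= grid$end.date,
                       grid$value, 0)

# Sum the contributions per category and day (long format, like results2).
long <- aggregate(contrib ~ category + Date, data = grid, FUN = sum)
long <- long[order(long$category, long$Date), ]
names(long)[names(long) == "contrib"] <- "value"
```

Because every (day, task) pair is kept and inactive pairs contribute 0, days with no active task still appear with a value of 0, which matches the desired output in the question.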

graphics.off()
ggplot(DF1, aes(Date, value, color = category)) + geom_line()
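
The question also accepts a wide layout (results1), which the answer does not show. A minimal base R sketch of that layout follows, assuming the same data as in the question; the names wide, sub and cat_name are hypothetical, and the data is rebuilt here so the block runs on its own.

```r
# Rebuild the question's data with base R only.
dates <- seq(as.Date("2018-01-01"), as.Date("2018-01-31"), by = "day")
tasks <- data.frame(
  category   = c("cat1", "cat1", "cat2", "cat2"),
  start.date = as.Date(c("2018-01-01", "2018-01-15", "2018-01-18", "2018-01-25")),
  end.date   = as.Date(c("2018-01-07", "2018-01-27", "2018-02-15", "2018-01-31")),
  value      = c(1, 3, 5, 7),
  stringsAsFactors = FALSE
)

# Start with one row per day, then add one column per category.
wide <- data.frame(Date = dates)
for (cat_name in unique(tasks$category)) {
  sub <- tasks[tasks$category == cat_name, ]
  # For each day, sum the values of this category's tasks that cover it.
  wide[[cat_name]] <- vapply(seq_along(dates), function(i) {
    sum(sub$value[sub$start.date <= dates[i] & sub$end.date >= dates[i]])
  }, numeric(1))
}
```

Indexing with seq_along(dates) rather than iterating over the dates themselves keeps each element a Date, so the comparisons against start.date and end.date behave as expected.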


Regarding "r - Sum values between start date and end date, for categories, for every day in a period of time in R", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/48569520/
