
r - Creating "Per Day" rows from selective "Per Month" figures using tidyverse

Reposted. Author: 行者123. Updated: 2023-12-05 03:00:28

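I have a set of sales reports in which stores report their sales figures either "per day" or "per month". When I plot them on the same chart, the "per month" figures show up as spikes, which makes the chart hard to read.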

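I would like to convert those "per month" figures so that they are spread evenly across the days of the month, which would let me plot daily sales.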

I managed to compute a "sales_per_day" column in the dataset using tidyverse and lubridate. How do I create one row per day, i.e. for 2019-01, turn the single monthly row into 30 daily rows?

sales <- tibble(
  distributor = c("StoreA", "StoreA", "StoreA", "StoreA", "StoreB"),
  sales = c(100, 200, 300, 400, 5000),
  date = c("2019-01-01", "2019-01-02", "2019-01-03", "2019-01-04", "2019-01-30"),
  freq = c("daily", "daily", "daily", "daily", "monthly"))

> sales
# A tibble: 5 x 4
  distributor sales date       freq
  <chr>       <dbl> <chr>      <chr>
1 StoreA        100 2019-01-01 daily
2 StoreA        200 2019-01-02 daily
3 StoreA        300 2019-01-03 daily
4 StoreA        400 2019-01-04 daily
5 StoreB       5000 2019-01-30 monthly


wanted_sales <- tibble(
  distributor = c("StoreA", "StoreA", "StoreA", "StoreA", "StoreB", "StoreB", "StoreB", "StoreB"),
  sales = c(100, 200, 300, 400, 5000 / 30, 5000 / 30, 5000 / 30, 5000 / 30),
  date = c("2019-01-01", "2019-01-02", "2019-01-03", "2019-01-04", "2019-01-01", "2019-01-02", "2019-01-03", "2019-01-04"),
  freq = c("daily", "daily", "daily", "daily", "daily", "daily", "daily", "daily"))

> wanted_sales
# A tibble: 8 x 4
  distributor sales date       freq
  <chr>       <dbl> <chr>      <chr>
1 StoreA       100  2019-01-01 daily
2 StoreA       200  2019-01-02 daily
3 StoreA       300  2019-01-03 daily
4 StoreA       400  2019-01-04 daily
5 StoreB       167. 2019-01-01 daily
6 StoreB       167. 2019-01-02 daily
7 StoreB       167. 2019-01-03 daily
8 StoreB       167. 2019-01-04 daily

per_day <- sales %>%
  filter(freq == "monthly") %>%
  group_by(date) %>%
  mutate(mdays = as.integer(days_in_month(as_date(date)))) %>%
  mutate(sales_per_day = sales / mdays)

> per_day
# A tibble: 1 x 6
# Groups:   date [1]
  distributor sales date       freq    mdays sales_per_day
  <chr>       <dbl> <chr>      <chr>   <int>         <dbl>
1 StoreB       5000 2019-01-30 monthly    31          161.

I would like the per_day tibble to end up with 30 rows, with the $date column being the sequence "2019-01-01", "2019-01-02", ... "2019-01-30".

Best answer

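We can convert date to an actual Date class and create a new column, startdate, which holds the first day of that particular month when freq is not "daily"; in that case sales is also divided by 30. For each date we then use complete to create the sequence of dates, and set every freq to "daily".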
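
The two building blocks can be tried in isolation first. The sketch below is not part of the original answer, and the names `report_date`, `start` and `days` are made up for illustration. It takes the StoreB report date from the question, finds the first day of its month with `floor_date()`, and builds the daily sequence with `seq()`.

```r
library(lubridate)

# Report date of the single monthly row in the question (StoreB)
report_date <- as.Date("2019-01-30")

# First day of the month that contains the report date
start <- floor_date(report_date, "month")

# One Date per day from the start of the month up to the report date
days <- seq(start, report_date, by = "1 day")

# Number of daily rows the monthly row would expand to
length(days)

# Monthly total spread evenly over those days
5000 / length(days)
```

January has 31 days, but the sequence here stops at the report date (the 30th), which is why dividing by 30 lines up with the expected output in the question for this particular row.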

library(dplyr)
library(tidyr)
library(lubridate)

sales %>%
  mutate(date = as.Date(date),
         startdate = if_else(freq == "daily", date, floor_date(date, "month")),
         sales = if_else(freq == "daily", sales, sales/30)) %>%
  group_by(date) %>%
  complete(date = seq(startdate, date, "1 day"), sales = sales,
           freq = "daily", distributor = distributor) %>%
  select(-startdate)

# Groups:   date [30]
#   date       sales freq  distributor
#   <date>     <dbl> <chr> <chr>
# 1 2019-01-01  100  daily StoreA
# 2 2019-01-02  200  daily StoreA
# 3 2019-01-03  300  daily StoreA
# 4 2019-01-04  400  daily StoreA
# 5 2019-01-01  167. daily StoreB
# 6 2019-01-02  167. daily StoreB
# 7 2019-01-03  167. daily StoreB
# 8 2019-01-04  167. daily StoreB
# 9 2019-01-05  167. daily StoreB
#10 2019-01-06  167. daily StoreB
# … with 25 more rows
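
Newer tidyr releases (to my knowledge, from 1.2.0 on) no longer let `complete()` expand a grouping column, so the pipeline above may raise an error there. The following is an alternative sketch rather than the original answer's method: it counts how many days each row should expand to, repeats the row that many times with `tidyr::uncount()`, and rebuilds the dates from an index. The names `n_days`, `day_index` and `daily_sales` are illustrative, and the monthly total is divided by the number of days actually generated instead of a hard-coded 30.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Sample data from the question
sales <- tibble(
  distributor = c("StoreA", "StoreA", "StoreA", "StoreA", "StoreB"),
  sales = c(100, 200, 300, 400, 5000),
  date = c("2019-01-01", "2019-01-02", "2019-01-03", "2019-01-04", "2019-01-30"),
  freq = c("daily", "daily", "daily", "daily", "monthly")
)

daily_sales <- sales %>%
  mutate(
    date = as.Date(date),
    # Monthly rows start on the first of their month; daily rows start on their own date
    startdate = if_else(freq == "daily", date, floor_date(date, "month")),
    # Days covered by each row: 1 for daily rows, 30 for the StoreB row
    n_days = as.integer(date - startdate) + 1L,
    # Spread the reported figure evenly over the covered days
    sales = sales / n_days
  ) %>%
  # Repeat each row n_days times; day_index counts 1..n_days within each original row
  uncount(n_days, .id = "day_index") %>%
  mutate(date = startdate + day_index - 1L, freq = "daily") %>%
  select(distributor, sales, date, freq)

daily_sales
```

With the sample data this gives 4 StoreA rows plus 30 StoreB rows, and the total of the sales column is unchanged by the spreading. If a monthly figure should cover the full calendar month rather than the days up to the report date, `n_days` for monthly rows could instead come from `lubridate::days_in_month()`, as in the question's own attempt.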

Regarding r - Creating "Per Day" rows from selective "Per Month" figures using tidyverse, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/56770224/
