
r - Summarise in dplyr and insert 0 for categories with no values

Reposted. Author: 行者123. Updated: 2023-12-05 01:30:50

Suppose you have data like this:

library(dplyr)

set.seed(2021)

age <- floor(runif(35, min = 20, max = 25))

dat <- data.frame(age)

dat %>%
  mutate(education = sample(c("Low", "Mid-level", "High"),
                            size = nrow(dat), prob = c(0.55, 0.2, 0.25), replace = TRUE)) %>%
  group_by(age, education) %>%
  summarise(n = n())

Result:

     age education     n
   <dbl> <chr>     <int>
 1    20 High          1
 2    20 Low           2
 3    21 Low           3
 4    21 Mid-level     2
 5    22 High          2
 6    22 Low           4
 7    23 Low           4
 8    23 Mid-level     2
 9    24 High          1
10    24 Low          10
11    24 Mid-level     4

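As you can see, there is no count for "Mid-level" education at age 20, for example, so that category is left out of the data frame. Is it possible to show it with a value of 0 instead?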

For example:

# A tibble: 11 x 3
# Groups:   age [5]
     age education     n
   <dbl> <chr>     <int>
 1    20 High          1
 2    20 Low           2
 3    20 Mid-level     0

Best answer

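Instead of group_by and summarise, you can use count with .drop = FALSE as an argument. The education column has to be a factor first, because only factor levels can be kept when they have no rows, so you could try adding this at the end of the pipeline: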

  count(age, as.factor(education), .drop = FALSE) 
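To see why the factor conversion matters, here is a minimal sketch on a small hand-made data frame. The names toy, n_chr and n_fct are hypothetical and not part of the original question. It assumes dplyr is installed, and it only illustrates the behaviour described above.

```r
library(dplyr)

# Hypothetical toy data: age 20 has no "Mid-level" row, age 21 has only "Mid-level"
toy <- data.frame(
  age = c(20, 20, 21),
  education = c("Low", "High", "Mid-level")
)

# Character column: .drop = FALSE has no unused levels to keep,
# so only the observed combinations are returned
n_chr <- toy %>%
  count(age, education, .drop = FALSE)

# Factor column: every level is kept for each observed age,
# and combinations with no rows get n = 0
n_fct <- toy %>%
  mutate(education = factor(education, levels = c("Low", "Mid-level", "High"))) %>%
  count(age, education, .drop = FALSE)

nrow(n_chr)
nrow(n_fct)
```

Note that only the factor's levels are expanded. age is numeric here, so ages that never occur in the data still do not appear in the result.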

Edit: put the factor levels in order for a cleaner result

dat %>%
  mutate(education = sample(
    c("Low", "Mid-level", "High"),
    size = nrow(dat),
    prob = c(0.55, 0.2, 0.25),
    replace = TRUE
  )) %>%
  # convert to factor with levels in specified order
  mutate(education = factor(education, levels = c("Low", "Mid-level", "High"))) %>%
  count(age, education, .drop = FALSE)

Result:

   age education  n
1   20       Low  2
2   20 Mid-level  0
3   20      High  1
4   21       Low  3
5   21 Mid-level  2
6   21      High  0
7   22       Low  4
8   22 Mid-level  0
9   22      High  2
10  23       Low  4
11  23 Mid-level  2
12  23      High  0
13  24       Low 10
14  24 Mid-level  4
15  24      High  1
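A different technique, not the one used in this answer, is to count first and then fill in the missing combinations with tidyr::complete. This also works when education stays a character column. The sketch below assumes tidyr is installed and uses a hypothetical toy data frame rather than the question's data.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: age 20 has no "Mid-level" row, age 21 has only "Mid-level"
toy <- data.frame(
  age = c(20, 20, 21),
  education = c("Low", "High", "Mid-level")
)

filled <- toy %>%
  count(age, education) %>%
  # add every age/education combination seen in the data,
  # setting n to 0 where the combination had no rows
  complete(age, education, fill = list(n = 0L))

filled
```

Here the combinations are built from the values that actually occur in each column, so a category that never appears anywhere in the data would still be missing. The factor approach above avoids that by listing the levels explicitly.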

Regarding "r - Summarise in dplyr and insert 0 for categories with no values", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/66724028/
