
R dplyr: group by column X and summarise the remaining columns

Reposted. Author: 行者123. Updated: 2023-12-01 10:23:11

I am using the following dataset as an example:

  Age Gender CarType Group Education
1  46   Male   Sedan     1        BS
2  37   Male     SUV     1        MS
3  47 Female   Sedan     2       PhD
4  20   Male     SUV     2        HS
5  41   Male     SUV     1        MS
6  52   Male   Sedan     2        MS
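
For anyone who wants to reproduce the problem, the table above can be rebuilt as a data frame roughly like this. This is a minimal sketch: the column types (integer Age and Group, character for the other columns) are an assumption, since the post does not show the structure of the original object. The name df1 follows the answer, and df is an alias because the question's own code calls it df.

```r
# Sample data retyped from the table in the question.
# Assumption: Age and Group are integers, the remaining columns are plain
# character vectors (not factors).
df1 <- data.frame(
  Age       = c(46L, 37L, 47L, 20L, 41L, 52L),
  Gender    = c("Male", "Male", "Female", "Male", "Male", "Male"),
  CarType   = c("Sedan", "SUV", "Sedan", "SUV", "SUV", "Sedan"),
  Group     = c(1L, 1L, 2L, 2L, 1L, 2L),
  Education = c("BS", "MS", "PhD", "HS", "MS", "MS"),
  stringsAsFactors = FALSE
)

# The question refers to the same data as df.
df <- df1

print(df1)
```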

My goal is to group by the Group variable and then show, for each group, summary statistics for every other column, like this:

Group  Male  Female  Female-Mean-age  Male-Mean-age  Sedan  SUV  PhD  BS  MS
1      3     0       0                41.3           1      2    0    1   2

df %>% group_by(Group) %>% summarise(n = n()) only gives the number of rows per group. When I try to add a mutate step to collect the count for each gender, I get an error:

df %>% group_by(Group, Gender) %>% summarize(n=n()) %>% mutate(male = count('Male'))

Do I need to include every column in group_by so that I can access sums or counts later, or what is the best way to do this?

Best Answer

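One option is to gather the categorical columns into "long" format, get the count for each value of each column, and spread the result back to "wide" format. Then join it with the mean of "Age" computed by "Group" and "Gender":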

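library(tidyr)
library(dplyr)

# Counts: reshape the categorical columns to long form, count each value
# per Group, then spread the values back out into one column each.
# gather()/spread() still work but are superseded by
# pivot_longer()/pivot_wider() in tidyr 1.0 and later.
res1 <- gather(df1, key, val, Gender, CarType, Education) %>%
  group_by(Group, key, val) %>%
  summarise(n = n()) %>%
  ungroup() %>%
  select(-key) %>%
  spread(val, n, fill = 0)

# Mean Age per Group and Gender, spread into Female_Mean / Male_Mean.
res2 <- df1 %>%
  group_by(Group, Gender) %>%
  summarise(Age_Mean = mean(Age)) %>%
  mutate(Gender = paste0(Gender, "_Mean")) %>%
  spread(Gender, Age_Mean, fill = 0)

# Join the two summaries on the shared Group column.
left_join(res1, res2, by = "Group")
# A tibble: 2 x 11
#   Group    BS Female    HS  Male    MS   PhD Sedan   SUV Female_Mean Male_Mean
#   <int> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>       <dbl>     <dbl>
# 1     1  1.00   0     0     3.00  2.00  0     1.00  2.00         0        41.3
# 2     2  0      1.00  1.00  2.00  1.00  1.00  2.00  1.00        47.0      36.0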
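
On the question of whether every column has to go into group_by: it does not seem necessary. As a rough sketch of a more direct route (not the method from the answer above), one can group by Group alone and use conditional sums inside summarise. The data frame is retyped from the question with assumed column types, and mean_or_zero is a hypothetical helper introduced here only so that an empty subset yields 0, as in the requested layout, rather than NaN.

```r
library(dplyr)

# Sample data retyped from the question; column types are an assumption.
df1 <- data.frame(
  Age       = c(46L, 37L, 47L, 20L, 41L, 52L),
  Gender    = c("Male", "Male", "Female", "Male", "Male", "Male"),
  CarType   = c("Sedan", "SUV", "Sedan", "SUV", "SUV", "Sedan"),
  Group     = c(1L, 1L, 2L, 2L, 1L, 2L),
  Education = c("BS", "MS", "PhD", "HS", "MS", "MS"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: mean() of an empty vector is NaN, so return 0 instead
# to match the layout requested in the question.
mean_or_zero <- function(x) if (length(x) == 0) 0 else mean(x)

res <- df1 %>%
  group_by(Group) %>%
  summarise(
    # sum() over a logical vector counts the TRUE values in each group.
    Male        = sum(Gender == "Male"),
    Female      = sum(Gender == "Female"),
    # Subset Age by Gender within the group before averaging.
    Female_Mean = mean_or_zero(Age[Gender == "Female"]),
    Male_Mean   = mean_or_zero(Age[Gender == "Male"]),
    Sedan       = sum(CarType == "Sedan"),
    SUV         = sum(CarType == "SUV"),
    PhD         = sum(Education == "PhD"),
    BS          = sum(Education == "BS"),
    MS          = sum(Education == "MS"),
    HS          = sum(Education == "HS")
  )

print(res)
```

The trade-off is that every level has to be written out by hand here, whereas the reshaping approach in the answer picks up whatever levels are present in the data.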

Regarding "R dplyr: group by column X and summarise the remaining columns", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/49121093/
