
R: keeping zero-length groups in a frequency table

Reposted. Author: 行者123. Updated: 2023-12-04 10:41:23

I have the following dataset:

gender<-c('male' ,'male', 'male', 'male','male',
'female', 'female', 'female','female', 'female' ,'female', 'female', 'female','female')
clothes<-c('hat', 'hat', 'hat', 'shirt', 'shirt', 'hat', 'hat', 'hat', 'shirt', 'shirt', 'shirt', 'dress', 'dress', 'dress')
color<-c('blue', 'blue', 'green', 'blue', 'brown', 'green', 'brown', 'brown', 'blue', 'green', 'green', 'blue', 'green', 'green')
x<-data.frame(gender, clothes, color)

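I need a frequency table of gender by clothes by color, with NA used only for missing colors. Each gender and every level of clothes that occurs for it should be paired with all 3 levels of color. However, one gender level has no rows for the clothes level 'dress'. I don't want that combination filled with NA; I want it omitted entirely.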

I tried counting:
library(dplyr)  # provides %>%, group_by() and tally()
x$color <- as.factor(x$color)
x_agg <- x %>%
  group_by(gender, clothes, color) %>%
  tally()

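That does not do the job at all: only the combinations that actually occur are returned, so I get no NA rows for the missing levels of any variable.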

When I apply the following code:
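library(tidyverse)
x_agg <- x %>%
  group_by(gender, clothes, color) %>%
  summarise(cnt = n()) %>%
  ungroup() %>%
  # complete() crosses all three variables, so male/dress rows are created too.
  # Note: fill names N, but the count column is cnt, so missing counts stay NA.
  complete(gender, clothes, color,
           fill = list(N = 0))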

I get NA for male / dress in every color. What I want is NA only for the last grouping variable (color), not for combinations of clothes and color. Like this:
gender<-c('male' ,'male', 'male', 'male','male','male',
'female', 'female', 'female','female', 'female' ,'female', 'female', 'female','female')
clothes<-c('hat', 'hat', 'hat', 'shirt', 'shirt', 'shirt',
'hat', 'hat', 'hat', 'shirt', 'shirt', 'shirt', 'dress', 'dress', 'dress')
color<-c('blue', 'green', 'brown',
'blue', 'green', 'brown',
'blue', 'green', 'brown',
'blue', 'green', 'brown',
'blue', 'green', 'brown')
cnt<-c(2, 1, NA, 1, NA, 1, NA, 1, 2, 1, 2, NA, 1, 2, NA)
x_agg1<-data.frame(gender, clothes, color, cnt)


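I think I have tried everything I could come up with. There are suggestions on Stack Overflow, but they all deal either with grouping by a single variable or with filling NA for every level of every grouping variable. What to do when the levels of only one variable need to be filled or kept is not clear. Any suggestions?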

Best answer

Instead of group_by/summarise, we can also use count, then group by gender and clothes and complete only color:

library(dplyr)
library(tidyr)
x %>%
count(gender, clothes, color) %>%
group_by(gender, clothes) %>%
complete(color)
# A tibble: 15 x 4
# Groups: gender, clothes [6]
# gender clothes color n
# <fct> <fct> <fct> <int>
# 1 female dress blue 1
# 2 female dress brown NA
# 3 female dress green 2
# 4 female hat blue NA
# 5 female hat brown 2
# 6 female hat green 1
# 7 female shirt blue 1
# 8 female shirt brown NA
# 9 female shirt green 2
#10 male hat blue 2
#11 male hat brown NA
#12 male hat green 1
#13 male shirt blue 1
#14 male shirt brown 1
#15 male shirt green NA
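
The output above shows every column as a factor, which is what data.frame() produced by default before R 4.0. A grouped complete() works within each group, and it can only add colors a group never saw because color is a factor carrying all three levels. The sketch below is not part of the original answer. It rebuilds the question's data as character columns (assuming the R 4.0+ default), converts color to a factor explicitly, and also shows an ungrouped alternative using tidyr's nesting(). The names by_group and by_nesting are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Same data as in the question, kept as character columns on purpose
x <- data.frame(
  gender  = c(rep("male", 5), rep("female", 9)),
  clothes = c("hat", "hat", "hat", "shirt", "shirt",
              "hat", "hat", "hat", "shirt", "shirt", "shirt",
              "dress", "dress", "dress"),
  color   = c("blue", "blue", "green", "blue", "brown",
              "green", "brown", "brown", "blue", "green", "green",
              "blue", "green", "green"),
  stringsAsFactors = FALSE
)

# Variant 1: the answer's approach, with color made a factor explicitly
# so that complete() can expand to all of its levels inside each group
by_group <- x %>%
  mutate(color = factor(color)) %>%
  count(gender, clothes, color) %>%
  group_by(gender, clothes) %>%
  complete(color) %>%
  ungroup()

# Variant 2: no grouping; nesting() keeps only the gender/clothes pairs
# that occur in the data and crosses them with every observed color
by_nesting <- x %>%
  count(gender, clothes, color) %>%
  complete(nesting(gender, clothes), color)

print(by_nesting)
```

Both variants should leave male / dress out entirely, because that pair never occurs in the data. If zeros are preferred to NA, passing fill = list(n = 0L) to complete() is the usual way to get them.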

For more on keeping zero-length groups in a frequency table in R, see the similar question on Stack Overflow: https://stackoverflow.com/questions/59911862/
