
R - dplyr cross-tabulation of melted paired data

Reposted. Author: 行者123. Updated: 2023-12-04 10:38:21

I would like to know how to build a cross-tabulation from melted (long-format) data using dplyr. My data looks like this.

        idmen sexe         dip14_rec
1  0110008218    1               Uni
2  0110008218    2 Primary-Secondary
3  0110010366    1               Uni
4  0110010366    2               Uni
5  0110011567    1 Primary-Secondary
6  0110011567    2 Primary-Secondary
7  0110012163    2 Primary-Secondary
8  0110012163    1 Primary-Secondary
9  0110016580    2               Uni
10 0110016580    1        No Diploma

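What I want is a cross-tabulation of dip14_rec within each idmen pair, that is, the value for sexe == 1 against the value for sexe == 2.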

The only way I have found is

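# Split the long data by sex
dta1 = dta %>% filter(sexe == 1)
dta2 = dta %>% filter(sexe == 2)

# Join the two halves on idmen; merge() suffixes the duplicated columns with .x and .y
dta12 = merge(dta1, dta2, by = 'idmen')
table(Men = dta12$dip14_rec.x, Women = dta12$dip14_rec.y)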

This gives me the output I want:

#                    Women
# Men                 No Diploma Primary-Secondary Uni
#   No Diploma                 0                 0   1
#   Primary-Secondary          0                 2   0
#   Uni                        0                 1   1

Is there a more direct way to do this using dplyr syntax?

Thanks

dta = structure(c("0110008218", "0110008218", "0110010366", "0110010366", 
"0110011567", "0110011567", "0110012163", "0110012163", "0110016580",
"0110016580", "1", "2", "1", "2", "1", "2", "2", "1", "2", "1",
"Uni", "Primary-Secondary", "Uni", "Uni", "Primary-Secondary",
"Primary-Secondary", "Primary-Secondary", "Primary-Secondary",
"Uni", "No Diploma"), .Dim = c(10L, 3L), .Dimnames = list(NULL,
c("idmen", "sexe", "dip14_rec")))
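
Note that the dput() output above carries a .Dim attribute, so it rebuilds a character matrix rather than a data frame, while dplyr verbs such as filter() are designed for data frames. A minimal sketch of one way to convert it, assuming that the three diploma categories seen in the sample are the full set of levels (the name m is only illustrative):

```r
# Rebuild the matrix exactly as posted in the question.
m <- structure(c("0110008218", "0110008218", "0110010366", "0110010366",
"0110011567", "0110011567", "0110012163", "0110012163", "0110016580",
"0110016580", "1", "2", "1", "2", "1", "2", "2", "1", "2", "1",
"Uni", "Primary-Secondary", "Uni", "Uni", "Primary-Secondary",
"Primary-Secondary", "Primary-Secondary", "Primary-Secondary",
"Uni", "No Diploma"), .Dim = c(10L, 3L), .Dimnames = list(NULL,
c("idmen", "sexe", "dip14_rec")))

# Convert to a data frame; idmen stays character so the leading zeros survive.
dta <- as.data.frame(m, stringsAsFactors = FALSE)
dta$sexe <- as.integer(dta$sexe)

# Assumption: these three levels are the complete set of categories.
dta$dip14_rec <- factor(dta$dip14_rec,
                        levels = c("No Diploma", "Primary-Secondary", "Uni"))
str(dta)
```

Storing dip14_rec as a factor matters for the result: table() then keeps a row and a column for every level, including categories that never occur for one of the sexes.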

Best answer

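You can simply spread the data into wide format (one column per value of sexe) and then run the table function, specifying dnn to name the two dimensions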

library(dplyr)
library(tidyr)
dta %>%
  spread(sexe, dip14_rec) %>%   # one row per idmen, columns `1` and `2`
  select(-idmen) %>%            # keep only the two diploma columns
  table(., dnn = c("Men", "Women"))
#                    Women
# Men                 No Diploma Primary-Secondary Uni
#   No Diploma                 0                 0   1
#   Primary-Secondary          0                 2   0
#   Uni                        0                 1   1
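
In recent tidyr releases spread() is superseded by pivot_wider(). The same reshaping can be sketched with it as follows. This is an illustration rather than part of the original answer: the sexe_ column prefix and the names lv, wide and tab are arbitrary choices, and the data frame is rebuilt by hand from the values in the Data section.

```r
library(dplyr)
library(tidyr)

# Same values as the data frame in the Data section, written out by hand.
lv <- c("No Diploma", "Primary-Secondary", "Uni")
dta <- data.frame(
  idmen = rep(c(110008218L, 110010366L, 110011567L, 110012163L, 110016580L),
              each = 2),
  sexe = c(1L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L),
  dip14_rec = factor(c("Uni", "Primary-Secondary", "Uni", "Uni",
                       "Primary-Secondary", "Primary-Secondary",
                       "Primary-Secondary", "Primary-Secondary",
                       "Uni", "No Diploma"), levels = lv)
)

# One row per idmen, with columns sexe_1 and sexe_2 holding the diploma level.
wide <- dta %>%
  pivot_wider(names_from = sexe, values_from = dip14_rec,
              names_prefix = "sexe_")

# Cross-tabulate the two columns; the argument names label the dimensions.
tab <- table(Men = wide$sexe_1, Women = wide$sexe_2)
tab
```

Prefixing the new column names avoids having to quote non-syntactic names such as `1` and `2` with backticks.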

Or similarly using data.table

library(data.table) # v1.9.6+
dcast(setDT(dta), idmen ~ sexe)[, table(Men = `1`, Women = `2`)]
# Using 'dip14_rec' as value column. Use 'value.var' to override
#                    Women
# Men                 No Diploma Primary-Secondary Uni
#   No Diploma                 0                 0   1
#   Primary-Secondary          0                 2   0
#   Uni                        0                 1   1
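
The message in the output above appears because dcast() has to guess which column holds the values. A small sketch of the same call with value.var given explicitly, which should avoid the guess; the names dt, wide and tab are illustrative, and the data are rebuilt by hand from the Data section:

```r
library(data.table)

# Same values as the data frame in the Data section, built directly as a data.table.
dt <- data.table(
  idmen = rep(c(110008218L, 110010366L, 110011567L, 110012163L, 110016580L),
              each = 2),
  sexe = c(1L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L),
  dip14_rec = factor(c("Uni", "Primary-Secondary", "Uni", "Uni",
                       "Primary-Secondary", "Primary-Secondary",
                       "Primary-Secondary", "Primary-Secondary",
                       "Uni", "No Diploma"),
                     levels = c("No Diploma", "Primary-Secondary", "Uni"))
)

# Name the value column explicitly instead of letting dcast() guess it.
wide <- dcast(dt, idmen ~ sexe, value.var = "dip14_rec")

# Columns `1` and `2` are the two sexes; cross-tabulate them.
tab <- wide[, table(Men = `1`, Women = `2`)]
tab
```

Building a separate data.table also leaves the original object alone, whereas setDT(dta) converts dta in place.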

Data

dta <- structure(list(idmen = c(110008218L, 110008218L, 110010366L, 
110010366L, 110011567L, 110011567L, 110012163L, 110012163L, 110016580L,
110016580L), sexe = c(1L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L),
dip14_rec = structure(c(3L, 2L, 3L, 3L, 2L, 2L, 2L, 2L, 3L,
1L), .Label = c("No Diploma", "Primary-Secondary", "Uni"), class = "factor")), .Names = c("idmen",
"sexe", "dip14_rec"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10"))

A similar question about "R - dplyr cross-tabulation of melted paired data" can be found on Stack Overflow: https://stackoverflow.com/questions/34270833/
