
r - Counting occurrences of a value on melted data using dplyr

Reposted. Author: 行者123. Updated: 2023-12-02 15:23:58

After melting my data, I am trying to build a simple table, but using dplyr.

My data looks like this

   cluster   21:30   21:45
4 c alone alone
6 b % %
12 e partner partner
14 b partner partner
20 b alone alone
22 c partner partner

Using table I can simply do

table(dta$cluster)
a b c d e
2 8 5 1 4

How can I get the same result using melt and summarise?

library(dplyr)
library(reshape2)

dta %>%
  melt(id.vars = 'cluster') %>%
  group_by(cluster) %>%
  summarise( n() )

What I really need is the per-cluster count attached to the data after melting.

So the count has to come out right on this data.frame

dta %>%
  melt(id.vars = 'cluster')

The expected output is this

      cluster variable   value n_cluster
1 a 21:30 . 2
2 a 21:30 nuclear 2
3 a 21:45 . 2
4 a 21:45 nuclear 2
5 b 21:30 % 8
6 b 21:30 partner 8
7 b 21:30 alone 8
8 b 21:30 partner 8
9 b 21:30 partner 8
10 b 21:30 nuclear 8
11 b 21:30 partner 8
12 b 21:30 partner 8
13 b 21:45 % 8
14 b 21:45 partner 8
15 b 21:45 alone 8
16 b 21:45 partner 8
17 b 21:45 partner 8
18 b 21:45 nuclear 8
19 b 21:45 partner 8
20 b 21:45 partner 8
21 c 21:30 alone 5
22 c 21:30 partner 5
23 c 21:30 % 5
24 c 21:30 partner 5
25 c 21:30 partner 5
26 c 21:45 alone 5
27 c 21:45 partner 5
28 c 21:45 % 5
29 c 21:45 partner 5
30 c 21:45 partner 5
31 d 21:30 partner 1
32 d 21:45 alone 1
33 e 21:30 partner 4
34 e 21:30 nuclear 4
35 e 21:30 nuclear 4
36 e 21:30 nuclear 4
37 e 21:45 partner 4
38 e 21:45 nuclear 4
39 e 21:45 nuclear 4
40 e 21:45 nuclear 4

Any ideas?

dta = structure(list(cluster = structure(c(3L, 2L, 5L, 2L, 2L, 3L, 
5L, 3L, 1L, 3L, 1L, 2L, 5L, 3L, 2L, 2L, 2L, 2L, 4L, 5L), .Label = c("a",
"b", "c", "d", "e"), class = "factor"), `21:30` = structure(c(2L,
7L, 5L, 5L, 2L, 5L, 4L, 7L, 1L, 5L, 4L, 5L, 4L, 5L, 5L, 4L, 5L,
5L, 5L, 4L), .Label = c(".", "alone", "children", "nuclear",
"partner", "*", "%"), class = "factor"), `21:45` = structure(c(2L,
7L, 5L, 5L, 2L, 5L, 4L, 7L, 1L, 5L, 4L, 5L, 4L, 5L, 5L, 4L, 5L,
5L, 2L, 4L), .Label = c(".", "alone", "children", "nuclear",
"partner", "*", "%"), class = "factor")), .Names = c("cluster",
"21:30", "21:45"), row.names = c("4", "6", "12", "14", "20",
"22", "23", "28", "30", "32", "36", "38", "40", "42", "44", "48",
"50", "56", "57", "60"), class = "data.frame")

Best answer

I can't seem to find a good dupe for this, but a simple dplyr idiom is to just use count

count(dta, cluster)
# Source: local data frame [5 x 2]
#
# cluster n
# 1 a 2
# 2 b 8
# 3 c 5
# 4 d 1
# 5 e 4
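
For context, count() is shorthand for grouping and then summarising with n(). The sketch below illustrates that equivalence. It rebuilds only the cluster column from the dput() in the question, and the names dta_small, by_hand and short are made up for this example.

```r
library(dplyr)

# Cluster column rebuilt from the dput() in the question;
# `dta_small` is an illustrative name, not part of the original post.
cluster <- factor(c("c", "b", "e", "b", "b", "c", "e", "c", "a", "c",
                    "a", "b", "e", "c", "b", "b", "b", "b", "d", "e"))
dta_small <- data.frame(cluster = cluster)

# Long form: group, then count the rows in each group
by_hand <- dta_small %>%
  group_by(cluster) %>%
  summarise(n = n())

# Short form: count() does the grouping and the summarising in one call
short <- count(dta_small, cluster)
```

Both objects should hold the same five counts as table(dta$cluster). Running the count on the melted data instead would count each original row once per time column, which is why the counts here come from the unmelted data.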

Based on your new desired output, you can join this result onto your melted data set

dta %>%
  melt(id.vars = 'cluster') %>%
  # join the per-cluster counts from the unmelted data onto every melted row
  left_join(., count(dta, cluster), by = 'cluster') %>%
  arrange(cluster)
# cluster variable value n
# 1 a 21:30 . 2
# 2 a 21:30 nuclear 2
# 3 a 21:45 . 2
# 4 a 21:45 nuclear 2
# 5 b 21:30 % 8
# 6 b 21:30 partner 8
# 7 b 21:30 alone 8
#...
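
A possible alternative to the join, not part of the original answer, is to attach the count before melting and carry it along as an id variable. This is only a sketch: it assumes a dplyr version in which add_count() accepts the name argument, and the data frame toy is hypothetical data shaped like the question's.

```r
library(dplyr)
library(reshape2)

# Hypothetical toy data with the same shape as `dta` in the question
toy <- data.frame(
  cluster = c("a", "b", "b", "c"),
  `21:30` = c("alone", "partner", "alone", "%"),
  `21:45` = c("alone", "partner", "partner", "%"),
  check.names = FALSE,        # keep the time-like column names as they are
  stringsAsFactors = FALSE
)

result <- toy %>%
  # add the per-cluster row count while the data is still one row per case
  add_count(cluster, name = "n_cluster") %>%
  # keep the count as an id variable so it is repeated on every melted row
  melt(id.vars = c("cluster", "n_cluster")) %>%
  arrange(cluster)
```

Because the count is computed before melt() stacks the time columns, n_cluster reflects the number of original rows per cluster, not the number of melted rows.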

Regarding "r - Counting occurrences of a value on melted data using dplyr", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/31974877/
