
r - Convert wide to long using a frequency column

Reposted. Author: 行者123. Last updated: 2023-12-01 12:19:40

I am trying to convert my data.frame from a wide table to a long table using a frequency column.

data("UCBAdmissions")
ucb_admit <- as.data.frame(UCBAdmissions)
ucb_admit
Admit Gender Dept Freq
1 Admitted Male A 512
2 Rejected Male A 313
3 Admitted Female A 89
4 Rejected Female A 19
...

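I want to gather this data (tidyr package, similar to melt from reshape), but use Freq to specify how many times each row should be repeated.

So my target data would look like: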
     Admit      Gender Dept
1 Admitted Male A
2 Admitted Male A
3 Admitted Male A
4 Admitted Male A
5 Admitted Male A
6 Admitted Male A
...
4523 Rejected Female F
4524 Rejected Female F
4525 Rejected Female F
4526 Rejected Female F

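I would like to do this with tidyr::gather(), but my results are wrong because I am not sure whether, or how, to include the Freq column.

Thanks

Best answer

This doesn't look like a job for gather, because the data is aggregated rather than wide. You can "disaggregate" the data by indexing: repeat each row index as many times as that row's Freq value. Here is how to do it with base R and with dplyr.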

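library(dplyr)

# Base R: repeat each row index Freq times, then drop the Freq column
ucb_admit_disagg = ucb_admit[rep(seq_len(nrow(ucb_admit)), ucb_admit$Freq),
                             names(ucb_admit) != "Freq"]

# dplyr: same idea, with slice() doing the row repetition
ucb_admit_disagg = ucb_admit %>%
  slice(rep(seq_len(n()), Freq)) %>%
  select(-Freq)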
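
A quick sanity check can confirm that the disaggregation did what was intended. The sketch below is not part of the original answer; it rebuilds the result with base R so that it runs on its own, and the name recount is introduced only for illustration. It checks that there is one row per counted observation and that re-tabulating recovers the original frequencies.

```r
# Rebuild the disaggregated table with base R so this block is self-contained.
data("UCBAdmissions")
ucb_admit <- as.data.frame(UCBAdmissions)
ucb_admit_disagg <- ucb_admit[rep(seq_len(nrow(ucb_admit)), ucb_admit$Freq),
                              names(ucb_admit) != "Freq"]

# One row per original observation: the row count must equal the total frequency.
stopifnot(nrow(ucb_admit_disagg) == sum(ucb_admit$Freq))

# Re-tabulating the long data should reproduce the original counts
# (recount is a hypothetical helper name, not from the original answer).
recount <- as.data.frame(table(ucb_admit_disagg))
stopifnot(all(recount$Freq == ucb_admit$Freq))
```

If either stopifnot() call fails, the row repetition or the column selection went wrong somewhere.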

Here is part of the data frame. I added ellipses to the output to mark breaks in the row sequence.
ucb_admit_disagg[c(1:6, 510:514, 4523:4526), ]

          Admit Gender Dept
1 Admitted Male A
1.1 Admitted Male A
1.2 Admitted Male A
1.3 Admitted Male A
1.4 Admitted Male A
1.5 Admitted Male A
...
1.509 Admitted Male A
1.510 Admitted Male A
1.511 Admitted Male A
2 Rejected Male A
2.1 Rejected Male A
...
24.313 Rejected Female F
24.314 Rejected Female F
24.315 Rejected Female F
24.316 Rejected Female F
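
Since the question asked about tidyr specifically, it may be worth noting that reasonably recent versions of tidyr provide uncount(), which performs this kind of row repetition directly. This is an alternative that the original answer does not mention, shown here as a minimal sketch; the name ucb_admit_long is chosen only for illustration.

```r
library(tidyr)

data("UCBAdmissions")
ucb_admit <- as.data.frame(UCBAdmissions)

# uncount() repeats each row according to the weights column (here Freq) and,
# with the default .remove = TRUE, drops that column from the result.
# ucb_admit_long is an illustrative name, not from the original answer.
ucb_admit_long <- uncount(ucb_admit, Freq)

# The long table should have one row per counted observation.
stopifnot(nrow(ucb_admit_long) == sum(ucb_admit$Freq))
```

Unlike the base R indexing approach, the result does not carry row names such as 1.1 or 24.316, so the printed row labels will differ from the output shown above even though the rows are the same.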

This post, "r - Convert wide to long using a frequency column", is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/45445919/
