r - Aggregating similar columns across different lists in a nested list object

Reposted. Author: 行者123. Updated: 2023-12-05 01:54:01
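I have a list containing many iterations of simulated data. Each iteration is in its own data.frame and is saved in a named list. I need to aggregate, for example, the first data.frame of every list, the second of every list, the third, and so on, each grouped by age. The code below recreates the structure of the list I have.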

x <- list(list(
  list(a = data.frame(p = rnorm(100), age = sort(sample(seq(5, 30), 100, replace = TRUE))),
       b = data.frame(p = rnorm(100), age = sort(sample(seq(5, 30), 100, replace = TRUE))),
       c = data.frame(p = rnorm(100), age = sort(sample(seq(5, 30), 100, replace = TRUE))),
       d = data.frame(p = rnorm(100), age = sort(sample(seq(5, 30), 100, replace = TRUE)))),

  list(a = data.frame(p = rnorm(100), age = sort(sample(seq(5, 30), 100, replace = TRUE))),
       b = data.frame(p = rnorm(100), age = sort(sample(seq(5, 30), 100, replace = TRUE))),
       c = data.frame(p = rnorm(100), age = sort(sample(seq(5, 30), 100, replace = TRUE))),
       d = data.frame(p = rnorm(100), age = sort(sample(seq(5, 30), 100, replace = TRUE))))
))

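What I need is the mean and sd of p, grouped by age, across all the list elements named a, and likewise across all those named b, and so on. The result I want looks something like this, perhaps all as data.table or something else convenient. I just can't work with the nested structure well enough to get what I want.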

$a
mean_p sd_p age
1 9.453670 2.034949 5
2 11.881241 1.995676 6
3 9.979276 2.003178 7
4 10.909008 2.104870 8
5 9.338779 1.904653 9
6 11.745993 1.909569 10
7 8.019631 2.050843 11
8 8.875167 2.053025 12
9 10.697181 1.991607 13
10 11.656100 2.005437 14
11 11.960535 2.004246 15
12 10.343899 2.085225 16
13 9.573988 1.975635 17
14 9.038953 2.112180 18
15 9.131533 2.036852 19
16 13.644504 2.160581 20
17 10.284376 1.903301 21
18 9.543758 2.134177 22
19 9.658121 2.202386 23
20 10.633312 1.842427 24
21 11.100520 2.105879 25
22 11.237161 1.871875 26
23 11.530732 1.972589 27
24 9.042670 2.187250 28
25 9.855445 1.970171 29
26 10.649243 2.064264 30

$b
mean_p sd_p age
1 9.705460 1.860338 5
2 10.080478 2.109235 6
3 9.712833 2.017124 7
4 9.420388 2.040863 8
5 11.775058 1.955592 9
6 8.124517 2.046651 10
7 10.557953 1.799830 11
8 10.047775 2.001543 12
9 9.229939 1.966953 13
10 11.814084 2.163710 14
11 12.102374 2.105870 15
12 9.870014 1.866519 16
13 10.696258 2.076030 17
14 9.615747 1.987050 18
15 9.781690 1.961923 19
16 9.395733 1.980549 20
17 13.307485 2.115417 21
18 9.589766 2.058452 22
19 7.942926 2.121072 23
20 9.651580 2.178241 24
21 11.736841 1.996304 25
22 8.682040 1.883955 26
23 10.041262 2.143555 27
24 10.834982 2.086041 28
25 9.046422 2.013758 29
26 9.769026 2.023566 30

$c
mean_p sd_p age
1 10.022880 2.148975 5
2 12.535348 1.913299 6
3 8.431201 2.252942 7
4 9.930989 1.943403 8
5 9.391383 2.004252 9
6 9.217615 1.897260 10
7 10.974630 2.174417 11
8 10.475837 1.935946 12
9 9.291287 1.917856 13
10 9.191117 1.971489 14
11 9.986106 1.940689 15
12 10.249913 1.984423 16
13 10.802905 2.122448 17
14 10.582817 1.843136 18
15 9.197653 1.864674 19
16 10.648420 2.037330 20
17 10.457500 1.885780 21
18 9.291936 2.050027 22
19 11.137871 1.744456 23
20 9.148791 1.907282 24
21 10.157003 2.183199 25
22 12.019497 1.883032 26
23 10.890207 1.922753 27
24 10.305917 2.070391 28
25 9.355486 2.022310 29
26 10.405735 1.920850 30

$d
mean_p sd_p age
1 10.577719 2.157974 5
2 10.557474 2.126788 6
3 9.448008 1.959201 7
4 10.160101 2.021964 8
5 9.664677 2.035892 9
6 10.974770 2.101026 10
7 8.888659 2.026531 11
8 10.185955 2.092113 12
9 10.456310 2.100847 13
10 10.259347 2.091751 14
11 9.150137 2.002525 15
12 11.042025 1.991657 16
13 10.321668 2.102700 17
14 9.537923 1.866761 18
15 10.401667 1.966281 19
16 10.380466 1.934101 20
17 9.947381 1.805547 21
18 10.458567 1.853977 22
19 11.041953 1.970225 23
20 9.826557 1.680464 24
21 10.169353 2.079167 25
22 9.352873 1.907423 26
23 9.084426 2.148295 27
24 10.083584 2.019244 28
25 10.919343 2.099395 29
26 11.621675 2.013150 30

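Edit: First, thank you all for the detailed feedback. I have simplified the structure of my data object and used some existing code that Akrun had already provided elsewhere. I'm sure someone, somewhere, will find some of the code you wrote useful in the future, so I'm hesitant to close the question, but if that is what I should do, I will.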

Best answer

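We can loop over the outer list with map and transpose each element (transpose is from purrr), which puts all the 'a' elements together, all the 'b' elements together, and so on. Then flatten the list, combine each group with bind_rows, do a group_by on 'age', and get the mean and sd of the 'p' column.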

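library(dplyr)
library(purrr)

# Turn each list of iterations inside out: list(a = <all a's>, b = <all b's>, ...)
map(x, purrr::transpose) %>%
  # Drop the outer level so the result is named a, b, c, d
  flatten() %>%
  # Stack the data frames in each group, then summarise p by age
  map(~ bind_rows(.x) %>%
        group_by(age) %>%
        summarise(mean_p = mean(p), sd_p = sd(p)))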

-output

$a
# A tibble: 26 × 3
age mean_p sd_p
<int> <dbl> <dbl>
1 5 0.182 0.854
2 6 -0.541 0.575
3 7 -0.0815 0.962
4 8 0.372 1.24
5 9 0.495 1.17
6 10 0.528 1.12
7 11 -0.0519 0.696
8 12 0.439 0.627
9 13 0.188 0.465
10 14 0.232 1.28
# … with 16 more rows

$b
# A tibble: 26 × 3
age mean_p sd_p
<int> <dbl> <dbl>
1 5 -0.0386 0.930
2 6 0.0312 0.961
3 7 -0.0914 1.12
4 8 0.218 0.948
5 9 -0.155 0.970
6 10 0.669 1.31
7 11 -0.844 0.971
8 12 0.424 1.21
9 13 0.306 1.36
10 14 -0.0380 0.876
# … with 16 more rows

$c
# A tibble: 26 × 3
age mean_p sd_p
<int> <dbl> <dbl>
1 5 -0.447 1.21
2 6 0.0458 0.919
3 7 -0.0733 1.08
4 8 -0.424 1.32
5 9 -0.149 0.611
6 10 -0.0812 0.650
7 11 -0.182 1.08
8 12 0.513 1.00
9 13 0.466 0.869
10 14 0.587 1.06
# … with 16 more rows

$d
# A tibble: 26 × 3
age mean_p sd_p
<int> <dbl> <dbl>
1 5 0.749 1.17
2 6 0.597 0.971
3 7 0.294 1.36
4 8 0.536 0.842
5 9 0.202 1.13
6 10 -0.267 0.765
7 11 -0.338 1.19
8 12 -0.0775 0.668
9 13 -0.416 1.04
10 14 -0.172 0.943
# … with 16 more rows
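
The same regrouping can also be sketched in base R, which may help make the steps explicit. This is not part of the original answer. The toy `x` below only mimics the nesting from the question with made-up values, and the names `iterations`, `nms` and `result` are hypothetical helpers introduced for illustration.

```r
# Toy data with the same nesting as `x` in the question (values are made up
# for illustration): one outer list holding two iterations, each a named
# list of data frames with columns `p` and `age`.
x <- list(list(
  list(a = data.frame(p = c(1, 3), age = c(5L, 5L)),
       b = data.frame(p = c(2, 4), age = c(5L, 6L))),
  list(a = data.frame(p = c(5, 7), age = c(5L, 6L)),
       b = data.frame(p = c(6, 8), age = c(6L, 6L)))
))

iterations <- x[[1]]             # assumes a single outer element, as in the question
nms <- names(iterations[[1]])    # assumes every iteration uses the same names

result <- lapply(setNames(nms, nms), function(nm) {
  # Stack the data frames called `nm` from every iteration
  stacked <- do.call(rbind, lapply(iterations, `[[`, nm))
  # Mean and sd of p within each age (sd is NA for single-row groups)
  m <- tapply(stacked$p, stacked$age, mean)
  s <- tapply(stacked$p, stacked$age, sd)
  data.frame(age = as.integer(names(m)),
             mean_p = as.vector(m),
             sd_p = as.vector(s))
})

result$a
```

Run against the `x` from the question, the same lines should yield one data frame per name (a, b, c, d) with one row per age. If a data.table is preferred, each element could be converted afterwards with `data.table::as.data.table()`.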

Regarding "r - Aggregating similar columns across different lists in a nested list object", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/70856278/
