
r - Means within each group, with more than one column of group indices

Reposted. Author: 行者123. Updated: 2023-12-02 08:17:11

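I have a variable, and I want the mean within each group, where the group of each observation is listed in a column, and I have many such columns. I then want to associate each group mean with the corresponding observation, so that if I start with a matrix of m observations by n different groupings, I end up with an m x n matrix of means. For example: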

> var <- round(runif(10),digits=2)
> var
 [1] 0.47 0.21 0.80 0.65 0.32 0.72 0.29 0.93 0.77 0.64
> groupings <- cbind(sample(c(1,2,3), 10, replace=TRUE),
                     sample(c(1,2,3), 10, replace=TRUE),
                     sample(c(1,2,3,5), 10, replace=TRUE))
> groupings
      [,1] [,2] [,3]
 [1,]    3    1    5
 [2,]    1    1    5
 [3,]    2    1    5
 [4,]    3    2    3
 [5,]    2    3    1
 [6,]    1    1    1
 [7,]    2    3    1
 [8,]    1    2    1
 [9,]    3    1    5
[10,]    1    3    2

I can get the means within each group separately with, for example:

> means.1 <- sapply(split(var, groupings[,1]), function(x) mean(x))
> means.2 <- sapply(split(var, groupings[,2]), function(x) mean(x))
> means.3 <- sapply(split(var, groupings[,3]), function(x) mean(x))

> means.1
    1     2     3
0.625 0.470 0.630
> means.2
        1         2         3
0.5940000 0.7900000 0.4166667
> means.3
     1      2      3      5
0.5650 0.6400 0.6500 0.5625

But these separate calls are not only inefficient, they still do not give me what I want, which is the following:

       [,1]      [,2]   [,3]
 [1,] 0.630 0.5940000 0.5625
 [2,] 0.625 0.5940000 0.5625
 [3,] 0.470 0.5940000 0.5625
 [4,] 0.630 0.7900000 0.6500
 [5,] 0.470 0.4166667 0.5650
 [6,] 0.625 0.5940000 0.5650
 [7,] 0.470 0.4166667 0.5650
 [8,] 0.625 0.7900000 0.5650
 [9,] 0.630 0.5940000 0.5625
[10,] 0.625 0.4166667 0.6400

Best answer

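library(dplyr)
set.seed(1000)
var <- round(runif(10), digits = 2)

# Bind var to the grouping columns so that everything ends up in one data frame
groupings <- cbind(sample(c(1,2,3), 10, replace=TRUE),
                   sample(c(1,2,3), 10, replace=TRUE),
                   sample(c(1,2,3,5), 10, replace=TRUE), var)

df <- data.frame(groupings)  # columns are named V1, V2, V3, var

# For each grouping column in turn, add a column holding the group mean of var
df %>%
  group_by(V1) %>% mutate(x1 = mean(var)) %>% ungroup(V1) %>%
  group_by(V2) %>% mutate(x2 = mean(var)) %>% ungroup(V2) %>%
  group_by(V3) %>% mutate(x3 = mean(var)) %>% ungroup(V3)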

#       V1    V2    V3   var        x1    x2    x3
#    <dbl> <dbl> <dbl> <dbl>     <dbl> <dbl> <dbl>
# 1      2     1     3  0.33 0.4775000 0.322 0.250
# 2      3     3     1  0.76 0.6566667 0.470 0.750
# 3      1     1     3  0.11 0.1333333 0.322 0.250
# 4      3     1     5  0.69 0.6566667 0.322 0.635
# 5      3     2     3  0.52 0.6566667 0.630 0.250
# 6      1     3     3  0.07 0.1333333 0.470 0.250
# 7      2     2     1  0.74 0.4775000 0.630 0.750
# 8      2     3     5  0.58 0.4775000 0.470 0.635
# 9      1     1     3  0.22 0.1333333 0.322 0.250
# 10     2     1     2  0.26 0.4775000 0.322 0.260

# you can simply subset the columns
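
If only the m x n matrix of means is needed, a base R alternative may be worth considering. This is not the method from the answer above, just a minimal sketch that assumes the groupings are stored as a matrix, as in the question; the name `group_means` is made up for illustration. It relies on `ave()`, whose default `FUN` is `mean`, to return one group mean per observation, and on `apply()` to repeat that over each grouping column.

```r
# Data copied from the question so the result can be compared
# with the desired matrix shown there.
var <- c(0.47, 0.21, 0.80, 0.65, 0.32, 0.72, 0.29, 0.93, 0.77, 0.64)
groupings <- cbind(c(3, 1, 2, 3, 2, 1, 2, 1, 3, 1),
                   c(1, 1, 1, 2, 3, 1, 3, 2, 1, 3),
                   c(5, 5, 5, 3, 1, 1, 1, 1, 5, 2))

# ave(x, g) returns a vector as long as x in which every element is
# replaced by the mean of its group (FUN = mean is the default).
# apply(..., 2, ...) does this once per grouping column and binds the
# resulting vectors into an m x n matrix.
group_means <- apply(groupings, 2, function(g) ave(var, g))

group_means
```

With the question's data this should reproduce the desired matrix, for example 0.630, 0.594 and 0.5625 in the first row. Because the loop runs over columns rather than over named variables, the same call works unchanged for any number of grouping columns.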

Regarding "r - Means within each group, with more than one column of group indices", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41046108/
