
r - Using group_by with mutate_if by column name

Reposted. Author: 行者123. Updated: 2023-12-04 11:03:34

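I am trying to use mutate_if to perform calculations based on variable names. For example, if the variable name contains "demo", compute the mean, and if it contains "meas", compute the median: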

library(tidyverse)
library(stringr)

# tibble() replaces the deprecated data_frame()
exm_data <- tibble(
  group = sample(letters[1:5], size = 50, replace = TRUE),
  demo_age = rnorm(50),
  demo_height = runif(50, min = 48, max = 80),
  meas_score1 = rnorm(50),
  meas_score2 = rnorm(50)
)
exm_data
#> # A tibble: 50 x 5
#> group demo_age demo_height meas_score1 meas_score2
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 a -1.46539563 58.22435 -0.760692567 0.1077901
#> 2 b 1.90983770 56.57976 0.262933462 -1.0186600
#> 3 c 0.58502114 66.26322 2.283491647 0.3215542
#> 4 b -0.97228337 74.82932 2.447551824 -0.4763201
#> 5 a 0.65814161 72.19627 -0.592671739 -0.0521247
#> 6 c -0.62133706 75.49976 0.005813255 -0.4195284
#> 7 b 0.40650836 60.99083 0.809183477 -0.1127530
#> 8 c -0.48251421 50.94077 -1.171749420 1.7268231
#> 9 b 1.24476630 71.39803 1.786950340 0.7980217
#> 10 c -0.09704469 69.52001 -0.511872217 -1.1465523
#> # ... with 40 more rows


exm_data %>%
  mutate_if(str_detect(colnames(.), "demo"), mean) %>%
  mutate_if(str_detect(colnames(.), "meas"), median)
#> # A tibble: 50 x 5
#> group demo_age demo_height meas_score1 meas_score2
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 a -0.03250753 64.31412 -0.09909911 0.1307904
#> 2 b -0.03250753 64.31412 -0.09909911 0.1307904
#> 3 c -0.03250753 64.31412 -0.09909911 0.1307904
#> 4 b -0.03250753 64.31412 -0.09909911 0.1307904
#> 5 a -0.03250753 64.31412 -0.09909911 0.1307904
#> 6 c -0.03250753 64.31412 -0.09909911 0.1307904
#> 7 b -0.03250753 64.31412 -0.09909911 0.1307904
#> 8 c -0.03250753 64.31412 -0.09909911 0.1307904
#> 9 b -0.03250753 64.31412 -0.09909911 0.1307904
#> 10 c -0.03250753 64.31412 -0.09909911 0.1307904
#> # ... with 40 more rows

As you can see, this works as expected. However, I want to do these calculations by group, and when I add a group_by statement it breaks:

exm_data %>%
  group_by(group) %>%
  mutate_if(str_detect(colnames(.), "demo"), mean) %>%
  mutate_if(str_detect(colnames(.), "meas"), median)
#> Error: length(.p) == length(vars) is not TRUE

Is there a way to use mutate_if on a grouped tibble based on column names?

Best answer

You can use mutate_at together with contains from dplyr, as follows:

library(dplyr)

exm_data %>%
  group_by(group) %>%
  # pass the functions directly; funs() is deprecated since dplyr 0.8.0
  mutate_at(vars(contains('demo')), mean) %>%
  mutate_at(vars(contains('meas')), median)

This gives:

# A tibble: 50 x 5
# Groups: group [5]
group demo_age demo_height meas_score1 meas_score2
<chr> <dbl> <dbl> <dbl> <dbl>
1 d 0.12916082 60.26550 0.1932882 -0.5356818
2 b -0.31142894 64.50839 0.3219514 -0.4777860
3 b -0.31142894 64.50839 0.3219514 -0.4777860
4 a -0.34373403 64.84180 0.1929516 -0.3821047
5 a -0.34373403 64.84180 0.1929516 -0.3821047
6 b -0.31142894 64.50839 0.3219514 -0.4777860
7 d 0.12916082 60.26550 0.1932882 -0.5356818
8 a -0.34373403 64.84180 0.1929516 -0.3821047
9 d 0.12916082 60.26550 0.1932882 -0.5356818
10 c -0.05963747 59.07845 -0.2395409 -0.4484245
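
The error in the question appears to come from a length mismatch. On a grouped tibble, colnames(.) still counts the grouping column, while mutate_if only considers the non-grouping columns, so the logical vector is one element too long. A minimal sketch of that mismatch, assuming a small hypothetical tibble named `toy` (not the question's data):

```r
library(dplyr)

# Hypothetical toy data, only for illustration
toy <- tibble(
  group = rep(c("a", "b"), each = 3),
  demo_age = c(1, 2, 3, 4, 5, 6),
  meas_score = c(10, 20, 30, 40, 50, 60)
)

grouped <- group_by(toy, group)

# colnames() includes the grouping column ...
all_cols <- colnames(grouped)

# ... but mutate_if() skips grouping columns when it picks candidates
candidate_cols <- setdiff(all_cols, group_vars(grouped))

length(all_cols)
#> [1] 3
length(candidate_cols)
#> [1] 2
```

Selecting columns by name with vars(contains(...)) avoids the problem, because the selection is resolved against the columns themselves rather than a precomputed logical vector.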


Bonus: you don't need to load stringr.
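
On dplyr 1.0.0 or later, the scoped verbs (mutate_if, mutate_at) are superseded by across(). The following is a sketch of an equivalent grouped calculation, not part of the original answer. It uses a hypothetical tibble named `toy` so the result is easy to check by hand:

```r
library(dplyr)

# Hypothetical toy data, only for illustration
toy <- tibble(
  group = c("a", "a", "b", "b"),
  demo_age = c(1, 3, 10, 20),
  meas_score1 = c(5, 7, 100, 300)
)

result <- toy %>%
  group_by(group) %>%
  mutate(
    across(contains("demo"), mean),    # per-group mean of the "demo" columns
    across(contains("meas"), median)   # per-group median of the "meas" columns
  ) %>%
  ungroup()

result$demo_age
#> [1]  2  2 15 15
result$meas_score1
#> [1]   6   6 200 200
```

The final ungroup() is optional. It only prevents later steps from being grouped unintentionally.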

Regarding "r - Using group_by with mutate_if by column name", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46607352/
