
r - Flatten columns as arguments

Reposted. Author: 行者123. Updated: 2023-12-01 10:27:33

When I create summaries of my data with dplyr, I often find myself computing confidence intervals (using CI from Rmisc):

summary <- data %>%
  group_by(group1, group2) %>%
  summarize(
    var1.mean = CI(var1, ci=0.95)['mean'],
    var1.lower = CI(var1, ci=0.95)['lower'],
    var1.upper = CI(var1, ci=0.95)['upper'],

    var2.mean = CI(var2, ci=0.95)['mean'],
    var2.lower = CI(var2, ci=0.95)['lower'],
    var2.upper = CI(var2, ci=0.95)['upper'],

    var3.mean = CI(var3, ci=0.95)['mean'],
    var3.lower = CI(var3, ci=0.95)['lower'],
    var3.upper = CI(var3, ci=0.95)['upper'],

    var4 = sum(var4)
  )

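This is both verbose and inefficient, since CI is recomputed three times per variable. Ideally, I would like to write something like this: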

summary <- data %>%
  group_by(group1, group2) %>%
  summarize(
    var1 = CI(var1, ci=0.95),
    var2 = CI(var2, ci=0.95),
    var3 = CI(var3, ci=0.95),
    var4 = sum(var4)
  )

For the code above, since CI returns a named vector with the elements

  • "lower",
  • "upper",
  • "mean",

I would like to get a data frame with columns like these:

  • "group1",
  • "group2",
  • "var1.lower",
  • "var1.mean",
  • "var1.upper",
  • "var2.lower",
  • ...,
  • "var3.upper",
  • "var4"

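Any idea how to achieve this? Is there a way to "flatten" columns in R? Something similar to do.call, but applied like rest arguments in JS or Python?

Quasiquotation might have something to do with it, but that is starting to go beyond my R skills.

I used to use this gist with plyr, but it no longer works with dplyr, and rather than recoding it, I am hoping there is a better way than hacking into the library.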

Best answer

If we first format the output of CI as a one-row data.frame (wrapped in a list), we can use tidyr::unnest to spread it into columns.
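To make the reshaping step concrete, here is a minimal base R sketch of what `data.frame(t(...))` does to a named vector. The function `ci_sketch` is a hypothetical stand-in for `Rmisc::CI` (assumed here to be a t-based interval around the mean), used only so the snippet runs without extra packages.

```r
# Hypothetical stand-in for Rmisc::CI (assumption: t-based interval around the mean)
ci_sketch <- function(x, ci = 0.95) {
  m <- mean(x)
  err <- qt(1 - (1 - ci) / 2, df = length(x) - 1) * sd(x) / sqrt(length(x))
  c(upper = m + err, mean = m, lower = m - err)
}

v <- ci_sketch(c(2, 4, 6, 8))  # named numeric vector of length 3

# t() turns the vector into a 1 x 3 matrix whose column names are the
# vector's names; data.frame() then keeps those names as columns.
row_df <- data.frame(t(v))

names(row_df)  # "upper" "mean" "lower"
dim(row_df)    # 1 3
```

Wrapping such a one-row data frame in `list()` is what lets `summarize` store it as a single list-column cell per group.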

Data

library(Rmisc)
library(dplyr)
library(tidyr)
set.seed(1)
data <- data.frame(group1 = sample(c("A","B"), 10, TRUE),
                   group2 = sample(c("A","B"), 10, TRUE),
                   var1 = sample(10),
                   var2 = sample(10),
                   var3 = sample(10),
                   var4 = sample(10))

General solution

data %>% group_by(group1, group2) %>%
  dplyr::summarize(var1 = list(data.frame(t(CI(var1, ci=0.95)))),
                   var2 = list(data.frame(t(CI(var2, ci=0.95)))),
                   var3 = list(data.frame(t(CI(var3, ci=0.95)))),
                   var4 = sum(var4)) %>%
  unnest(var1, var2, var3, .sep=".")

Result

# A tibble: 4 x 12
# Groups: group1 [2]
# group1 group2 var4 var1.upper var1.mean var1.lower var2.upper var2.mean var2.lower var3.upper var3.mean var3.lower
# <fctr> <fctr> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 A A 13 56.824819 6.0 -44.824819 11.85310 5.500000 -0.8531024 26.55931 7.500000 -11.559307
# 2 A B 11 38.265512 6.5 -25.265512 50.97172 6.500000 -37.9717166 25.55931 6.500000 -12.559307
# 3 B A 11 12.956686 4.0 -4.956686 13.65205 5.666667 -2.3187188 15.07146 5.666667 -3.738127
# 4 B B 20 8.484138 6.0 3.515862 14.70619 4.666667 -5.3728564 11.31872 3.333333 -4.652052
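The `unnest(var1, var2, var3, .sep=".")` call uses the interface tidyr had when this answer was written. In tidyr 1.0.0 and later, the documented form selects the columns with `c()` and names the separator with `names_sep`. The following is a minimal sketch under that assumption; the data frame `df` is a small made-up example, not the data from this answer, and `Rmisc::CI` is called with its namespace prefix so that plyr is not attached.

```r
library(dplyr)
library(tidyr)

# Small illustrative data set (hypothetical, not the answer's data)
df <- data.frame(
  group1 = rep(c("A", "B"), each = 4),
  var1   = c(1, 2, 3, 4, 5, 6, 7, 8),
  var2   = c(2, 4, 6, 8, 1, 3, 5, 7),
  var4   = 1:8
)

res <- df %>%
  group_by(group1) %>%
  dplyr::summarize(
    # one-row data frame per group, stored in a list-column
    var1 = list(data.frame(t(Rmisc::CI(var1, ci = 0.95)))),
    var2 = list(data.frame(t(Rmisc::CI(var2, ci = 0.95)))),
    var4 = sum(var4)
  ) %>%
  # tidyr >= 1.0.0 (assumed): columns via c(), separator via names_sep
  unnest(c(var1, var2), names_sep = ".")

names(res)
```

The resulting column names follow the same `var1.upper`, `var1.mean`, `var1.lower` pattern as in the result above.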

Or with a custom CI function (same output)

CI2 <- function(x,ci=0.95) list(data.frame(t(CI(x, ci))))

data %>% group_by(group1, group2) %>%
  dplyr::summarize(var1 = CI2(var1, ci=0.95),
                   var2 = CI2(var2, ci=0.95),
                   var3 = CI2(var3, ci=0.95),
                   var4 = sum(var4)) %>%
  unnest(var1, var2, var3, .sep=".")

Or with a converter function (same output)

This can be used with any other function that returns a vector

vec2rowdf <- function(v) list(data.frame(t(v))) # creates a 1 row data.frame from a vector, wrapped in a list
data %>% group_by(group1, group2) %>%
  dplyr::summarize(var1 = CI(var1, ci=0.95) %>% vec2rowdf,
                   var2 = CI(var2, ci=0.95) %>% vec2rowdf,
                   var3 = CI(var3, ci=0.95) %>% vec2rowdf,
                   var4 = sum(var4)) %>%
  unnest(var1, var2, var3, .sep=".")
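With newer package versions there may be a shorter route that skips the `list()` wrapper. In dplyr 1.0.0 and later, a `summarise` expression that returns a data frame is stored as a df-column, and `tidyr::unpack` can then spread it into regular columns. The sketch below assumes dplyr >= 1.0.0 and tidyr >= 1.0.0; `ci_df` is a hypothetical helper and `df` is made-up data, neither comes from the original answer.

```r
library(dplyr)
library(tidyr)

# Small illustrative data set (hypothetical, not the answer's data)
df <- data.frame(
  group1 = rep(c("A", "B"), each = 4),
  var1   = c(1, 2, 3, 4, 5, 6, 7, 8),
  var2   = c(2, 4, 6, 8, 1, 3, 5, 7),
  var4   = 1:8
)

# Hypothetical helper: named vector from CI() -> one-row data frame
ci_df <- function(x, ci = 0.95) data.frame(t(Rmisc::CI(x, ci = ci)))

res <- df %>%
  group_by(group1) %>%
  dplyr::summarise(
    var1 = ci_df(var1),  # data frame result becomes a df-column
    var2 = ci_df(var2),
    var4 = sum(var4)
  ) %>%
  # spread each df-column into var1.upper, var1.mean, var1.lower, ...
  unpack(c(var1, var2), names_sep = ".")

names(res)
```

This keeps the `var1 = f(var1)` shape the question asks for, at the cost of depending on the newer dplyr behavior.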

Regarding "r - Flatten columns as arguments", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/46251040/
