r - Unable to use multi-word variables in dplyr, or am I missing something?

Reposted. Author: 行者123. Updated: 2023-12-02 10:35:14
Why does dplyr not like the "beta linalool" format in my function, as opposed to beta.linalool?

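I spent hours troubleshooting before I found the problem. Is there a way to work with data whose variables are labeled with multiple words, or should I move everything to the beta.linalool style of naming?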

Everything I have learned comes from Programming with dplyr.

library(ggplot2)
library(readxl)
library(dplyr)
library(magrittr)

Data3<- read_excel("Desktop/Data3.xlsx")

Data3 %>% filter(Variety=="CS 420A"&`Red Blotch`=="-")%>% group_by(`Time Point`)%>%
  summarise(m=mean(`beta linalool`),SD=sd(`beta linalool`))
# A tibble: 4 x 3
  `Time Point`        m         SD
  <chr>           <dbl>      <dbl>
1 End          0.00300  0.000117
2 Mid          0.00385  0.000353
3 Must         0.000254 0.00000633
4 Start        0.000785 0.000283

Now, when I turn this into a function:

cwine<-function(df,v,rb,c){
  c<-enquo(c)
  df %>% filter(Variety==v&`Red Blotch`==rb)%>%
    group_by(`Time Point`) %>%
    summarise_(m=mean(!!c),SD=sd(!!c))
}
cwine(Data3,"CS 420A","-",'beta linalool')
# A tibble: 4 x 3
  `Time Point`     m    SD
  <chr>        <dbl> <dbl>
1 End             NA    NA
2 Mid             NA    NA
3 Must            NA    NA
4 Start           NA    NA
Warning messages:
1: In mean.default(~"beta linalool") :
argument is not numeric or logical: returning NA #this statement is repeated 4 more times
5: In var(if (is.vector(x) || is.factor(x)) x else as.double(x), na.rm = na.rm) :
NAs introduced by coercion #this statement is repeated 4 more times

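The problem is that beta linalool is being passed in as the string "beta linalool". I worked this out by trying the same approach on the iris dataset, where the column is Petal.Length (a single syntactic name) rather than something like "Petal Width":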

my_function<-function(ds,x,y,c){
  c<-enquo(c)
  ds %>%filter(Sepal.Length>x&Sepal.Width<y) %>%
    group_by(Species) %>%
    summarise(m=mean(!!c),SD=sd(!!c))
}
my_function(iris,5,4,Petal.Length)
# A tibble: 3 x 3
  Species        m    SD
  <fct>      <dbl> <dbl>
1 setosa      1.53 0.157
2 versicolor  4.32 0.423
3 virginica   5.57 0.536

In fact, my function works fine on a different variable:

> cwine(Data2,"CS 420A","-",nerol)
# A tibble: 4 x 3
  `Time Point`        m        SD
  <chr>           <dbl>     <dbl>
1 End          0.000453 0.0000338
2 Mid          0.000659 0.0000660
3 Must         0.000560 0.0000234
4 Start        0.000927 0.0000224

Is dplyr really that sensitive, or am I missing something?

Best answer

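One option is to convert the string to a symbol with rlang::sym() and then unquote it with !! so that it is evaluated as a column: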

library(tidyverse)
cwine <- function(df,v,rb,c){

  df %>%
    filter(Variety==v & `Red Blotch` == rb)%>%
    group_by(`Time Point`) %>%
    summarise(m = mean(!!rlang::sym(c)),
              SD = sd(!! rlang::sym(c)))
}

cwine(Data3,"CS 420A","-",'beta linalool')
# A tibble: 2 x 3
#   `Time Point`       m    SD
#          <int>   <dbl> <dbl>
# 1            2 -2.11    2.23
# 2            4  0.0171 NA
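
To see why the string needs sym(), it can help to look at what enquo() actually captures. The following is only a minimal sketch: the helper name captured_class is made up for illustration and is not part of dplyr or rlang. When a string is passed, the captured expression is a character constant, so mean() receives text rather than a column. When a backticked name is passed, the captured expression is a symbol that can later be looked up in the data.

```r
library(rlang)

# Hypothetical helper: report the class of the expression that enquo() captures.
captured_class <- function(c) {
  c <- enquo(c)
  class(quo_get_expr(c))
}

# A quoted string is captured as a character constant, not a column reference.
class_from_string <- captured_class("beta linalool")

# A backticked name is captured as a symbol; it is not evaluated at this point.
class_from_symbol <- captured_class(`beta linalool`)

# sym() converts the string into the same kind of object as the backticked name.
class_from_sym <- class(sym("beta linalool"))

print(c(class_from_string, class_from_symbol, class_from_sym))
```

This matches the warning in the question: mean() was handed the text "beta linalool" itself and returned NA.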

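Alternatively, if we want to pass the column by capturing it as a quosure (enquo), it works when the variable name is passed wrapped in backticks. Normally the bare, unquoted name is enough, but here there is a space between the words, so backticks are needed for the name to be parsed and evaluated as is: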

cwine <- function(df,v,rb,c){
  c1 <- enquo(c)
  df %>%
    filter(Variety==v & `Red Blotch` == rb)%>%
    group_by(`Time Point`) %>%
    summarise(m = mean(!! c1 ),
              SD = sd(!! c1))
}

cwine(Data3,"CS 420A","-",`beta linalool`)
# A tibble: 2 x 3
#   `Time Point`       m    SD
#          <int>   <dbl> <dbl>
# 1            2 -2.11    2.23
# 2            4  0.0171 NA

Data

set.seed(24)
Data3 <- tibble(Variety = sample(c("CS 420A", "CS 410A"), 20, replace = TRUE),
                `Red Blotch` = sample(c("-", "+"), 20, replace = TRUE),
                `Time Point` = sample(1:4, 20, replace = TRUE),
                `beta linalool` = rnorm(20))
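
As a further sketch that is not part of the original answer: assuming a dplyr version that exports the .data pronoun, a column name held in a string can also be looked up with .data[[c]], which avoids any trouble with the space in the name. The function name cwine_str and the small wine table below are illustrative assumptions, not the asker's data.

```r
library(dplyr)

# Small made-up table using the same column names as in the question.
wine <- tibble(Variety = c("CS 420A", "CS 420A", "CS 420A", "CS 410A"),
               `Red Blotch` = c("-", "-", "+", "-"),
               `Time Point` = c("Start", "Start", "Start", "Start"),
               `beta linalool` = c(1, 3, 10, 20))

# Hypothetical variant of cwine(): `c` is a string naming the column to summarise.
cwine_str <- function(df, v, rb, c) {
  df %>%
    filter(Variety == v & `Red Blotch` == rb) %>%
    group_by(`Time Point`) %>%
    summarise(m = mean(.data[[c]]),   # look the column up by its string name
              SD = sd(.data[[c]]))
}

res <- cwine_str(wine, "CS 420A", "-", "beta linalool")
print(res)
```

With this variant the call takes a quoted string, as in the first solution, so the caller does not need backticks.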

Regarding "r - Unable to use multi-word variables in dplyr, or am I missing something?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55989035/
