
r - mgcv GAM : more than one variable in `by` argument (smooth varying by more than 1 factor)

Reposted. Author: 行者123. Last updated: 2023-12-05 02:40:14

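I need to model a smooth term that varies by more than one factor. The `by` argument lets me fit a separate smooth for each level of a factor, but I can't find how to do this across several factors.

I tried something like the following, without success: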

data <- iris
data$factor2 <- rep(c("A", "B"), 75)

mgcv::gam(Sepal.Length ~ s(Petal.Length, by = c(Species, factor2)), data = data)
#> Error in model.frame.default(formula = Sepal.Length ~ 1 + Petal.Length + : variable lengths differ (found for 'c(Species, factor2)')

Created on 2021-08-05 by the reprex package (v2.0.0)

Any help is welcome!
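As background for the error above, here is a minimal base-R sketch of why `c(Species, factor2)` fails and what `interaction()` builds instead. The variable names `combined_wrong` and `combined` are illustrative only and do not appear in the original post.

```r
# Same data as in the question
dat <- iris
dat$factor2 <- rep(c("A", "B"), 75)

# c() concatenates the two columns end to end, so the result is twice as
# long as the data. This is what triggers "variable lengths differ".
combined_wrong <- c(dat$Species, dat$factor2)
length(combined_wrong)

# interaction() crosses the two variables row by row and returns a single
# factor with one level per Species x factor2 combination.
combined <- interaction(dat$Species, dat$factor2)
length(combined)
nlevels(combined)
table(combined)
```

Because `by` expects one value per row, a single crossed factor of this kind is the shape it can work with.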

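Best answer

One of the issues with interaction() is that it changes the model matrix, which means that some of the variables contained in the model data are altered: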

m <- mgcv::gam(body_mass_g ~ s(flipper_length_mm, by = interaction(species, sex)), data = palmerpenguins::penguins)
head(insight::get_data(m))
#>   body_mass_g flipper_length_mm       species    sex
#> 1        3750               181   Adelie.male   male
#> 2        3800               186 Adelie.female female
#> 3        3250               195 Adelie.female female
#> 5        3450               193 Adelie.female female
#> 6        3650               190   Adelie.male   male
#> 7        3625               181 Adelie.female female

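Created on 2021-08-06 by the reprex package (v2.0.1)

This can cause problems with post-processing functions, for example those used for visualization.

However, following the answers by Gavin and IRTFM, this is easily addressed by also adding the variables to the model as fixed effects.

Here is a demonstration, which also illustrates the difference between two separate smooths and an interaction: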

library(ggplot2)
#> Warning: package 'ggplot2' was built under R version 4.0.5

set.seed(1)

# Create data
data <- data.frame(x = rep(seq(-10, 10, length.out = 500), 2),
                   fac1 = as.factor(rep(c("A", "B", "C"), length.out = 1000)),
                   fac2 = as.factor(rep(c("X", "Y"), each = 500)))
data$y <- data$x^2 + rnorm(nrow(data), sd = 5)
data$y[data$fac1 == "A"] <- sign(data$x[data$fac1 == "A"]) * data$y[data$fac1 == "A"] + 50
data$y[data$fac1 == "B"] <- datawizard::change_scale(data$y[data$fac1 == "B"]^3, c(-50, 100))
data$y[data$fac2 == "X" & data$fac1 == "C"] <- data$y[data$fac2 == "X" & data$fac1 == "C"] - 100
data$y[data$fac2 == "X" & data$fac1 == "B"] <- datawizard::change_scale(data$y[data$fac2 == "X" & data$fac1 == "B"] ^ 2, c(-50, 100))
data$y[data$fac2 == "X" & data$fac1 == "A"] <- datawizard::change_scale(data$y[data$fac2 == "X" & data$fac1 == "A"] * -3, c(0, 100))

# Real trends
ggplot(data, aes(x = x, y = y, color = fac1, shape = fac2)) +
  geom_point()

# Two smooths
m <- mgcv::gam(y ~ fac1 * fac2 + s(x, by = fac1) + s(x, by = fac2), data = data)
plot(modelbased::estimate_relation(m, length = 100, preserve_range = FALSE))

# Interaction
m <- mgcv::gam(y ~ fac1 * fac2 + s(x, by = interaction(fac1, fac2)), data = data)
plot(modelbased::estimate_relation(m, length = 100, preserve_range = FALSE))

Created on 2021-08-06 by the reprex package (v2.0.1)

The last model manages to recover the trend for each combination of the factors.
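Applied to the iris data from the question, the same pattern might look like the sketch below. Two things here are assumptions rather than part of the answer: the crossed factor is stored in a helper column (hypothetically named `sp_f2`) instead of being built inline, which may also sidestep the altered-variable issue shown earlier, and `k = 5` is chosen only to keep the basis small for 150 rows.

```r
library(mgcv)

dat <- iris
dat$factor2 <- factor(rep(c("A", "B"), 75))

# Hypothetical helper column: one level per Species x factor2 combination
dat$sp_f2 <- interaction(dat$Species, dat$factor2)

# Smooths with a factor `by` are centered, so the group means have to come
# from the parametric terms: hence Species * factor2 as fixed effects.
m <- gam(
  Sepal.Length ~ Species * factor2 + s(Petal.Length, by = sp_f2, k = 5),
  data = dat
)

# One smooth object is created per level of the crossed factor
length(m$smooth)
summary(m)
```

When predicting from such a model, `newdata` needs the `sp_f2` column as well as `Species` and `factor2`.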

Regarding "r - mgcv GAM : more than one variable in `by` argument (smooth varying by more than 1 factor)", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/68659805/
