r - Smoothing each group with `do`

Reposted. Author: 行者123. Last updated: 2023-12-04 09:11:03

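I have some data, a sample of which is shown below. My goal is to fit a gam to each year and add one more column holding the predicted values from that gam model.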

fertility <- structure(list(AGE = c(15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 
23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L, 34L, 35L,
36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 15L, 16L, 17L, 18L,
19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L,
32L, 33L, 34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L
), Year = c(1930, 1930, 1930, 1930, 1930, 1930, 1930, 1930, 1930,
1930, 1930, 1930, 1930, 1930, 1930, 1930, 1930, 1930, 1930, 1930,
1930, 1930, 1930, 1930, 1930, 1930, 1930, 1930, 1930, 1930, 1931,
1931, 1931, 1931, 1931, 1931, 1931, 1931, 1931, 1931, 1931, 1931,
1931, 1931, 1931, 1931, 1931, 1931, 1931, 1931, 1931, 1931, 1931,
1931, 1931, 1931, 1931, 1931, 1931, 1931), fertility = c(5.170284269,
14.18135114, 27.69795144, 44.61216712, 59.08896308, 89.66036496,
105.4563852, 120.1754041, 137.4074262, 148.7159407, 161.5645606,
157.200515, 143.6340251, 127.8855125, 117.7343628, 159.2909484,
126.6158821, 109.0681613, 86.98223678, 70.64470361, 111.0070633,
86.15051988, 68.9204159, 55.92722274, 42.93402958, 56.84376018,
39.35337243, 26.72142573, 18.46207596, 9.231037978, 4.769704534,
13.08261815, 25.55198857, 41.15573626, 54.51090896, 81.99522459,
96.44082973, 109.9015072, 125.6603492, 136.0020892, 148.679958,
144.6639404, 132.1793638, 117.6867783, 108.345172, 144.2820726,
114.68575, 98.79142865, 78.7865069, 63.9883456, 100.217918, 77.77726461,
62.22181169, 50.49147014, 38.76112859, 52.48807067, 36.33789508,
24.67387938, 17.04740757, 8.523703784)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -60L), .Names = c("AGE",
"Year", "fertility"))

So the "dumb", loop-based way of doing it is
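library(dplyr)

ages <- 15:44
pred <- numeric(nrow(fertility))  # preallocate; pred must exist before it is filled
count <- 0
for (i in 1930:1931) {
  count <- count + 1
  temp <- filter(fertility, Year == i)
  mod <- mgcv::gam(fertility ~ s(AGE), data = temp)
  # fill this year's block of 30 predictions
  pred[length(ages) * (count - 1) + seq_along(ages)] <- predict(mod, newdata = data.frame(AGE = ages))
}

fertility1 <- mutate(fertility, pred = pred)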

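But I would like a dplyr way of doing this. My idea is to use do to fit a model for each year and then use predict to get the values. I can manage the first step, but I am struggling to implement the second part in dplyr: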
library(mgcv)
library(dplyr)

fertility %>%
#filter(!is.na(fertility)) %>% # not sure if this is necessary
group_by(Year) %>%
dplyr::do(model = mgcv::gam(fertility ~ s(AGE), data = .)) %>%
left_join(fertility, .) %>%
mutate(smoothed = predict(model, newdata = AGE))

I get the error message
Error in UseMethod("predict") : 
no applicable method for 'predict' applied to an object of class "list"

This presumably means that dplyr does not remember that model is a model rather than just a list element.
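To make the list-column issue concrete, here is a minimal sketch. It assumes a small synthetic data set (`toy`, a hypothetical stand-in for the data in the question) and shows that `do()` stores each fit inside a list column, so `predict()` needs a single element rather than the whole column. The `group_modify()` call at the end is one possible per-group workaround, not necessarily the approach the answer below recommends.

```r
library(mgcv)
library(dplyr)

# Hypothetical stand-in data (not the question's data): two years, ages 15-44
set.seed(1)
toy <- expand.grid(AGE = 15:44, Year = c(1930, 1931))
toy$fertility <- 150 * exp(-((toy$AGE - 27)^2) / 100) + rnorm(nrow(toy), sd = 3)

# do() stores each fitted model inside a list column named `model`
models <- toy %>%
  group_by(Year) %>%
  do(model = gam(fertility ~ s(AGE), data = .))

class(models$model)       # the column itself is a plain list
class(models$model[[1]])  # each element is still a gam object

# One possible workaround: fit and predict inside each group
smoothed <- toy %>%
  group_by(Year) %>%
  group_modify(~ mutate(.x, smoothed = as.numeric(predict(gam(fertility ~ s(AGE), data = .x))))) %>%
  ungroup()
```

Calling `predict()` on `models$model` as a whole dispatches on class "list", which is where the error above comes from; calling it on one element at a time dispatches to the gam method.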

Best answer

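The smart way is to use factor-smooth interactions, which have been available in mgcv for years, either via the by argument of s() or via the newer bs = "fs" basis type. Here is an example using your data: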

library("mgcv")
## Make Year a factor
fertility <- transform(fertility, Year = factor(Year))
## Fit model using by terms - include factor as fixed effect too!
mod <- gam(fertility ~ Year + s(AGE, by = Year), data = fertility)
## Plot to see what form this model takes
plot(mod, pages = 1)
## Some prediction data
ages <- with(fertility, seq(min(AGE), max(AGE)))
## Need to replicate this once per Year
pdat <- with(fertility,
data.frame(AGE = rep(ages, nlevels(Year)),
Year = rep(levels(Year), each = length(ages))))
## Add the fitted values to the prediction data
pdat <- transform(pdat, fitted = predict(mod, newdata = pdat))
head(pdat)

> head(pdat)
AGE Year fitted
1 15 1930 -0.8496705
2 16 1930 15.9568574
3 17 1930 33.0754019
4 18 1930 50.7419122
5 19 1930 68.9116594
6 20 1930 87.1306489

However, if you only want predictions at the observed values of AGE, you can simply ask for the fitted values:
fertility <- transform(fertility, fitted = predict(mod))
head(fertility)

> head(fertility)
AGE Year fertility fitted
1 15 1930 5.170284 -0.8496705
2 16 1930 14.181351 15.9568574
3 17 1930 27.697951 33.0754019
4 18 1930 44.612167 50.7419122
5 19 1930 59.088963 68.9116594
6 20 1930 89.660365 87.1306489

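You can also look at the dedicated factor-smooth basis type, bs = "fs"; see ?smooth.terms and ?factor.smooth.interaction for details. Basically, these are efficient if you have many levels but want the smoother for each level to share the same smoothing parameter value.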
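As a rough illustration of the bs = "fs" variant, the sketch below fits a factor-smooth interaction to synthetic data. The data set `toy`, its four years, and the choice of method = "REML" are assumptions made for the example, not part of the original answer. No separate Year term is included here, on the understanding that the "fs" smooths carry their own per-level intercepts.

```r
library(mgcv)

# Hypothetical stand-in data: four years, ages 15-44, Year as a factor
set.seed(1)
toy <- expand.grid(AGE = 15:44, Year = factor(1930:1933))
scale_by_year <- 1 - 0.05 * (as.integer(toy$Year) - 1)
toy$fertility <- 150 * scale_by_year * exp(-((toy$AGE - 27)^2) / 100) +
  rnorm(nrow(toy), sd = 3)

# One smooth of AGE per level of Year, all sharing a smoothing parameter
mod_fs <- gam(fertility ~ s(AGE, Year, bs = "fs"), data = toy, method = "REML")

# Fitted values at the observed ages
toy$fitted <- as.numeric(predict(mod_fs))
head(toy)
```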

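The main advantage here is that you use all the data and fit a single model, which you can then interrogate in many ways that are not easily open to you if you fit m separate models, such as investigating how the smoothers differ between years.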
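One way to interrogate a single model like this is to compare the fitted curves for two years directly. The sketch below is an illustration under assumptions: it uses synthetic data (`toy`, hypothetical) and the linear-predictor matrix from predict(type = "lpmatrix") to get the estimated difference between the 1931 and 1930 curves, with an approximate standard error from the model's coefficient covariance matrix.

```r
library(mgcv)

# Hypothetical stand-in data: two years with slightly different peak heights
set.seed(1)
toy <- expand.grid(AGE = 15:44, Year = factor(c(1930, 1931)))
peak <- ifelse(toy$Year == "1930", 150, 140)
toy$fertility <- peak * exp(-((toy$AGE - 27)^2) / 100) + rnorm(nrow(toy), sd = 3)

# Single model with one smooth of AGE per year, as in the answer
mod <- gam(fertility ~ Year + s(AGE, by = Year), data = toy)

# Prediction grids for each year on the same ages
ages <- 15:44
nd30 <- data.frame(AGE = ages, Year = factor("1930", levels = levels(toy$Year)))
nd31 <- data.frame(AGE = ages, Year = factor("1931", levels = levels(toy$Year)))

# Rows of these matrices map the model coefficients to fitted values
X30 <- predict(mod, newdata = nd30, type = "lpmatrix")
X31 <- predict(mod, newdata = nd31, type = "lpmatrix")
Xd <- X31 - X30

# Estimated difference (1931 minus 1930) and its approximate standard error
diff_est <- as.numeric(Xd %*% coef(mod))
diff_se <- sqrt(rowSums((Xd %*% vcov(mod)) * Xd))
head(data.frame(AGE = ages, diff = diff_est, se = diff_se))
```

Because both curves come from the same fit, the difference includes the parametric Year effect as well as the two smooths, and the standard error accounts for the covariance between them. With m separate models this comparison would need extra work.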

Regarding r - smoothing each group with `do`, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/30339896/
