r - Using map with nested lists

Reposted. Author: 行者123. Updated: 2023-12-02 18:19:43

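I am struggling to use map from the purrr library correctly. I would like to calculate the weighted mean of my sample by nesting the common observations in a list and then using map(). (I know this would also work with group_by.)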

MWE: Suppose I observed 3 different subjects (indicated by "id"), and I have their sample weights ("weights") and the corresponding observations ("obs").

library(tidyverse)  # provides tibble(), nest(), map() and %>%
df <- tibble(id = c(1, 1, 2, 2, 3, 3), weights = c(0.3, 0.7, 0.25, 0.75, 0.14, 0.86), obs = 6:1)
df
# A tibble: 6 x 3
     id weights   obs
  <dbl>   <dbl> <int>
1     1    0.3      6
2     1    0.7      5
3     2    0.25     4
4     2    0.75     3
5     3    0.14     2
6     3    0.86     1

I want to calculate the weighted mean for each subject. Therefore, I nest the weights and the observations.

df %>% nest(data = c(weights, obs))
# A tibble: 3 x 2
     id data
  <dbl> <list>
1     1 <tibble [2 x 2]>
2     2 <tibble [2 x 2]>
3     3 <tibble [2 x 2]>

Now I would like to use map to apply a function to each element of data. More precisely, I tried to solve it in the following way:

df %>% nest(data = c(weights, obs)) %>% map(data, ~ (.x$weights*.x$obs)/sum(.x$weights))

Warning in .f(.x[[i]], ...) : data set ‘.x[[i]]’ not found
Warning in .f(.x[[i]], ...) :
data set ‘~(.x$weights * .x$obs)/sum(.x$weights)’ not found
Warning in .f(.x[[i]], ...) : data set ‘.x[[i]]’ not found
Warning in .f(.x[[i]], ...) :
data set ‘~(.x$weights * .x$obs)/sum(.x$weights)’ not found

As you can see, this results in a lot of warning messages. To understand map better, I tried multiplying the weights vector of each id by 2.

df %>% nest(data = c(weights, obs)) %>% map(data, ~ .x$weights*2)
$id
[1] ".x[[i]]" "~.x$weights * 2"

$data
[1] ".x[[i]]" "~.x$weights * 2"

Warning messages:
1: In .f(.x[[i]], ...) : data set ‘.x[[i]]’ not found
2: In .f(.x[[i]], ...) : data set ‘~.x$weights * 2’ not found
3: In .f(.x[[i]], ...) : data set ‘.x[[i]]’ not found
4: In .f(.x[[i]], ...) : data set ‘~.x$weights * 2’ not found

df %>% nest(data = c(weights, obs)) %>% map(data, function(x) x$weights*2)
Warning in .f(.x[[i]], ...) : data set ‘.x[[i]]’ not found
Warning in .f(.x[[i]], ...) :
data set ‘function(x) x$weights * 2’ not found
Warning in .f(.x[[i]], ...) : data set ‘.x[[i]]’ not found
Warning in .f(.x[[i]], ...) :
data set ‘function(x) x$weights * 2’ not found
$id
[1] ".x[[i]]" "function(x) x$weights * 2"

$data
[1] ".x[[i]]" "function(x) x$weights * 2"

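So I get the same messages here as well. Even after reading the documentation of map I am lost, and I do not see my mistake. I would be glad to get any insights!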

Thanks a lot!

Best answer

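We can call map inside mutate, because the data column is not accessible outside the data frame unless we use .$data. In the original call, the piped tibble becomes the first argument of map(), so data is taken as the function to apply and resolves to utils::data(). That function is then called once per column of the tibble, which is why the warnings say that a data set was not found.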

library(dplyr)
library(tidyr)  # provides nest()
library(purrr)
df %>%
  nest(data = c(weights, obs)) %>%
  # map_dbl() returns one numeric value per nested tibble
  mutate(wtd_mean = map_dbl(data, ~ sum(.x$weights * .x$obs) / sum(.x$weights)))

-output

# A tibble: 3 × 3
     id data             wtd_mean
  <dbl> <list>              <dbl>
1     1 <tibble [2 × 2]>     5.3
2     2 <tibble [2 × 2]>     3.25
3     3 <tibble [2 × 2]>     1.14
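
The .$data route mentioned above can be sketched as follows. This is only an illustration, not part of the original answer: the names nested, wtd_means and wtd_means_dot are made up here, and df is redefined so the snippet runs on its own. It pulls the list column out of the nested tibble explicitly before mapping over it, so map_dbl() receives the list of tibbles rather than the whole data frame.

```r
library(tibble)
library(tidyr)
library(dplyr)
library(purrr)

# Same data as in the question, repeated so the snippet is self-contained.
df <- tibble(id = c(1, 1, 2, 2, 3, 3),
             weights = c(0.3, 0.7, 0.25, 0.75, 0.14, 0.86),
             obs = 6:1)

nested <- df %>% nest(data = c(weights, obs))

# Outside mutate(), `data` is not the list column: it resolves to the
# function utils::data(), which is what map() ended up calling.
is.function(data)

# Option 1: pull the list column out, then map over its elements.
wtd_means <- nested %>%
  pull(data) %>%
  map_dbl(~ sum(.x$weights * .x$obs) / sum(.x$weights))

# Option 2: braces stop %>% from inserting the tibble as the first
# argument, so `.` can be used to reach the column as .$data.
wtd_means_dot <- nested %>%
  { map_dbl(.$data, ~ sum(.x$weights * .x$obs) / sum(.x$weights)) }
```

Both options return a plain numeric vector with one value per id, whereas the mutate() version keeps the result attached to the nested tibble.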

There is also the weighted.mean function from stats (base R):

df %>%
  nest(data = c(weights, obs)) %>%
  mutate(wtd_mean = map_dbl(data, ~ weighted.mean(.x$obs, .x$weights)))
# A tibble: 3 × 3
     id data             wtd_mean
  <dbl> <list>              <dbl>
1     1 <tibble [2 × 2]>     5.3
2     2 <tibble [2 × 2]>     3.25
3     3 <tibble [2 × 2]>     1.14
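
The question notes that group_by would also work. For comparison, a minimal sketch of that route is shown here; it is not from the original answer, the name by_group is made up for illustration, and df is redefined so the snippet runs on its own.

```r
library(tibble)
library(dplyr)

# Same data as in the question, repeated so the snippet is self-contained.
df <- tibble(id = c(1, 1, 2, 2, 3, 3),
             weights = c(0.3, 0.7, 0.25, 0.75, 0.14, 0.86),
             obs = 6:1)

# One row per id, without building a list column first.
by_group <- df %>%
  group_by(id) %>%
  summarise(wtd_mean = weighted.mean(obs, weights))
```

This skips the nesting step entirely, so it may be the simpler choice when the nested tibbles are not needed for anything else.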

Regarding "r - Using map with nested lists", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/71037019/
