
r - How to limit the x-range of spline() interpolation to the first and last non-NA values in dplyr?

Reposted. Author: 行者123. Updated: 2023-12-02 04:32:53

I want to interpolate missing values using dplyr, pipes, and spline().

Data:

test <- structure(list(site = structure(c(3L, 3L, 3L, 3L, 3L, 3L, 1L, 
1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("lake", "stream", "wetland"
), class = "factor"), depth = c(0L, -3L, -4L, -8L, -10L, -14L,
0L, -1L, -3L, -5L, 0L, -2L, -4L, -6L), var1 = c(NA, 1L, 3L, NA,
6L, NA, 1L, 2L, NA, 4L, 1L, NA, NA, 4L), var2 = c(1L, NA, 3L,
4L, 8L, NA, NA, NA, NA, NA, NA, 2L, NA, NA)), .Names = c("site",
"depth", "var1", "var2"), class = "data.frame", row.names = c(NA,
-14L))

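Question 1: How can I keep the working code below but restrict the interpolation, for each variable, to the range between its first and last non-NA values? For example, for wetland it should interpolate var1 only at depth -8 and return NA at depths 0 and -14.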

library(tidyverse)

test_int <- test %>%
  group_by(site) %>%
  mutate_at(vars(c(var1, var2)),
            funs("i" = if (sum(!is.na(.)) > 1)
                         spline(x = depth, y = ., xout = depth)[["y"]]
                       else
                         NA))

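Question 2: Is there a way to bound my interpolated values between 0 and Inf? Or is that not appropriate for a spline (that is, should I use another interpolation method, such as smooth or loess)?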
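On Question 2, a minimal sketch under stated assumptions: a cubic spline is not constrained to stay non-negative, so one simple option is to clamp the interpolated values afterwards with pmax(). The helper name clamp_spline and its lower argument are hypothetical, introduced only for illustration, and the example uses the wetland var1 values from the data above.

```r
# Hypothetical helper (not from the original post): interpolate with spline(),
# then clamp the result so that no value falls below `lower`.
clamp_spline <- function(x, y, lower = 0) {
  ok <- !is.na(y)
  # With fewer than two observations there is nothing to interpolate.
  if (sum(ok) < 2) return(rep(NA_real_, length(y)))
  fitted <- spline(x = x[ok], y = y[ok], xout = x)[["y"]]
  pmax(fitted, lower)
}

# wetland var1 from the question's data
depth <- c(0, -3, -4, -8, -10, -14)
var1  <- c(NA, 1, 3, NA, 6, NA)

var1_clamped <- clamp_spline(depth, var1)
var1_clamped
```

Clamping only hides the overshoot; it does not change the shape of the fitted curve. If the observations are strictly monotone in depth, spline(..., method = "hyman") may be worth trying instead, since it fits a monotone spline that should not overshoot between observed points.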

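Best answer

Not pretty, but it does filter out the excess values. A side effect is that it also filters out interpolated values that fall outside the min/max of the observed values in each group.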

test_clean <-
  test %>%
  group_by(site) %>%
  mutate_at(vars(c(var1, var2)),
            funs(c("c" = if (sum(!is.na(.)) > 1)
                           spline(x = depth, y = ., xout = depth)[["y"]]
                         else NA),
                 "min" = min(., na.rm = TRUE),
                 "max" = max(., na.rm = TRUE)
            )
  ) %>%
  mutate(var1_i = if_else(var1_c >= var1_min & var1_c <= var1_max, var1_c, NA_real_),
         var2_i = if_else(var2_c >= var2_min & var2_c <= var2_max, var2_c, NA_real_)) %>%
  select(site:var2, ends_with("i"))

test_clean
# A tibble: 14 x 6
# Groups: site [3]
site depth var1 var2 var1_i var2_i
<fctr> <int> <int> <int> <dbl> <dbl>
1 wetland 0 NA 1 NA 1.000000
2 wetland -3 1 NA 1.0 3.078125
3 wetland -4 3 3 3.0 3.000000
4 wetland -8 NA 4 NA 4.000000
5 wetland -10 6 8 6.0 8.000000
6 wetland -14 NA NA NA NA
7 lake 0 1 NA 1.0 NA
8 lake -1 2 NA 2.0 NA
9 lake -3 NA NA 3.4 NA
10 lake -5 4 NA 4.0 NA
11 stream 0 1 NA 1.0 NA
12 stream -2 NA 2 2.0 NA
13 stream -4 NA NA 3.0 NA
14 stream -6 4 NA 4.0 NA

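To help anyone who wants to improve this, or to verify the steps on the way to the final data frame, here is the data frame including the intermediate steps: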

# A tibble: 14 x 12
# Groups: site [3]
site depth var1 var2 var1_c var2_c var1_min var2_min var1_max var2_max var1_i var2_i
<fctr> <int> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 wetland 0 NA 1 -7.5714286 1.000000 1 1 6 8 NA 1.000000
2 wetland -3 1 NA 1.0000000 3.078125 1 1 6 8 1.0 3.078125
3 wetland -4 3 3 3.0000000 3.000000 1 1 6 8 3.0 3.000000
4 wetland -8 NA 4 6.7142857 4.000000 1 1 6 8 NA 4.000000
5 wetland -10 6 8 6.0000000 8.000000 1 1 6 8 6.0 8.000000
6 wetland -14 NA NA -0.5714286 30.750000 1 1 6 8 NA NA
7 lake 0 1 NA 1.0000000 NA 1 Inf 4 -Inf 1.0 NA
8 lake -1 2 NA 2.0000000 NA 1 Inf 4 -Inf 2.0 NA
9 lake -3 NA NA 3.4000000 NA 1 Inf 4 -Inf 3.4 NA
10 lake -5 4 NA 4.0000000 NA 1 Inf 4 -Inf 4.0 NA
11 stream 0 1 NA 1.0000000 NA 1 2 4 2 1.0 NA
12 stream -2 NA 2 2.0000000 NA 1 2 4 2 2.0 NA
13 stream -4 NA NA 3.0000000 NA 1 2 4 2 3.0 NA
14 stream -6 4 NA 4.0000000 NA 1 2 4 2 4.0 NA
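The filter above works on the observed value range, which is why wetland var1 at depth -8 (6.71, above the observed maximum of 6) ends up NA even though that depth lies between two observations. A hedged alternative sketch that filters on depth instead, as Question 1 asks: interpolate from the non-NA points only, then blank every depth outside the first and last observed depth. The helper name spline_within_range is hypothetical, and across() assumes dplyr 1.0.0 or later; this is not the answer's method.

```r
library(dplyr)

# Same data as in the question, rebuilt compactly so the sketch is self-contained.
test <- data.frame(
  site  = factor(rep(c("wetland", "lake", "stream"), times = c(6, 4, 4))),
  depth = c(0, -3, -4, -8, -10, -14, 0, -1, -3, -5, 0, -2, -4, -6),
  var1  = c(NA, 1, 3, NA, 6, NA, 1, 2, NA, 4, 1, NA, NA, 4),
  var2  = c(1, NA, 3, 4, 8, NA, NA, NA, NA, NA, NA, 2, NA, NA)
)

# Hypothetical helper: spline-interpolate y over x, but return NA wherever x
# lies outside the range of x values that have an observed (non-NA) y.
spline_within_range <- function(x, y) {
  ok <- !is.na(y)
  # Fewer than two observations: nothing to interpolate between.
  if (sum(ok) < 2) return(rep(NA_real_, length(y)))
  out <- spline(x = x[ok], y = y[ok], xout = x)[["y"]]
  out[x < min(x[ok]) | x > max(x[ok])] <- NA_real_
  out
}

test_i <- test %>%
  group_by(site) %>%
  mutate(across(c(var1, var2),
                ~ spline_within_range(depth, .x),
                .names = "{.col}_i")) %>%
  ungroup()

test_i
```

With this approach wetland var1 is interpolated at depth -8 and left NA at depths 0 and -14, while the min and max helper columns are no longer needed. Groups with fewer than two observations (lake and stream for var2) stay NA throughout, as in the original code.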

Regarding "r - How to limit the x-range of spline() interpolation to the first and last non-NA values in dplyr?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46959574/
