
r - Extract one value from a group into a new column

Reposted. Author: 行者123. Updated: 2023-12-02 02:38:06

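I am trying to extract one value from each group in a long-format data structure and spread it into a column. It is best explained with an example; see the sample data below. In this case, I want to extract the value of c and copy it into a new column according to the grouping present in the data (year, month and location).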

I am looking for an elegant way to achieve this; a tidyverse solution would be ideal.

  year month location group value
2019 1 top a 1
2019 1 top b 2
2019 1 top c 3
2019 1 bottom a 4
2019 1 bottom b 5
2019 1 bottom c 6
2019 2 top a 7
2019 2 top b 8
2019 2 top c 9
2019 2 bottom a 10
2019 2 bottom b 11
2019 2 bottom c 12

This is the expected output:

  year month location group value c_value
2019 1 top a 1 3
2019 1 top b 2 3
2019 1 top c 3 3
2019 1 bottom a 4 6
2019 1 bottom b 5 6
2019 1 bottom c 6 6
2019 2 top a 7 9
2019 2 top b 8 9
2019 2 top c 9 9
2019 2 bottom a 10 12
2019 2 bottom b 11 12
2019 2 bottom c 12 12

Data:

structure(list(year = c(2019, 2019, 2019, 2019, 2019, 2019, 2019, 
2019, 2019, 2019, 2019, 2019), month = c(1, 1, 1, 1, 1, 1, 2,
2, 2, 2, 2, 2), location = c("top", "top", "top", "bottom", "bottom",
"bottom", "top", "top", "top", "bottom", "bottom", "bottom"),
group = c("a", "b", "c", "a", "b", "c", "a", "b", "c", "a",
"b", "c"), value = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
)), row.names = c(NA, -12L), class = c("tbl_df", "tbl", "data.frame"
))

Edit:

I did come up with a two-part solution, but I still think there is a better way.

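library(dplyr)

# Step 1: build a lookup table holding the "c" value for each group
lookup <- df %>%
  group_by(year, month, location) %>%
  filter(group == "c") %>%
  summarize(c_value = value)

# Step 2: join the lookup back onto the original data
df %>%
  left_join(lookup, by = c("year", "month", "location"))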

Best answer

dat %>%
  group_by(year, month, location) %>%
  mutate(c_value = value[group == "c"][1]) %>%
  ungroup()
# # A tibble: 12 x 6
# year month location group value c_value
# <dbl> <dbl> <chr> <chr> <dbl> <dbl>
# 1 2019 1 top a 1 3
# 2 2019 1 top b 2 3
# 3 2019 1 top c 3 3
# 4 2019 1 bottom a 4 6
# 5 2019 1 bottom b 5 6
# 6 2019 1 bottom c 6 6
# 7 2019 2 top a 7 9
# 8 2019 2 top b 8 9
# 9 2019 2 top c 9 9
# 10 2019 2 bottom a 10 12
# 11 2019 2 bottom b 11 12
# 12 2019 2 bottom c 12 12

The appended [1] guards against two failure cases:

  1. "c" is not found:

    dat %>%
      group_by(year, month, location) %>%
      mutate(c_value = value[group == "d"]) %>%
      ungroup()
    # Error: Problem with `mutate()` input `c_value`.
    # x Input `c_value` can't be recycled to size 3.
    # i Input `c_value` is `value[group == "d"]`.
    # i Input `c_value` must be size 3 or 1, not 0.
    # i The error occured in group 1: year = 2019, month = 1, location = "bottom".
  2. More than one "c" is found:

    dat$group[2] <- "c"
    dat %>%
      group_by(year, month, location) %>%
      mutate(c_value = value[group == "c"]) %>%
      ungroup()
    # Error: Problem with `mutate()` input `c_value`.
    # x Input `c_value` can't be recycled to size 3.
    # i Input `c_value` is `value[group == "c"]`.
    # i Input `c_value` must be size 3 or 1, not 2.

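Both are mitigated by [1], although in the second case the extra matches are silently dropped and only the first is kept. With the original data: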

dat %>%
  group_by(year, month, location) %>%
  mutate(c_value = value[group == "d"][1]) %>%
  ungroup()
# # A tibble: 12 x 6
# year month location group value c_value
# <dbl> <dbl> <chr> <chr> <dbl> <dbl>
# 1 2019 1 top a 1 NA
# 2 2019 1 top b 2 NA
# 3 2019 1 top c 3 NA
# 4 2019 1 bottom a 4 NA
# 5 2019 1 bottom b 5 NA
# 6 2019 1 bottom c 6 NA
# 7 2019 2 top a 7 NA
# 8 2019 2 top b 8 NA
# 9 2019 2 top c 9 NA
# 10 2019 2 bottom a 10 NA
# 11 2019 2 bottom b 11 NA
# 12 2019 2 bottom c 12 NA

dat$group[2] <- "c"
dat %>%
  group_by(year, month, location) %>%
  mutate(c_value = value[group == "c"][1]) %>%
  ungroup()
# # A tibble: 12 x 6
# year month location group value c_value
# <dbl> <dbl> <chr> <chr> <dbl> <dbl>
# 1 2019 1 top a 1 2
# 2 2019 1 top c 2 2
# 3 2019 1 top c 3 2
# 4 2019 1 bottom a 4 6
# 5 2019 1 bottom b 5 6
# 6 2019 1 bottom c 6 6
# 7 2019 2 top a 7 9
# 8 2019 2 top b 8 9
# 9 2019 2 top c 9 9
# 10 2019 2 bottom a 10 12
# 11 2019 2 bottom b 11 12
# 12 2019 2 bottom c 12 12
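
Where silently keeping only the first match is not acceptable, the check can be made explicit. The following is a minimal sketch, assuming dplyr is installed; the helper name pick_one() is hypothetical (it is not part of dplyr or of the answer above), and the sample data is rebuilt with tibble() so the snippet runs on its own. It returns NA when nothing matches and stops with an error when more than one value matches.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): return the single element of x,
# NA if x is empty, and fail loudly if x has more than one element.
pick_one <- function(x) {
  if (length(x) > 1L) {
    stop("expected at most one match, found ", length(x))
  }
  x[1]  # indexing an empty vector with [1] yields NA of the same type
}

# Sample data rebuilt from the question
dat <- tibble(
  year     = 2019,
  month    = rep(c(1, 2), each = 6),
  location = rep(rep(c("top", "bottom"), each = 3), times = 2),
  group    = rep(c("a", "b", "c"), times = 4),
  value    = as.numeric(1:12)
)

res <- dat %>%
  group_by(year, month, location) %>%
  mutate(c_value = pick_one(value[group == "c"])) %>%
  ungroup()
```

With this helper, the duplicated "c" case raises an error instead of quietly picking the first value, while the missing case still produces NA as with [1].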

The left_join alternative can be written a bit more briefly:

filter(dat, group == "c") %>%
  select(-group, c_value = value) %>%
  left_join(dat, ., by = c("year", "month", "location"))
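
As a quick sanity check, the grouped mutate and the join can be compared on the sample data. This is a minimal sketch, assuming dplyr is installed; the names lookup, by_mutate and by_join are chosen here for illustration, and the join is written with a named lookup table instead of the . placeholder purely for readability.

```r
library(dplyr)

# Sample data rebuilt from the question
dat <- tibble(
  year     = 2019,
  month    = rep(c(1, 2), each = 6),
  location = rep(rep(c("top", "bottom"), each = 3), times = 2),
  group    = rep(c("a", "b", "c"), times = 4),
  value    = as.numeric(1:12)
)

# Approach 1: grouped mutate, taking the first "c" value in each group
by_mutate <- dat %>%
  group_by(year, month, location) %>%
  mutate(c_value = value[group == "c"][1]) %>%
  ungroup()

# Approach 2: build a lookup of the "c" rows, then join it back
lookup <- dat %>%
  filter(group == "c") %>%
  select(-group, c_value = value)

by_join <- left_join(dat, lookup, by = c("year", "month", "location"))

# Compare the new column produced by each approach
identical(by_mutate$c_value, by_join$c_value)
```

The two approaches differ when a group contains more than one "c" row: the join then returns one output row per match, so the result has more rows than the input, whereas [1] keeps only the first match.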

Regarding r - Extract one value from a group into a new column, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/64033341/
