r - Can I combine dplyr mutate_at and mutate_if statements?

Reposted. Author: 行者123. Updated: 2023-12-04 15:29:47

I have the following example data:

   country country-year year     a     b
1 France France2000 2000 NA NA
2 France France2001 2001 1000 1000
3 France France2002 2002 NA NA
4 France France2003 2003 1600 2200
5 France France2004 2004 NA NA
6 UK UK2000 2000 1000 1000
7 UK UK2001 2001 NA NA
8 UK UK2002 2002 1000 1000
9 UK UK2003 2003 NA NA
10 UK UK2004 2004 NA NA
11 Germany UK2000 2000 NA NA
12 Germany UK2001 2001 NA NA
13 Germany UK2002 2002 NA NA
14 Germany UK2003 2003 NA NA
15 Germany UK2004 2004 NA NA

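I want to interpolate my data (but not extrapolate) and remove the rows where columns a and b are both NA. In other words, I want to drop every row that cannot be interpolated; in the example: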
1  France  France2000        NA    NA
5 France France2004 NA NA
9 UK UK2003 NA NA
10 UK UK2004 NA NA
11 Germany UK2000 NA NA
12 Germany UK2001 NA NA
13 Germany UK2002 NA NA
14 Germany UK2003 NA NA
15 Germany UK2004 NA NA
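
To make "interpolate but not extrapolate" concrete, here is a minimal sketch on a single column, using base R's approx() rather than anything from the question itself. The vector x is France's column a from the table above; the names x, idx, ok and y are only illustrative. With rule = 1, approx() returns NA for positions outside the range of the observed values, so leading and trailing gaps stay NA while interior gaps are filled linearly.

```r
# France's column a for 2000-2004, copied from the example table
x <- c(NA, 1000, NA, 1600, NA)

idx <- seq_along(x)   # positions 1..5 stand in for the years
ok  <- !is.na(x)      # which positions hold an observed value

# rule = 1: positions outside the observed range are returned as NA,
# so nothing is extrapolated; interior gaps are linearly interpolated
y <- approx(idx[ok], x[ok], xout = idx, rule = 1)$y

print(y)
# NA 1000 1300 1600 NA
```

The rows that remain NA after this step (2000 and 2004 for France) are exactly the ones listed above as rows to drop.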

There are two options that almost do what I want:
library(tidyverse)
library(zoo)
df %>%
group_by(country) %>%
mutate_at(vars(a:b),~na.fill(.x,c(NA, "extend", NA))) %>%
filter(!is.na(a) | !is.na(b))


df%>% 
group_by(country)%>%
mutate_if(is.numeric,~if(all(is.na(.x))) NA else na.fill(.x,"extend"))

Is it possible to combine these into something like this:
df <- df%>%
group_by(country)%>%
mutate_at(vars(a:b),~if(all(is.na(.x))) NA else(.x,c(NA, "extend", NA)))
filter(!is.na(df$a | df$a))

Desired output:
   country country-year    a     b 
2 France France2001 1000 1000
3 France France2002 1300 1600
4 France France2003 1600 2200
6 UK UK2000 1000 1000
7 UK UK2001 0 0
8 UK UK2002 1000 1000

Best answer

I know this does not directly answer how to combine mutate_if and mutate_at, but it solves your general problem:

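I first drop every country for which a and b are missing in all rows. Then, for each remaining country, I determine the earliest and latest year in which a and b are not both missing. After filtering to that range of years, I use na.fill.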

library(dplyr)
library(readr)
library(zoo)

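country_data %>%
  # extract the numeric year from the label; as.character() guards against factor input
  mutate(Year = parse_number(as.character(`country-year`))) %>%
  group_by(country) %>%
  # drop countries where a and b are missing in every row
  mutate(not_all_na = any(!(is.na(a) & is.na(b)))) %>%
  filter(not_all_na) %>%
  # first and last year with at least one observed value
  mutate(Year_min_not_na = min(Year[!(is.na(a) & is.na(b))]),
         Year_max_not_na = max(Year[!(is.na(a) & is.na(b))])) %>%
  # keep only the years inside that range, so nothing is extrapolated
  filter(Year >= Year_min_not_na, Year <= Year_max_not_na) %>%
  # interpolate the remaining interior gaps
  mutate_at(vars(a:b), ~na.fill(.x, "extend"))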

# A tibble: 6 x 8
# Groups: country [2]
# country `country-year` a b Year not_all_na Year_min_not_na Year_max_not_na
# <fct> <fct> <dbl> <dbl> <dbl> <lgl> <dbl> <dbl>
# 1 France France2001 1000 1000 2001 TRUE 2001 2003
# 2 France France2002 1300 1600 2002 TRUE 2001 2003
# 3 France France2003 1600 2200 2003 TRUE 2001 2003
# 4 UK UK2000 1000 1000 2000 TRUE 2000 2002
# 5 UK UK2001 1000 1000 2001 TRUE 2000 2002
# 6 UK UK2002 1000 1000 2002 TRUE 2000 2002
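
For comparison, here is a hedged sketch of a shorter variant that is not part of the original answer. It assumes dplyr 1.0.0 or later (for across()) and uses zoo::na.approx() with na.rm = FALSE, which leaves leading and trailing NAs in place. The helper interp_inside() and the toy data frame are hypothetical names introduced for illustration; the values are copied from the question's table, with a plain year column instead of country-year. Unlike the answer above, each column is interpolated on its own and a row is dropped only if both a and b are still NA afterwards.

```r
library(dplyr)
library(zoo)

# Toy data mirroring the question's table (hypothetical reconstruction)
country_data <- data.frame(
  country = rep(c("France", "UK", "Germany"), each = 5),
  year    = rep(2000:2004, times = 3),
  a = c(NA, 1000, NA, 1600, NA, 1000, NA, 1000, NA, NA, rep(NA, 5)),
  b = c(NA, 1000, NA, 2200, NA, 1000, NA, 1000, NA, NA, rep(NA, 5))
)

# Hypothetical helper: fill interior gaps only, never extrapolate.
# Groups with fewer than two observed values are returned unchanged.
interp_inside <- function(x) {
  if (sum(!is.na(x)) < 2) return(as.numeric(x))
  na.approx(x, na.rm = FALSE)
}

result <- country_data %>%
  group_by(country) %>%
  mutate(across(a:b, interp_inside)) %>%  # interpolate within each country
  ungroup() %>%
  filter(!(is.na(a) & is.na(b)))          # drop rows that could not be filled

print(result)
```

On this example the result has six rows (France 2001 to 2003 and UK 2000 to 2002), matching the output above. On data where a and b have different observed ranges, the two approaches can differ, because na.fill(.x, "extend") also fills the ends of a column within the kept years.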

Data
country_data <- 
structure(list(country = structure(c(1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L),
.Label = c("France", "Germany", "UK"), class = "factor"),
`country-year` = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 6L, 7L, 8L, 9L, 10L),
.Label = c("France2000", "France2001", "France2002", "France2003",
"France2004", "UK2000", "UK2001", "UK2002", "UK2003", "UK2004"),
class = "factor"),
a = c(NA, 1000L, NA, 1600L, NA, 1000L, NA, 1000L, NA, NA, NA, NA, NA, NA, NA),
b = c(NA, 1000L, NA, 2200L, NA, 1000L, NA, 1000L, NA, NA, NA, NA, NA, NA, NA)),
class = "data.frame", row.names = c(NA, -15L))

Regarding "r - Can I combine dplyr mutate_at and mutate_if statements?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51877611/
