
r - Conditionally replace values in multiple columns based on a string match in a separate column

Reposted. Author: 行者123. Updated: 2023-12-05 02:06:40

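I'm trying to conditionally replace values in multiple columns based on a string match in a different column, and I'd like to do it in a single line of code using the across() function. However, I keep getting errors that don't make much sense to me. I suspect the fix is simple, so if anyone can point me in the right direction, that would be great!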

library(dplyr)    # mutate(), across(), %>%
library(stringr)  # str_detect()

df <- data.frame("type" = c("Park", "Neighborhood", "Airport", "Park", "Neighborhood", "Neighborhood"),
                 "total" = c(34, 56, 75, 89, 21, 56),
                 "group_a" = c(30, 26, 45, 60, 3, 46),
                 "group_b" = c(4, 30, 30, 29, 18, 10))

# working but not concise
df %>%
mutate(total = ifelse(str_detect(type, "Park"), NA, total),
group_a = ifelse(str_detect(type, "Park"), NA, group_a),
group_b = ifelse(str_detect(type, "Park"), NA, group_b))


# concise but not working
df %>% mutate(across(total, group_a, group_b), ifelse(str_detect(type, "Park"), NA, .))

Update

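We got a solution that works on my dummy dataset but not on my real data, so I'm sharing a small piece of my real data frame, with the numbers changed and the organization names hidden. When I run this line of code (df %>% mutate(across(c(Attempts, Canvasses, Completes)), ~ifelse(str_detect(long_name, "park-cemetery"), NA, .))) on this data, I get the following error message: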

Error: Problem with mutate() input ..2.
x Input ..2 must be a vector, not a formula object.
i Input ..2 is ~ifelse(str_detect(long_name, "park-cemetery"), NA, .).

Here is a small sample of the data that produces this error:

df <- structure(list(Org = c("OrgName", "OrgName", "OrgName", "OrgName", 
"OrgName", "OrgName", "OrgName", "OrgName", "OrgName", "OrgName"
), nCode = c("M34", "R36", "R46", "X29", "M31", "K39", "Q12",
"Q39", "X41", "K27"), Attempts = c(100, 100, 100, 100, 100, 100,
100, 100, 100, 100), Canvasses = c(80, 80, 80, 80, 80, 80, 80,
80, 80, 80), Completes = c(50, 50, 50, 50, 50, 50, 50, 50, 50,
50), van_nocc_id = c(999, 999, 999, 999, 999, 999, 999, 999,
999, 999), van_name = c("M-Upper West Side", "SI-Rosebank", "SI-Tottenville",
"BX-park-cemetery-etc-Bronx", "M-Stuyvesant Town-Cooper Village",
"BK-Kensington", "Q-Broad Channel", "Q-Lindenwood", "BX-Wakefield",
"BK-East New York"), boro_short = c("M", "SI", "SI", "BX", "M",
"BK", "Q", "Q", "BX", "BK"), long_name = c("Upper West Side",
"Rosebank", "Tottenville", "park-cemetery-etc-Bronx", "Stuyvesant Town-Cooper Village",
"Kensington", "Broad Channel", "Lindenwood", "Wakefield", "East New York"
)), row.names = c(NA, -10L), class = "data.frame")

Final update

The curse of the misplaced closing parenthesis! Thanks for the help, everyone... The correct solution is df %>% mutate(across(c(Attempts, Canvasses, Completes), ~ifelse(str_detect(long_name, "park-cemetery"), NA, . )))
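
To make the effect of that one parenthesis concrete, here is a minimal sketch on a made-up two-row data frame. The `toy` object and its values are assumptions for illustration, not the real data. In the failing call, across() is closed before the lambda, so the formula reaches mutate() as a separate argument (the "..2" in the error message) instead of being passed to across() as the function to apply.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data, shaped like the sample above
toy <- data.frame(
  long_name = c("Rosebank", "park-cemetery-etc-Bronx"),
  Attempts  = c(100, 100),
  Completes = c(50, 50)
)

# Misplaced parenthesis: across() ends after the column selection, so the
# formula becomes a second argument to mutate(), which rejects it.
wrong <- tryCatch(
  toy %>% mutate(across(c(Attempts, Completes)),
                 ~ifelse(str_detect(long_name, "park-cemetery"), NA, .)),
  error = function(e) e
)
inherits(wrong, "error")  # TRUE: the call failed

# Correct: the lambda is the second argument of across()
right <- toy %>%
  mutate(across(c(Attempts, Completes),
                ~ifelse(str_detect(long_name, "park-cemetery"), NA, .)))
right$Attempts  # 100 NA
```

Depending on the installed dplyr version, the failing call may also emit a deprecation warning about calling across() without a function; the error itself is what stops the pipeline.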

Best answer

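If you use the newly introduced function across (which is the right way to approach this task), you have to specify the function you want to apply inside across itself. In this case, the function ifelse(...) has to be a purrr-style lambda (which is why it starts with ~). Check the across documentation and look for the arguments .cols and .fns.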

df %>% 
mutate(across(c(total, group_a, group_b), ~ifelse(str_detect(type, "Park"), NA, .)))

Output

#           type total group_a group_b
# 1         Park    NA      NA      NA
# 2 Neighborhood    56      26      30
# 3      Airport    75      45      30
# 4         Park    NA      NA      NA
# 5 Neighborhood    21       3      18
# 6 Neighborhood    56      46      10
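
As a supplementary sketch (not part of the original answer), the function passed to across() can be written in more than one way, and the columns can be selected in more than one way. The variable names `with_lambda`, `with_function` and `with_range` below are made up for illustration; the data is the dummy data frame from the question.

```r
library(dplyr)
library(stringr)

df <- data.frame(
  type    = c("Park", "Neighborhood", "Airport", "Park", "Neighborhood", "Neighborhood"),
  total   = c(34, 56, 75, 89, 21, 56),
  group_a = c(30, 26, 45, 60, 3, 46),
  group_b = c(4, 30, 30, 29, 18, 10)
)

# purrr-style lambda; inside it, .x (or .) stands for the current column
with_lambda <- df %>%
  mutate(across(c(total, group_a, group_b),
                ~ifelse(str_detect(type, "Park"), NA, .x)))

# The same logic written as an ordinary anonymous function
with_function <- df %>%
  mutate(across(c(total, group_a, group_b),
                function(x) ifelse(str_detect(type, "Park"), NA, x)))

# Selecting the adjacent columns as a range instead of naming each one
with_range <- df %>%
  mutate(across(total:group_b,
                ~ifelse(str_detect(type, "Park"), NA, .x)))

all.equal(with_lambda, with_function)
all.equal(with_lambda, with_range)
```

The range form relies on total, group_a and group_b being adjacent in the data frame, which holds for this dummy data but is an assumption for any other table.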

Regarding "r - Conditionally replace values in multiple columns based on a string match in a separate column", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/62578575/
