
r - Using purrr to iteratively replace strings in a data frame column

Reposted. Author: 行者123. Updated: 2023-12-02 09:04:32

I want to use purrr to run several string replacements iteratively on a data frame column with the gsub() function.

Here is the sample data frame:

df <- data.frame(Year = "2019",
                 Text = c(rep("a aa", 5),
                          rep("a bb", 3),
                          rep("a cc", 2)))

> df
   Year Text
1  2019 a aa
2  2019 a aa
3  2019 a aa
4  2019 a aa
5  2019 a aa
6  2019 a bb
7  2019 a bb
8  2019 a bb
9  2019 a cc
10 2019 a cc

This is how I would normally run the string replacements, along with the desired result.

df$Text <- gsub("aa", "One", df$Text, fixed = TRUE)
df$Text <- gsub("bb", "Two", df$Text, fixed = TRUE)
df$Text <- gsub("cc", "Three", df$Text, fixed = TRUE)

> df
   Year    Text
1  2019   a One
2  2019   a One
3  2019   a One
4  2019   a One
5  2019   a One
6  2019   a Two
7  2019   a Two
8  2019   a Two
9  2019 a Three
10 2019 a Three

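However, this approach becomes impractical as the list of replacements grows, so I tried to use purrr to iterate over a list of patterns and replacements, but I only managed to produce warning messages. I would like the code to walk through text_pattern and text_replacement and run gsub on df$Text for each pattern/replacement pair. My attempt and the resulting messages are below.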

text_pattern <- c("aa", "bb", "cc")
text_replacement <- c("One", "Two", "Three")

walk2(text_pattern, text_replacement, function(...){
  gsub(text_pattern, text_replacement, df$Text, fixed = F)
})

Warning messages:
1: In gsub(text_former, text_replace, df$Text, fixed = F) :
argument 'pattern' has length > 1 and only the first element will be used
2: In gsub(text_former, text_replace, df$Text, fixed = F) :
argument 'replacement' has length > 1 and only the first element will be used
3: In gsub(text_former, text_replace, df$Text, fixed = F) :
argument 'pattern' has length > 1 and only the first element will be used
4: In gsub(text_former, text_replace, df$Text, fixed = F) :
argument 'replacement' has length > 1 and only the first element will be used
5: In gsub(text_former, text_replace, df$Text, fixed = F) :
argument 'pattern' has length > 1 and only the first element will be used
6: In gsub(text_former, text_replace, df$Text, fixed = F) :
argument 'replacement' has length > 1 and only the first element will be used

Can this be done with a function from purrr? Or am I using the wrong tool, and should I use a different function?

Best answer

We can use reduce2:

library(purrr)
library(stringr)
df$Text <- reduce2(text_pattern, text_replacement, ~ str_replace(..1, ..2, ..3),
                   .init = df$Text)
df$Text
#[1] "a One" "a One" "a One" "a One" "a One" "a Two" "a Two" "a Two" "a Three" "a Three"
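
To make the reduce2 call above more concrete, here is a rough sketch of what it amounts to, written as a plain loop in base R. The variable name `result` is only illustrative; it plays the role of the accumulator that reduce2 passes along as `..1`. The sketch uses gsub with fixed = TRUE, as in the question, rather than str_replace. The two differ in general (str_replace treats the pattern as a regular expression and replaces only the first match in each string, while gsub replaces every match), but on this sample data they give the same result.

```r
# Same sample data as in the question
df <- data.frame(Year = "2019",
                 Text = c(rep("a aa", 5), rep("a bb", 3), rep("a cc", 2)),
                 stringsAsFactors = FALSE)
text_pattern <- c("aa", "bb", "cc")
text_replacement <- c("One", "Two", "Three")

# `result` (illustrative name) is the accumulator: it starts as the original
# column and each step's output becomes the next step's input.
result <- df$Text
for (i in seq_along(text_pattern)) {
  result <- gsub(text_pattern[i], text_replacement[i], result, fixed = TRUE)
}
df$Text <- result
```

The key point is that each replacement is applied to the output of the previous one, which is what the walk2 attempt in the question was missing: its anonymous function ignored the per-pair arguments and its return value was discarded.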

Alternatively, pass str_replace to reduce2 directly, without an anonymous function:

reduce2(text_pattern, text_replacement, .init = df$Text, str_replace)
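
If stringr is already loaded, another option that may be worth considering (not part of the original answer) is str_replace_all, which accepts a named character vector whose names are the patterns and whose values are the replacements. The sketch below assumes the patterns are safe to treat as regular expressions, which holds for the plain strings in this example; `replacements` is an illustrative name.

```r
library(stringr)

# Same sample data as in the question
df <- data.frame(Year = "2019",
                 Text = c(rep("a aa", 5), rep("a bb", 3), rep("a cc", 2)),
                 stringsAsFactors = FALSE)
text_pattern <- c("aa", "bb", "cc")
text_replacement <- c("One", "Two", "Three")

# Build a named vector of the form c(pattern = replacement);
# the names are interpreted as regular expressions.
replacements <- setNames(text_replacement, text_pattern)

# Replace every match of every pattern in one call
df$Text <- str_replace_all(df$Text, replacements)
```

Unlike str_replace, this replaces all matches in each string, so it is closer in behaviour to the gsub calls in the question.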

Regarding r - Using purrr to iteratively replace strings in a data frame column, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/60046482/
