
Remove the words in a character vector from a string

Reposted. Author: 行者123. Updated: 2023-12-04 02:39:34

I have a character vector of stop words in R:

stopwords = c("a" ,
"able" ,
"about" ,
"above" ,
"abst" ,
"accordance" ,
...
"yourself" ,
"yourselves" ,
"you've" ,
"z" ,
"zero")

Suppose I have the string:
str <- c("I have zero a accordance")
How can I remove the stop words I defined from str?

I think gsub or another grep-family tool might be a good candidate for this, though other suggestions are welcome.

Best answer

Try this:

str <- c("I have zero a accordance")

stopwords = c("a", "able", "about", "above", "abst", "accordance", "yourself",
"yourselves", "you've", "z", "zero")

x <- unlist(strsplit(str, " "))

x <- x[!x %in% stopwords]

paste(x, collapse = " ")

# [1] "I have"
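Note that `%in%` does exact, case-sensitive matching, so a token such as "Zero" would survive the filter above. As a minimal sketch (the capitalized input and the shortened stop-word list are illustrative assumptions, not part of the original question), the tokens can be lower-cased for the comparison only:

```r
# Illustrative input: same sentence as above, but with "Zero" capitalized
str <- "I have Zero a accordance"
stopwords <- c("a", "accordance", "zero")

# Split on single spaces, as in the answer above
x <- unlist(strsplit(str, " "))

# Lower-case the tokens only for the comparison, so the output keeps its original casing
result <- paste(x[!tolower(x) %in% stopwords], collapse = " ")
result
# [1] "I have"
```

Punctuation attached to a word (for example "zero,") would still not match; handling that is left out of this sketch.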

Addendum: writing a "removeWords" function is simple, so there is no need to load an external package for this:
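removeWords <- function(str, stopwords) {
  # Split the string on single spaces into individual words
  x <- unlist(strsplit(str, " "))
  # Keep only the words that are not stop words and rejoin them
  paste(x[!x %in% stopwords], collapse = " ")
}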

removeWords(str, stopwords)
# [1] "I have"
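Since the question mentions gsub, here is a hedged sketch of a regex-based alternative. The name `removeWordsRegex` is hypothetical, and the approach assumes the stop words contain no regex metacharacters (they are pasted into the pattern unescaped). Unlike `removeWords` above, which flattens a vector of several strings into one result, this version returns one cleaned string per input element:

```r
# Hypothetical helper, not from the original answer
removeWordsRegex <- function(str, stopwords) {
  # Build one alternation with word boundaries, e.g. "\\b(a|accordance|zero)\\b"
  pattern <- paste0("\\b(", paste(stopwords, collapse = "|"), ")\\b")
  out <- gsub(pattern, "", str, perl = TRUE)
  # Collapse the runs of whitespace left behind and trim both ends
  trimws(gsub("\\s+", " ", out))
}

cleaned <- removeWordsRegex(
  c("I have zero a accordance", "zero is a number"),
  c("a", "accordance", "zero")
)
cleaned
# [1] "I have"    "is number"
```

The word boundaries keep "a" from being removed inside "have". One caveat: an apostrophe counts as a boundary, so a stop word "a" would also strip the "a" from a token like "a's".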

Regarding removing the words in a character vector from a string, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/35790652/
