
Remove elements of a vector that are substrings of another element

Reposted. Author: 行者123. Last updated: 2023-12-02 18:33:02

Is there a better way to achieve this? I want to remove from this vector every string that is a substring of another element.

words = c("please can you", 
"please can",
"can you",
"how did you",
"did you",
"have you")
> words
[1] "please can you" "please can" "can you" "how did you" "did you" "have you"

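library(data.table)
library(stringr)
# Build every (word1, word2) pair from the vector
dt = setDT(expand.grid(word1 = words, word2 = words, stringsAsFactors = FALSE))
# Flag pairs where word2 occurs inside word1 (str_detect treats word2 as a regex)
dt[, found := str_detect(word1, word2)]
# Drop every word that was found inside a different word
setdiff(words, dt[found == TRUE & word1 != word2, word2])
[1] "please can you" "how did you" "have you"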

This works, but it feels like overkill, and I would be interested to know a more elegant approach.

Best answer

Search for each component of words within words, and keep those that occur exactly once (that is, those matched only by themselves):

words[colSums(sapply(words, grepl, words, fixed = TRUE)) == 1]

giving:

[1] "please can you" "how did you"    "have you"   
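
To make the counting step easier to follow, here is a minimal sketch that wraps the same idea in a function. The helper name drop_substrings is hypothetical and not part of the original answer. It also assumes that exact duplicates should collapse to a single copy, because with the one-liner above two identical strings would each be counted twice and both would be dropped.

```r
# Hypothetical helper, not from the original answer.
drop_substrings <- function(x) {
  # Assumption: exact duplicates collapse to one copy before comparing.
  x <- unique(x)
  # With fewer than two elements there is nothing to compare against.
  if (length(x) < 2) return(x)
  # contains[i, j] is TRUE when x[i] contains x[j] as a literal substring
  # (fixed = TRUE disables regular-expression interpretation).
  contains <- vapply(
    x,
    function(pattern) grepl(pattern, x, fixed = TRUE),
    logical(length(x))
  )
  # Each string always matches itself, so a column sum of exactly 1 means
  # that no other element contains it.
  x[colSums(contains) == 1]
}

words <- c("please can you", "please can", "can you",
           "how did you", "did you", "have you")
drop_substrings(words)
```

Both this sketch and the one-liner compare every pair of elements, so the work grows quadratically with the length of the vector; that is fine for short vectors like this one.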

This question and answer come from a similar question on Stack Overflow: https://stackoverflow.com/questions/33202113/
