
r - How to compare a list of words (chr) with values in multiple columns of a data frame and output a binary response when there is a match in R

Reposted. Author: 行者123. Updated: 2023-11-30 09:15:44

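I want to compare each word in the words column with the values in columns V1 through V576, row by row. If any word in the words column matches the word in a V column, that V-column value should be replaced with 1; if there is no match, it should be replaced with 0. Any idea how to do this? I don't know how to loop over all the rows and columns.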

The data frame is called data. The words column is a list ($words : List of 42201). There are 42201 rows and about 576 columns of words to compare against (V1 to V576).

Here is the dput output for just the first 3 rows and the first 20 columns.

structure(list(id = c("Te-1", "Te-2", "Te-3"), category = c("Fabric Care", 
"Fabric Care", "Home Care"), brand = c("Tide", "Tide", "Cascade"
), sub_category = c("Laundry", "Laundry", "Auto Dishwashing"),
market = c("US", "US", "US"), review_title = c("the best in a very crowded market",
"first time", "i have been using another well known brand and did not expect "
), review_text = c("the best general wash detergent convenient container that keeps the product driy ",
"this helped to clean our washing machine after getting it from someone else this review was collected as part of a promotion ",
"i have been using another well known brand and did not expect much difference wow was i ever mistaken i will never go back "
), review_rating = c(5L, 5L, 5L), words = list(c("the", "best",
"general", "wash", "deterg", "conveni", "contain", "that",
"keep", "the", "product", "driy"), c("this", "help", "to",
"clean", "our", "wash", "machin", "after", "get", "it", "from",
"someon", "els", "this", "review", "was", "collect", "as",
"part", "of", "a", "promot"), c("i", "have", "been", "use",
"anoth", "well", "known", "brand", "and", "did", "not", "expect",
"much", "differ", "wow", "was", "i", "ever", "mistaken",
"i", "will", "never", "go", "back")), V1 = c("absolut", "absolut",
"absolut"), V2 = c("action", "action", "action"), V3 = c("actionpac",
"actionpac", "actionpac"), V4 = c("actual", "actual", "actual"
), V5 = c("addit", "addit", "addit"), V6 = c("adverti", "adverti",
"adverti"), V7 = c("afford", "afford", "afford"), V8 = c("agent",
"agent", "agent"), V9 = c("allerg", "allerg", "allerg"),
V10 = c("allergi", "allergi", "allergi"), V11 = c("alon",
"alon", "alon")), row.names = c(NA, -3L), class = c("data.table",
"data.frame"))

Please see the data frame snippet below for a better understanding of my problem.

CLICK HERE TO SEE THE DATA TABLE

Thank you very much for your help!

Best answer

I created a small sample data frame.

Data

data <- data.frame(words = c("the, best, general", "i, have, been"),
                   v1 = c("best", "no"),
                   v2 = c("have", "nothing"),
                   stringsAsFactors = FALSE)
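The question stores words as a list column, while the sample data here holds each row's words as one comma-separated string. One possible way to get from the first form to the second, sketched under the assumption that each list element is a character vector (the names words_list and words_chr are illustrative, not from the original post):

```r
# Hypothetical list column shaped like the question's `words`
words_list <- list(c("the", "best", "general"), c("i", "have", "been"))

# Collapse each character vector into a single comma-separated string
words_chr <- vapply(words_list, paste, character(1), collapse = ", ")
```

vapply() is used instead of sapply() so the result is guaranteed to be a character vector with one string per row.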

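Using nested for loops, I pass each cell value to grepl() as the pattern and the row's words as the text. Wherever there is a match the cell becomes 1, otherwise 0.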

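# Loop over every comparison column (column 1 holds the words)
for (i in 2:ncol(data)) {
  for (j in 1:nrow(data)) {
    y  <- data$words[j]  # this row's words as one comma-separated string
    ab <- data[j, i]     # the candidate word in this cell
    # grepl() matches substrings; fixed = TRUE treats ab literally, not as a regex
    abc <- grepl(ab, y, fixed = TRUE)
    # write 1 on a match and 0 otherwise
    data[j, i] <- ifelse(abc, 1, 0)
  }
}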

Result

print(data)
               words v1 v2
1 the, best, general  1  0
2      i, have, been  0  0
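The loop above relies on substring matching against a comma-separated string, so a short word such as "no" would also count as a match inside "nothing". The question's words column is a list of character vectors, which allows exact whole-word matching with %in% instead. A minimal sketch, assuming the layout described in the question (a list column named words and comparison columns named V1, V2, ...); the toy data frame and the names df, v_cols, and hits are illustrative and not part of the original post:

```r
# Toy data mirroring the question's layout: a list column plus V columns
df <- data.frame(id = c("Te-1", "Te-2"), stringsAsFactors = FALSE)
df$words <- list(c("the", "best", "general"), c("i", "have", "been"))
df$V1 <- c("best", "best")
df$V2 <- c("have", "have")

# Names of the comparison columns (V1 ... V576 in the full data)
v_cols <- grep("^V[0-9]+$", names(df), value = TRUE)

# For each V column, check row by row whether the cell's word
# appears in that row's word vector, and store the result as 0/1
hits <- lapply(df[v_cols], function(col) {
  as.integer(mapply(function(word, row_words) word %in% row_words,
                    col, df$words))
})

# Overwrite the V columns with the 0/1 indicators
df[v_cols] <- hits
```

Here V1 ends up as 1, 0 and V2 as 0, 1, because "best" occurs only in the first row's words and "have" only in the second. Working column by column keeps the number of R-level loops at the number of V columns rather than rows times columns, which may matter at 42201 rows by 576 columns, though this has not been timed on the full data.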

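Regarding "r - How to compare a list of words (chr) with values in multiple columns of a data frame and output a binary response when there is a match in R", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/56463306/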
