
r - Count occurrences of a set of words in R

Reposted. Author: 行者123. Updated: 2023-12-02 16:18:04

Suppose I have a dataset such as:

Col1
Mon,Tues,Wed,Thurs,Fri
Mon,Tues,Wed,Thurs
Mon,Tues,Wed
Mon,Tues
Thurs

I want to score each row by counting how many words from a given set it contains. Say the set is: Mon, Tues, Wed

How can I create a column with the corresponding scores? The result should be:

Scores
3
3
3
2
0

Thanks in advance!

Best answer

Here is a base R solution:

words <- c("Mon", "Tues", "Wed")
# Split each row on commas, then count the tokens that appear in `words`
sapply(strsplit(as.character(df$Col1), ","), function(x) sum(x %in% words))
#[1] 3 3 3 2 0

Or store the result in a Scores column:

df$Scores <- sapply(strsplit(as.character(df$Col1), ","), function(x) sum(x %in% words))
df
#                    Col1 Scores
#1 Mon,Tues,Wed,Thurs,Fri      3
#2     Mon,Tues,Wed,Thurs      3
#3           Mon,Tues,Wed      3
#4               Mon,Tues      2
#5                  Thurs      0

Or use transform together with purrr::map_int:

library(purrr)
transform(df, Scores = map_int(Col1, function(x)
  sum(unlist(strsplit(as.character(x), ",")) %in% words)))
#                    Col1 Scores
#1 Mon,Tues,Wed,Thurs,Fri      3
#2     Mon,Tues,Wed,Thurs      3
#3           Mon,Tues,Wed      3
#4               Mon,Tues      2
#5                  Thurs      0
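
As a further variation, the same counting can be wrapped in a small reusable function. This is only a sketch and not part of the original answer: the helper name count_words is hypothetical, vapply is swapped in for sapply so the result is always an integer vector (even for empty input), and trimws is added on the assumption that entries might contain spaces after the commas.

```r
# Sample data matching the question; stringsAsFactors = FALSE keeps Col1 as character
df <- data.frame(
  Col1 = c("Mon,Tues,Wed,Thurs,Fri", "Mon,Tues,Wed,Thurs",
           "Mon,Tues,Wed", "Mon,Tues", "Thurs"),
  stringsAsFactors = FALSE
)
words <- c("Mon", "Tues", "Wed")

# Hypothetical helper: for each string in x, count the comma-separated
# tokens that appear in `words`
count_words <- function(x, words) {
  # fixed = TRUE treats "," as a literal separator, not a regular expression
  tokens <- strsplit(as.character(x), ",", fixed = TRUE)
  # vapply enforces exactly one integer per row; trimws drops stray whitespace
  vapply(tokens, function(tok) sum(trimws(tok) %in% words), integer(1))
}

df$Scores <- count_words(df$Col1, words)
df$Scores
#[1] 3 3 3 2 0
```

Note that a word repeated within a row (for example "Mon,Mon") would be counted twice by this approach, just as in the sapply version above.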

Sample data

df <- read.table(text =
"Col1
Mon,Tues,Wed,Thurs,Fri
Mon,Tues,Wed,Thurs
Mon,Tues,Wed
Mon,Tues
Thurs", header = TRUE)

Regarding "r - Count occurrences of a set of words in R", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/49848345/
