
r - Matching strings by looping over multiple columns

Reposted. Author: 行者123. Updated: 2023-12-04 10:10:26

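I have data from an open-ended survey. I have a comments table and a codes table. The codes table is a set of themes, each defined by a list of strings.

What I am trying to do:
Check whether a word/string from the relevant column of the codes table appears in the open-ended comments. Add a new column to the comments table for each theme, using a binary 1 or 0 to show which records have been tagged.

The codes table has quite a few columns, and they are live and constantly changing: both the column order and the number of columns may change.

I currently do this in a rather convoluted way, checking each column separately with several lines of code, and I think there may be a better way.

I can't figure out how to get lapply to work with the stringi functions.

Any help is greatly appreciated.

Here is some sample code so you can see what I am trying to do: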

# Two tables: codes and comments
# codes table
codes <- structure(
  list(
    Support = structure(
      c(2L, 3L, NA),
      .Label = c("", "help", "questions"),
      class = "factor"
    ),
    Online = structure(
      c(1L, 3L, 2L),
      .Label = c("activities", "discussion board", "quiz"),
      class = "factor"
    ),
    Resources = structure(
      c(3L, 2L, NA),
      .Label = c("", "pdf", "textbook"),
      class = "factor"
    )
  ),
  row.names = c(NA, -3L),
  class = "data.frame"
)
# comments table
comments <- structure(
  list(
    SurveyID = structure(
      1:5,
      .Label = c("ID_1", "ID_2", "ID_3", "ID_4", "ID_5"),
      class = "factor"
    ),
    Open_comments = structure(
      c(2L, 4L, 3L, 5L, 1L),
      .Label = c(
        "I could never get the pdf to download",
        "I didn’t get the help I needed on time",
        "my questions went unanswered",
        "staying motivated to get through the textbook",
        "there wasn’t enough engagement in the discussion board"
      ),
      class = "factor"
    )
  ),
  class = "data.frame",
  row.names = c(NA, -5L)
)

# Check whether any words from the columns in the codes table match the comments

# Here I am matching one column at a time, but I am looking for a better way - lapply?

library(stringi)

support = paste(codes$Support, collapse = "|")
supp_stringi = stri_detect_regex(comments$Open_comments, support)
supp_grepl = grepl(pattern = support, x = comments$Open_comments)
identical(supp_stringi, supp_grepl)
comments$Support = ifelse(supp_grepl == TRUE, 1, 0)

# What I would like to do is loop through all columns in codes rather than repeating the above code for each column in codes

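Best answer

Here is an approach that uses stringi::stri_detect_regex() with lapply() to create vectors of TRUE = 1, FALSE = 0 according to whether any of the words in the Support, Online, and Resources vectors appear in the comments, and then merges these data with the comments.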

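library(stringi)

# build data structures from OP (the codes and comments tables defined in the question)

# For each column of codes, flag the comments that contain any of its terms
resultsList <- lapply(seq_along(codes), function(x) {
  # na.omit() keeps NA padding from becoming a literal "NA" alternative in the pattern
  pattern <- paste(na.omit(codes[[x]]), collapse = "|")
  y <- stri_detect_regex(comments$Open_comments, pattern)
  ifelse(y == TRUE, 1, 0)
})

# Bind the 0/1 vectors into a data frame named after the code columns
results <- as.data.frame(do.call(cbind, resultsList))
colnames(results) <- colnames(codes)
mergedData <- cbind(comments, results)
mergedData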

...and the result.
> mergedData
  SurveyID                                          Open_comments Support Online Resources
1     ID_1                 I didn’t get the help I needed on time       1      0         0
2     ID_2          staying motivated to get through the textbook       0      0         1
3     ID_3                           my questions went unanswered       1      0         0
4     ID_4 there wasn’t enough engagement in the discussion board       0      1         0
5     ID_5                  I could never get the pdf to download       0      0         1
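
Since the codes table is described as live and changing, the same idea can be wrapped in a small function. What follows is only a sketch and is not part of the original answer. The name flag_themes() is hypothetical. It swaps stri_detect_regex() for base R grepl() with fixed = TRUE, so each term is matched literally rather than as a regular expression, and it assumes that NA or empty cells in codes are padding to be skipped (paste() would otherwise turn an NA into the literal text "NA" inside a pattern). The sample data mirrors the question's tables as plain character columns, with ASCII apostrophes.

```r
# Hypothetical helper (an assumption, not from the original answer):
# returns one 0/1 column per theme for whatever columns `codes` currently has.
flag_themes <- function(comments_text, codes) {
  comments_text <- as.character(comments_text)
  flags <- lapply(codes, function(col) {
    terms <- as.character(col)
    # Drop missing and empty terms so they are never searched for.
    terms <- terms[!is.na(terms) & nzchar(terms)]
    if (length(terms) == 0) {
      return(integer(length(comments_text)))
    }
    # One logical vector per term, combined with OR across terms.
    hits <- lapply(terms, grepl, x = comments_text, fixed = TRUE)
    as.integer(Reduce(`|`, hits))
  })
  data.frame(flags, check.names = FALSE)
}

# Sample data mirroring the question (character columns instead of factors).
codes <- data.frame(
  Support   = c("help", "questions", NA),
  Online    = c("activities", "quiz", "discussion board"),
  Resources = c("textbook", "pdf", NA),
  stringsAsFactors = FALSE
)
comments <- data.frame(
  SurveyID = paste0("ID_", 1:5),
  Open_comments = c(
    "I didn't get the help I needed on time",
    "staying motivated to get through the textbook",
    "my questions went unanswered",
    "there wasn't enough engagement in the discussion board",
    "I could never get the pdf to download"
  ),
  stringsAsFactors = FALSE
)

merged <- cbind(comments, flag_themes(comments$Open_comments, codes))
print(merged)
```

Because the function iterates over whatever columns codes contains, adding, removing, or reordering theme columns needs no other change. Matching here is case-sensitive; if the themes should be treated as regular expressions instead, the grepl() call would need fixed = FALSE.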

Regarding "r - Matching strings by looping over multiple columns", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/61354445/
