r - Apply grepl across columns by storing the columns in a list?

Reposted. Author: 行者123. Updated: 2023-12-04 10:46:35

I have a data frame in the format shown below:

  String                 Keyword
1 Apples bananas mangoes mangoes
2 Apples bananas mangoes bananas
3 Apples bananas mangoes peach
.....

It is a large data frame (more than 50,000 rows). I am currently writing ifelse statements by hand, in batches.

data$Result <- ifelse(grepl("apples", data$String, ignore.case = TRUE), "apples",
               ifelse(grepl("bananas", data$String, ignore.case = TRUE), "bananas",
               ifelse(grepl("mangoes", data$String, ignore.case = TRUE), "mangoes", "unavailable")))


String                 Keyword Result
Apples bananas mangoes mangoes mangoes
Apples bananas mangoes bananas bananas
Apples bananas mangoes peach   unavailable

Is there a way to store the strings and keywords in a list and then apply grepl over the whole list?

Best answer

Here is a simple and efficient solution that combines the data.table and stringi packages:

library(data.table)
library(stringi)
setDT(df)[stri_detect_fixed(String, Keyword, case_insensitive = TRUE), result := Keyword]
# String Keyword result
# 1: Apples bananas mangoes mangoes mangoes
# 2: Apples bananas mangoes bananas bananas
# 3: Apples bananas mangoes peach NA

Alternatively, a data.table-only version:

library(data.table)
setDT(df)[, result := Keyword[grep(Keyword, String, ignore.case = TRUE)], by = .(Keyword, String)]
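
For readers without either package, the same row-wise idea can be sketched in base R with mapply. This is only an illustration, not the answer's method: the three-row df is made up from the example in the question, hit is a hypothetical helper name, and the "unavailable" fill is an assumption taken from the output the question asks for (the data.table solutions above leave NA instead). Note that grepl treats each keyword as a regular expression here, not as a fixed string.

```r
# Toy data modelled on the question's example (an assumption, not the real 50,000-row data)
df <- data.frame(
  String  = rep("Apples bananas mangoes", 3),
  Keyword = c("mangoes", "bananas", "peach"),
  stringsAsFactors = FALSE
)

# Pair each Keyword with the String on the same row and test for a match.
# mapply calls grepl(pattern = Keyword[i], x = String[i], ignore.case = TRUE) for each row i.
hit <- mapply(grepl, df$Keyword, df$String,
              MoreArgs = list(ignore.case = TRUE),
              USE.NAMES = FALSE)

# Keep the keyword where it was found, otherwise fall back to "unavailable"
df$Result <- ifelse(hit, df$Keyword, "unavailable")

print(df)
```

On large inputs this approach is expected to be much slower than the vectorised stringi call, which is what the benchmark below measures.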

Benchmark

Here is a benchmark against the mapply answer on a data set of 5e5 rows. (The for-loop answer has still not finished running):

set.seed(123)
df1 <- data.frame(String = rep('Apples bananas mangoes', 5e5),
                  Keyword = sample(c("mangoes", "bananas", "peach"), 5e5, replace = TRUE))


system.time(df1$result2 <- ifelse(mapply(grepl,df1$Keyword, df1$String, ignore.case = TRUE), as.character(df1$Keyword), "Unavailable"))
# user system elapsed
# 40.78 0.02 41.12
system.time(setDT(df1)[stri_detect_fixed(String, Keyword, case_insensitive = TRUE), result3 := Keyword])
# user system elapsed
# 0.52 0.01 0.53

Regarding "r - Apply grepl across columns by storing the columns in a list?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/31431280/
