
r - Extract word strings from string vector data

Reposted. Author: 行者123. Updated: 2023-12-02 07:57:26

I have a character vector of strings, shown below:

Data
Posted by Mohit Garg on May 7, 2016
Posted by Dr. Lokesh Garg on April 8, 2018
Posted by Lokesh.G.S on June 11, 2001
Posted by Mohit.G.S. on July 23, 2005
Posted by Dr.Mohit G Kumar Saha on August 2, 2019

I have used the str_extract() function like this:
str_extract(Data, "Posted by \\w+. \\w+ \\w+")

It produces this output:
[1] "Posted by Mohit Garg on"   "Posted by Dr. Lokesh Garg" NA                         
[4] NA NA

I would like the output to look like this:
[1] "Posted by Mohit Garg on"   "Posted by Dr. Lokesh Garg"  "Posted by Lokesh.G.S"                       
[4] "Posted by Mohit.G.S." "Posted by Dr.Mohit G Kumar Saha"

Best answer

You could try:

stringr::str_extract(df$Data, "Posted by .+?(?=\\s+on)")

#[1] "Posted by Mohit Garg" "Posted by Dr. Lokesh Garg" "Posted by Lokesh.G.S"
#[4] "Posted by Mohit.G.S." "Posted by Dr.Mohit G Kumar Saha"

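This extracts everything from "Posted by" up to, but not including, "on". The lazy quantifier .+? stops at the first position where the lookahead (?=\\s+on) finds whitespace followed by "on", and because a lookahead consumes no characters, " on" itself is left out of the match.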
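As a variation on the same idea, a lookbehind can drop the "Posted by " prefix as well, so that only the name is returned. This is a minimal sketch, not part of the original answer; the vector name posts and the result name names_only are hypothetical, and it assumes the stringr package is installed.

```r
# Hypothetical input vector holding the same five strings as the question.
posts <- c(
  "Posted by Mohit Garg on May 7, 2016",
  "Posted by Dr. Lokesh Garg on April 8, 2018",
  "Posted by Lokesh.G.S on June 11, 2001",
  "Posted by Mohit.G.S. on July 23, 2005",
  "Posted by Dr.Mohit G Kumar Saha on August 2, 2019"
)

# (?<=Posted by )  lookbehind: the match must be preceded by "Posted by "
# .+?              lazy: take as few characters as possible
# (?=\\s+on)       lookahead: stop just before whitespace followed by "on"
names_only <- stringr::str_extract(posts, "(?<=Posted by ).+?(?=\\s+on)")

print(names_only)
```

Because both assertions are zero-width, neither "Posted by " nor " on" ends up in the result. Strings without a match come back as NA.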

The same in base R:
sub(".*(Posted by .+?)(?=\\s+on).*", '\\1', df$Data, perl = TRUE) 
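One difference worth keeping in mind: sub() returns a string unchanged when the pattern does not match, whereas str_extract() returns NA. The sketch below, which is not part of the original answer, shows one way to get NA for non-matches in base R with regexpr() and regmatches(). The names x, via_sub and via_regmatches are hypothetical, and the second input string is made up to show the non-matching case.

```r
# Hypothetical input: the second element has no "Posted by ... on" byline.
x <- c("Posted by Mohit Garg on May 7, 2016", "no byline here")

# sub() leaves a non-matching string untouched.
via_sub <- sub(".*(Posted by .+?)(?=\\s+on).*", "\\1", x, perl = TRUE)

# regexpr() returns -1 for elements without a match, and regmatches()
# returns only the matched substrings, so fill a vector of NAs by position.
m <- regexpr("Posted by .+?(?=\\s+on)", x, perl = TRUE)
via_regmatches <- rep(NA_character_, length(x))
via_regmatches[m != -1] <- regmatches(x, m)

print(via_sub)
print(via_regmatches)
```

Here via_sub keeps "no byline here" as its second element, while via_regmatches has NA in that position, which matches what str_extract() would give.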

Data
df <- structure(list(Data = c("Posted by Mohit Garg on May 7, 2016", 
"Posted by Dr. Lokesh Garg on April 8, 2018", "Posted by Lokesh.G.S on June 11, 2001",
"Posted by Mohit.G.S. on July 23, 2005", "Posted by Dr.Mohit G Kumar Saha on August 2, 2019"
)), class = "data.frame", row.names = c(NA, -5L))

This question, r - Extract word strings from string vector data, comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/62015531/
