
r - String pattern manipulation in R

Reposted. Author: 行者123. Updated: 2023-12-04 12:08:05

I am trying to extract the host and visitor team names from a set of text strings in R.

Sample text:

dat = data.frame(Series = c('England in Australia ODI Match',
'Prudential Trophy (Australia in England)',
'Pakistan in New Zealand ODI Match',
'Prudential Trophy (New Zealand in England)',
'Prudential Trophy (West Indies in England)',
'Australia in New Zealand ODI Series',
'Texaco Trophy (Australia in England)'))

I want to create two new columns. The desired output looks like this:

Visitor      Host
England      Australia
Australia    England
Pakistan     New Zealand
New Zealand  England
West Indies  England
Australia    New Zealand
Australia    England

I tried the following, but it is incomplete:

dat$Host = sub(" in.*", "", dat$Series)
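As a quick check of why this attempt falls short, here is a minimal sketch run on two of the sample strings. The names `x` and `before_in` are introduced here only for illustration. The call deletes everything from the first " in" onward, so what remains is the text before " in": the visitor rather than the host, and with the trophy prefix still attached for the parenthesised form.

```r
# Two of the sample strings, one per naming style.
x <- c("England in Australia ODI Match",
       "Prudential Trophy (Australia in England)")

# Delete everything from the first " in" to the end of the string.
before_in <- sub(" in.*", "", x)
print(before_in)
```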

Best answer

Here is something that does what you want:

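# Locate the first "<visitor> in <host>" substring in each series name;
# the optional "New " / "West " prefix covers the two-word team names.
re = regexpr("((New |West )?\\w+) in ((New |West )?\\w+)", dat$Series)
# Extract the matched text (named `m` so that base::rm() is not masked).
m = regmatches(dat$Series, re)
# Split each match on " in " and stack the pieces into a two-column matrix.
d = do.call(rbind, strsplit(m, " in "))
colnames(d) = c("Visitor", "Host")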

Output:

     Visitor       Host
[1,] "England"     "Australia"
[2,] "Australia"   "England"
[3,] "Pakistan"    "New Zealand"
[4,] "New Zealand" "England"
[5,] "West Indies" "England"
[6,] "Australia"   "New Zealand"
[7,] "Australia"   "England"
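
The question asks for two new columns on `dat` rather than a separate matrix. One possible way to get there, sketched below, is to use `regexec()` so that the capture groups come back directly and any row without a match becomes `NA` instead of being dropped. This is only an illustration under stated assumptions: the helper `pick` is a hypothetical name introduced here, and `stringsAsFactors = FALSE` is set so that `Series` is a character column on older R versions as well.

```r
# Sample data from the question.
dat <- data.frame(Series = c('England in Australia ODI Match',
                             'Prudential Trophy (Australia in England)',
                             'Pakistan in New Zealand ODI Match',
                             'Prudential Trophy (New Zealand in England)',
                             'Prudential Trophy (West Indies in England)',
                             'Australia in New Zealand ODI Series',
                             'Texaco Trophy (Australia in England)'),
                  stringsAsFactors = FALSE)

# Same pattern as in the answer; groups 1 and 3 hold the two team names.
pattern <- "((New |West )?\\w+) in ((New |West )?\\w+)"

# regexec() records the full match and every capture group;
# regmatches() turns that into one character vector per input string
# (full match followed by the groups, or length 0 when nothing matched).
parts <- regmatches(dat$Series, regexec(pattern, dat$Series))

# Hypothetical helper: element i of a match, or NA if there was no match.
pick <- function(p, i) if (length(p) >= i) p[i] else NA_character_

# Element 1 is the full match, so group 1 is element 2 and group 3 is element 4.
dat$Visitor <- vapply(parts, pick, character(1), i = 2)
dat$Host    <- vapply(parts, pick, character(1), i = 4)

print(dat[, c("Visitor", "Host")])
```

Like the pattern in the answer, this only recognises two-word names that start with "New " or "West "; other multi-word team names would need the alternation extended.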

Regarding "r - String pattern manipulation in R", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/32813284/
