
r - Backreference numbering for groups that contain subgroups

Reposted. Author: 行者123. Updated: 2023-12-02 11:15:29

I have the word "fan" which I want to replace with "fanatic" whenever it is preceded by a pronoun-verb combination, as shown below.

gsub(
"(((s?he( i|')s)|((you|they|we)( a|')re)|(I( a|')m)).{1,20})(\\b[Ff]an)(s?\\b)",
'\\1\\2atic\\3',
'He\'s the bigest fan I know.',
perl = TRUE, ignore.case = TRUE
)

## [1] "He's the bigest He'saticHe's I know."

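I know the numbered backreferences are referring to the inner parentheses of the first group. Is there a way to make them refer only to the three outer pairs of parentheses, where the three groups are, in pseudocode: (stuff before fan)(fan)(s\\b)?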
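To see why the replacement comes out this way, it may help to print what the first three backreferences actually hold. The sketch below reuses the pattern and input from above; the bracketed labels and the variable names are only for illustration. Groups are numbered by the position of their opening parenthesis, so by that count (fan) is group 10 and (s?\\b) is group 11, and the R documentation for gsub only describes backreferences \\1 through \\9 in the replacement.

```r
# Pattern and input copied from the question above.
pattern <- "(((s?he( i|')s)|((you|they|we)( a|')re)|(I( a|')m)).{1,20})(\\b[Ff]an)(s?\\b)"
x <- "He's the bigest fan I know."

# Replace the whole match with labelled copies of groups 1 to 3.
probe <- sub(pattern, "[1=\\1][2=\\2][3=\\3]", x,
             perl = TRUE, ignore.case = TRUE)
print(probe)
## [1] "[1=He's the bigest ][2=He's][3=He's] I know."
```

Groups 2 and 3 are both nested inside group 1, which is why "He's" shows up twice in the output of the first gsub call.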

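I know my regex matches the whole span I want, because replacing the entire match with an empty string works. It is only the backreference part that is wrong.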

gsub(
"(((s?he( i|')s)|((you|they|we)( a|')re)|(I( a|')m)).{1,20})(\\b[Ff]an)(s?\\b)",
'',
'He\'s the bigest fan I know.',
perl = TRUE, ignore.case = TRUE
)

## [1] " I know."

Desired output:

## [1] "He's the bigest fanatic I know."

Example inputs and desired outputs

inputs <- c(
"He's the bigest fan I know.",
"I am a huge fan of his.",
"I know she has lots of fans in his club",
"I was cold and turned on the fan",
"An air conditioner is better than 2 fans at cooling."
)


outputs <- c(
"He's the bigest fanatic I know.",
"I am a huge fanatic of his.",
"I know she has lots of fanatics in his club",
"I was cold and turned on the fan",
"An air conditioner is better than 2 fans at cooling."
)

Best answer

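As I understand it, your trouble comes from having too many capturing groups. Turn the ones you are not interested in into non-capturing groups, (?:...), or remove those that are clearly redundant: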

((?:s?he(?: i|')s|(?:you|they|we)(?: a|')re|I(?: a|')m).{1,20})\b(Fan)(s?)\b
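As a minimal illustration of how (?:...) changes the numbering, here is a toy pattern that is not part of the original question. The strings and variable names are made up for the example.

```r
# Capturing alternation: (a|b) is group 1, so \\1 is "a".
with_capture <- sub("(a|b)(c)", "[\\1]", "ac", perl = TRUE)
print(with_capture)
## [1] "[a]"

# Non-capturing alternation: (?:a|b) gets no number, so (c) becomes group 1.
without_capture <- sub("(?:a|b)(c)", "[\\1]", "ac", perl = TRUE)
print(without_capture)
## [1] "[c]"
```

The same idea applied to the pattern above leaves exactly three numbered groups: the text before "fan", "fan" itself, and the optional "s".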

See the regex demo.

Note that [Ff] can be reduced to a plain F or f, since you are passing the ignore.case=TRUE argument.
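A quick check of that point, assuming the simplified pattern with a lowercase (fan): because \\2 holds the text that was actually matched, the original capitalization should survive the replacement. The two input sentences are invented for this sketch.

```r
pattern <- "((?:s?he(?: i|')s|(?:you|they|we)(?: a|')re|I(?: a|')m).{1,20})\\b(fan)(s?)\\b"
x <- c("He's a big Fan of it.", "he's a big fan of it.")

# ignore.case = TRUE lets (fan) match "Fan" as well as "fan".
res <- gsub(pattern, "\\1\\2atic\\3", x, perl = TRUE, ignore.case = TRUE)
print(res)
## [1] "He's a big Fanatic of it." "he's a big fanatic of it."
```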

R demo:

gsub(
"((?:s?he(?: i|')s|(?:you|they|we)(?: a|')re|I(?: a|')m).{1,20})\\b(fan)(s?)\\b",
'\\1\\2atic\\3',
inputs,
perl = TRUE, ignore.case = TRUE
)

Output:

[1] "He's the bigest fanatic I know."                     
[2] "I am a huge fanatic of his."
[3] "I know she has lots of fans in his club"
[4] "I was cold and turned on the fan"
[5] "An air conditioner is better than 2 fans at cooling."
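For a quick comparison against the desired outputs listed in the question, a sketch along these lines could be used. It redefines inputs and outputs so that it runs on its own; the names pattern and result are just for this example.

```r
inputs <- c(
  "He's the bigest fan I know.",
  "I am a huge fan of his.",
  "I know she has lots of fans in his club",
  "I was cold and turned on the fan",
  "An air conditioner is better than 2 fans at cooling."
)
outputs <- c(
  "He's the bigest fanatic I know.",
  "I am a huge fanatic of his.",
  "I know she has lots of fanatics in his club",
  "I was cold and turned on the fan",
  "An air conditioner is better than 2 fans at cooling."
)

pattern <- "((?:s?he(?: i|')s|(?:you|they|we)(?: a|')re|I(?: a|')m).{1,20})\\b(fan)(s?)\\b"
result <- gsub(pattern, "\\1\\2atic\\3", inputs, perl = TRUE, ignore.case = TRUE)

# Element-wise comparison with the desired outputs.
print(result == outputs)
## [1]  TRUE  TRUE FALSE  TRUE  TRUE
```

The one mismatch is the third sentence: "she has" is not one of the pronoun-verb combinations the pattern lists (only forms of "is", "are" and "am"), so that sentence is left unchanged. Covering it would presumably mean adding another alternative to the first group.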

Regarding "r - Backreference numbering for groups that contain subgroups", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/52488329/
