Regex to replace a specific character only outside parentheses

Reposted. Author: 行者123. Last updated: 2023-12-04 11:18:32

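I am looking for a regex (preferably in R) that replaces every occurrence of a specific character, say ; with ;; but only where it is not inside parentheses () within a text string.
Notes: 1. The character may also occur several times inside the parentheses.
2. There are no nested parentheses in the data/vector.
Examples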

  • text;othertext becomes text;;othertext
  • but text;other(texttt;some;someother);more becomes text;;other(texttt;some;someother);;more (that is, ; is replaced only outside the ())

I will try to explain further if clarification is needed.

    in_vec <- c("abcd;ghi;dfsF(adffg;adfsasdf);dfg;(asd;fdsg);ag", "zvc;dfasdf;asdga;asd(asd;hsfd)", "adsg;(asdg;ASF;DFG;ASDF;);sdafdf", "asagf;(fafgf;sadg;sdag;a;gddfg;fd)gsfg;sdfa")

    in_vec
    #> [1] "abcd;ghi;dfsF(adffg;adfsasdf);dfg;(asd;fdsg);ag"
    #> [2] "zvc;dfasdf;asdga;asd(asd;hsfd)"
    #> [3] "adsg;(asdg;ASF;DFG;ASDF;);sdafdf"
    #> [4] "asagf;(fafgf;sadg;sdag;a;gddfg;fd)gsfg;sdfa"

Expected output (worked out by hand):

    [1] "abcd;;ghi;;dfsF(adffg;adfsasdf);;dfg;;(asd;fdsg);;ag" 
    [2] "zvc;;dfasdf;;asdga;;asd(asd;hsfd)"
    [3] "adsg;;(asdg;ASF;DFG;ASDF;);;sdafdf"
    [4] "asagf;;(fafgf;sadg;sdag;a;gddfg;fd)gsfg;;sdfa"

Best answer

You can use gsub with the pattern ;(?![^(]*\\)) :

    gsub(";(?![^(]*\\))", ";;", in_vec, perl=TRUE)
    #[1] "abcd;;ghi;;dfsF(adffg;adfsasdf);;dfg;;(asd;fdsg);;ag"
    #[2] "zvc;;dfasdf;;asdga;;asd(asd;hsfd)"
    #[3] "adsg;;(asdg;ASF;DFG;ASDF;);;sdafdf"
    #[4] "asagf;;(fafgf;sadg;sdag;a;gddfg;fd)gsfg;;sdfa"
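
Here ; matches a literal ; , (?!...) is a negative lookahead (the ; is replaced only if the lookahead does not match), [^(] matches any character except ( , * repeats the preceding item 0 to n times, and \\) means "followed by a literal )". In other words, a ; is left alone whenever a ) can be reached from it without passing a ( first.
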
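To see which ; characters the negative lookahead actually selects, it can help to list the match positions before replacing anything. The following is only a small sketch on the second example string from the question; the variable names `s`, `m` and `hits` are made up for illustration.

```r
# Example string from the question: only the ; at the start and end are outside ()
s <- "text;other(texttt;some;someother);more"

# Locate every ; that is NOT followed by "non-( characters, then )"
m <- gregexpr(";(?![^(]*\\))", s, perl = TRUE)[[1]]

# Drop the match attributes and keep the plain character positions
hits <- as.integer(m)
print(hits)
# 5 34
```

Positions 18 and 23 (the two ; inside the parentheses) are skipped because a ) can be reached from them without crossing a ( .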
Or:

    gsub(";(?=[^)]*($|\\())", ";;", in_vec, perl=TRUE)
    #[1] "abcd;;ghi;;dfsF(adffg;adfsasdf);;dfg;;(asd;fdsg);;ag"
    #[2] "zvc;;dfasdf;;asdga;;asd(asd;hsfd)"
    #[3] "adsg;;(asdg;ASF;DFG;ASDF;);;sdafdf"
    #[4] "asagf;;(fafgf;sadg;sdag;a;gddfg;fd)gsfg;;sdfa"
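
Here ; matches a literal ; , (?=...) is a positive lookahead (the ; is replaced only if the lookahead matches), [^)] matches any character except ) , * repeats the preceding item 0 to n times, and ($|\\() matches either the end of the string $ or a literal ( . In other words, a ; is replaced only when the end of the string or a ( is reached before any ) .
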
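The two lookahead patterns state the same condition from opposite sides, so on input without nested parentheses they should give the same result. A quick check on the two example strings from the question (the names `x`, `neg` and `pos` are illustrative only):

```r
# The two small examples from the question
x <- c("text;othertext", "text;other(texttt;some;someother);more")

# Negative lookahead: no ) reachable without passing a (
neg <- gsub(";(?![^(]*\\))", ";;", x, perl = TRUE)

# Positive lookahead: end of string or ( comes before any )
pos <- gsub(";(?=[^)]*($|\\())", ";;", x, perl = TRUE)

print(neg)
print(identical(neg, pos))
```

Both calls need perl = TRUE, because lookaheads are not supported by R's default regex engine.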
Or use gregexpr and regmatches to extract the parts between ( and ) and do the replacement only in the non-matching substrings:

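    x <- gregexpr("\\(.*?\\)", in_vec)  # Find the parts between ( and )
    mapply(function(a, b) {
      # a: the parenthesised parts, b: the pieces outside them (one more than a)
      # Double the ; in b only, then interleave b and a column by column
      paste(matrix(c(gsub(";", ";;", b), a, ""), 2, byrow=TRUE), collapse = "")
    }, regmatches(in_vec, x), regmatches(in_vec, x, TRUE))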
    #[1] "abcd;;ghi;;dfsF(adffg;adfsasdf);;dfg;;(asd;fdsg);;ag"
    #[2] "zvc;;dfasdf;;asdga;;asd(asd;hsfd)"
    #[3] "adsg;;(asdg;ASF;DFG;ASDF;);;sdafdf"
    #[4] "asagf;;(fafgf;sadg;sdag;a;gddfg;fd)gsfg;;sdfa"
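
The interleaving step in the gregexpr/regmatches approach is compact, so here is a sketch that spells it out for a single short string. The string and all variable names (`s`, `inside`, `outside`, `pieces`, `result`) are made up for illustration; rbind is used here in place of matrix(..., byrow=TRUE) and should produce the same layout.

```r
s <- "a;(b;c);d"

# Locate the parenthesised parts (non-greedy, so each (...) is matched separately)
m <- gregexpr("\\(.*?\\)", s)

inside  <- regmatches(s, m)[[1]]                 # parts inside ( )
outside <- regmatches(s, m, invert = TRUE)[[1]]  # pieces between them, one more than inside

# Double the ; only in the outside pieces
outside_fixed <- gsub(";", ";;", outside, fixed = TRUE)

# Row 1: outside pieces, row 2: inside parts padded with "" to the same length.
# Reading the matrix column by column restores the original order.
pieces <- rbind(outside_fixed, c(inside, ""))
result <- paste(pieces, collapse = "")
print(result)
# "a;;(b;c);;d"
```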

But all of these only work for simple, non-nested pairs of an opening ( and a closing ) .
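
To illustrate the limitation: with nested parentheses, the lookahead pattern can no longer tell whether a ; sits inside an outer pair. If nesting ever does occur, one possible alternative (not part of the original answer) is to skip regex and track the nesting depth character by character. The helper name `double_outside_parens` below is hypothetical, and the sketch assumes the parentheses are balanced.

```r
# Hypothetical helper: replace `ch` with `replacement` only at nesting depth 0.
# Assumes balanced parentheses; `ch` must be a single character other than ( or ).
double_outside_parens <- function(x, ch = ";", replacement = ";;") {
  vapply(x, function(s) {
    chars <- strsplit(s, "", fixed = TRUE)[[1]]
    # Depth at each position: number of ( seen so far minus number of ) seen so far
    depth <- cumsum(chars == "(") - cumsum(chars == ")")
    chars[chars == ch & depth == 0] <- replacement
    paste(chars, collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

nested <- "a;(b;(c;d);e);f"

# Depth-based version: only the first and last ; are outside all parentheses
by_depth <- double_outside_parens(nested)
print(by_depth)
# "a;;(b;(c;d);e);;f"

# Lookahead version from the answer: also touches a ; inside the outer pair
by_regex <- gsub(";(?![^(]*\\))", ";;", nested, perl = TRUE)
print(identical(by_depth, by_regex))
# FALSE
```

For the non-nested data in the question, the depth-based sketch should give the same result as the regex solutions.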

This content is based on a similar question found on Stack Overflow, about a regex that replaces a specific character only outside parentheses: https://stackoverflow.com/questions/67886333/
