
javascript - Capture opening but not closing tags

Reposted. Author: 行者123. Updated: 2023-12-04 10:34:39

I want to split a parent block into segments, capturing any nested tag along with each segment's text:

(?<tag>.)(?: href="(?<url>.+?)")?>(?<text>.+?)<

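It works, but I want "tag" to be null for segments that are not wrapped in a tag. With the current regex, those segments capture the closing tag of the previous segment instead... :(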

Live sample: https://regex101.com/r/UEZAaw/3/
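To make the problem concrete, here is a minimal sketch that runs the pattern above against a short sample. The input string and the variable names `originalPattern`, `sampleHtml`, and `originalTags` are assumptions made for illustration (the full HTML lives in the regex101 link), and the `g` flag is added only so that `matchAll` can iterate over every match.

```javascript
// Pattern from the question, with the g flag added so matchAll can iterate.
const originalPattern = /(?<tag>.)(?: href="(?<url>.+?)")?>(?<text>.+?)</g;

// Hypothetical, shortened sample input (not the asker's full HTML).
const sampleHtml =
  '<p>The <a href="https://www.legislation.gov.uk/ukpga/2010/23/contents">UK Bribery Act</a>' +
  ' received Royal Assent. <b>rest is history</b></p>';

// Collect the "tag" group of every match.
const originalTags = [...sampleHtml.matchAll(originalPattern)].map((m) => m.groups.tag);

console.log(originalTags); // [ 'p', 'a', 'a', 'b' ]
```

The third entry is the unwanted one: the unwrapped segment " received Royal Assent. " reports tag "a" because `(?<tag>.)` matches the "a" of the preceding `</a>`.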

This is the result set I want to obtain. Note that items 2 and 4 should have null for tag:
{
  "0": {
    match: "p>The <",
    tag: "p",
    url: null,
    text: "The "
  },
  "1": {
    match: "a href=\"https://www.legislation.gov.uk/ukpga/2010/23/contents\">UK Bribery Act<",
    tag: "a",
    url: "https://www.legislation.gov.uk/ukpga/2010/23/contents",
    text: "UK Bribery Act"
  },
  "2": {
    match: "/a> (“the Act”) received Royal Assent in April 2010 and came into ... <",
    tag: null,
    url: null,
    text: " (“the Act”) received Royal Assent in April 2010 and came into ... "
  },
  "3": {
    match: "a href=\"http://www.oecd.org/daf/anti-bribery/ConvCombatBribery_ENG.pdf\">OECD anti-bribery Convention<",
    tag: "a",
    url: "http://www.oecd.org/daf/anti-bribery/ConvCombatBribery_ENG.pdf",
    text: "OECD anti-bribery Convention"
  },
  "4": {
    match: "/a>. The Act outlined four prime offences, including the introduction ... <",
    tag: null,
    url: null,
    text: ". The Act outlined four prime offences, including the introduction ... "
  },
  "5": {
    match: "b>rest is history<",
    tag: "b",
    url: null,
    text: "rest is history"
  }
  ...
}

I have spent several hours on this and still haven't figured it out. Any suggestions would be much appreciated.

Best answer

I think this works, based on what I see in the MATCH INFORMATION box on regex101:

/(?:(?<tag>(?<!\/).)|(?:\/.))(?: href="(?<url>.+?)")?>(?<text>.+?)</gm
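As a rough check outside regex101, the sketch below applies the pattern above with `String.prototype.matchAll`. The helper name `splitSegments`, the variable names, and the shortened input string are assumptions made for illustration. JavaScript reports unmatched named groups as `undefined`, so the helper maps them to `null` to match the shape requested in the question.

```javascript
// Pattern from the answer: the "tag" group only matches a character that is
// not preceded by "/", and a separate non-capturing branch consumes "/x".
const segmentPattern =
  /(?:(?<tag>(?<!\/).)|(?:\/.))(?: href="(?<url>.+?)")?>(?<text>.+?)</gm;

// Hypothetical helper: turns every match into a plain object, mapping
// unmatched groups (undefined in JavaScript) to null.
function splitSegments(html) {
  return [...html.matchAll(segmentPattern)].map((m) => ({
    match: m[0],
    tag: m.groups.tag ?? null,
    url: m.groups.url ?? null,
    text: m.groups.text,
  }));
}

// Hypothetical, shortened sample input (not the asker's full HTML).
const segments = splitSegments(
  '<p>The <a href="https://www.legislation.gov.uk/ukpga/2010/23/contents">UK Bribery Act</a>' +
    ' received Royal Assent. <b>rest is history</b></p>'
);

console.log(segments.map((s) => s.tag)); // [ 'p', 'a', null, 'b' ]
```

At a closing tag such as `/a>`, the first branch cannot succeed (after "/" the next character is not ">"), so the `\/.` branch consumes "/a" and leaves `tag` unset. The lookbehind keeps the engine from retrying one character later and capturing "a" as a tag. Because the pattern uses `.` for the tag name, it only handles one-letter tags such as p, a, and b.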

For more on "javascript - Capture opening but not closing tags", see the corresponding question on Stack Overflow: https://stackoverflow.com/questions/60240332/
