java - Regex: match pattern `foo`, but not if it occurs after pattern `bar`

Reposted. Author: 太空宇宙. Updated: 2023-11-04 10:35:32

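Match pattern `foo`, but not if it occurs after pattern `bar`. Basically, given a string, I am "trying" to match an opening tag `<any string>`, and the match should not happen if it comes after a closing tag `</any string>`.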

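Note: I am "trying" an approach along these lines, which may not be the actual path to a solution. I would be very glad if you could help solve the underlying problem.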

So it should match:
`<h1>` in `<h1>`
`<h1>` in `<h1> abc </h1>`
`<abc>` in `<abc>something</cde><efg>`
`<abc>` in `something<abc>something`

It should not match anything in the following:
</h1>
</abc> one two three <abc> five six <abc>
one two three </abc> five six <abc>

Best answer

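The simplest solution is to outsource part of the work to the Java regex API. With the regex itself we match only `<[^>]*>`, i.e. any HTML tag. Then we can use `Matcher.region()` to restrict matching to the part of the string before the first `</`.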

Here is the code:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // the wrapper class name is arbitrary; it only makes the snippet compilable
    public class TagMatchExample {
        public static void main(String[] args) {
            // example data
            String[] inputLines = {
                "<h1>",
                "<h1> abc </h1>",
                "<abc>something</cde><efg>",
                "something<abc>something",
                "",
                "</h1>",
                "</abc> one two three <abc> five six <abc>",
                "one two three </abc> five six <abc>"
            };

            // the pattern for any html tag
            Pattern pattern = Pattern.compile("<[^>]*>");

            for (String line : inputLines) {
                Matcher matcher = pattern.matcher(line);
                // the index at or after which we must not search
                int undesiredStart = line.indexOf("</");

                // if no "</" was found, the region must extend to the end of the string
                int regionEnd = undesiredStart == -1 ? line.length() : undesiredStart;
                matcher.region(0, regionEnd);

                // this is the idiom to iterate through the matches
                while (matcher.find()) {
                    System.out.println(matcher.group());
                }
            }
        }
    }
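
For reuse, the same region-based idea could be wrapped in a small method that returns the matches instead of printing them. The sketch below is only an illustration: the class name `OpeningTagFinder` and the method name `tagsBeforeFirstClose` are made up for this example and do not come from the original answer. It assumes, like the answer, that everything from the first `</` onward should be ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OpeningTagFinder {
    // Same pattern as in the answer: any tag-like token.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    /** Hypothetical helper: collects the tags that occur before the first "</". */
    public static List<String> tagsBeforeFirstClose(String line) {
        int firstClose = line.indexOf("</");
        // No closing tag found: search the whole string.
        int regionEnd = firstClose == -1 ? line.length() : firstClose;

        Matcher matcher = TAG.matcher(line);
        // Limit matching to [0, regionEnd); find() will not match past it.
        matcher.region(0, regionEnd);

        List<String> tags = new ArrayList<>();
        while (matcher.find()) {
            tags.add(matcher.group());
        }
        return tags;
    }

    public static void main(String[] args) {
        System.out.println(tagsBeforeFirstClose("<abc>something</cde><efg>"));           // prints [<abc>]
        System.out.println(tagsBeforeFirstClose("one two three </abc> five six <abc>")); // prints []
    }
}
```

Returning a list keeps the matching logic separate from the output, so the caller decides whether an empty result means "no match" for that line.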

Regarding "java - Regex: match pattern `foo`, but not if it occurs after pattern `bar`", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/49537019/
