java - Regex for XML tags that contain angle brackets inside

Reposted. Author: 行者123. Updated: 2023-11-30 02:58:46

I need a regex that will give me the XML tags in a string, e.g. <ABC/>, <ABC>, </ABC>.

If I use <(.)+?>, it gives me each of those tags, which is fine.

Now, the problem:

I have this XML:

<VALUE ABC="10000" PQR="12422700" ADJ="" PROD_TYPE="COCOG EFI LWL P&amp;C >1Y-5Y" SRC="BASE" DATA="data" ACTION="INSERT" ID="100000" GRC_PROD=""/>

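Here, the attribute value in PROD_TYPE="COCOG EFI LWL P&amp;C >1Y-5Y" contains a greater-than sign.

Because the lazy match stops at the first > it encounters, the regex returns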

<VALUE ABC="10000" PQR="12422700" ADJ="" PROD_TYPE="COCOG EFI LWL P&amp;C >

instead of the complete tag

<VALUE ABC="10000" PQR="12422700" ADJ="" PROD_TYPE="COCOG EFI LWL P&amp;C >1Y-5Y" SRC="BASE" DATA="data" ACTION="INSERT" ID="100000" GRC_PROD=""/>

I need a regex that ignores less-than and greater-than signs that are part of an attribute value, i.e. enclosed in double quotes.
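The truncation described above can be reproduced with a short Java sketch. The class name NaiveTagDemo and the helper findAll are hypothetical names introduced only for illustration, and the input is a shortened version of the XML from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NaiveTagDemo {
    // The lazy pattern from the question: it ends at the first '>' it meets,
    // even when that '>' sits inside a quoted attribute value.
    static final Pattern NAIVE = Pattern.compile("<(.)+?>");

    // Hypothetical helper: collects every match of p found in input.
    static List<String> findAll(Pattern p, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Shortened version of the question's XML (an assumption for brevity).
        String xml = "<VALUE PROD_TYPE=\"P&amp;C >1Y-5Y\" SRC=\"BASE\"/>";
        for (String tag : findAll(NAIVE, xml)) {
            System.out.println(tag); // prints: <VALUE PROD_TYPE="P&amp;C >
        }
    }
}
```

The match ends in the middle of the PROD_TYPE value because nothing in the pattern distinguishes a > inside quotes from the one that closes the tag.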

Best answer

You can try this:

(?i)<[a-z][\w:-]+(?: [a-z][\w:-]+="[^"]*")*/?>

Explanation:

(?i)          # Match the remainder of the regex with the options: case insensitive (i)
<             # Match the character “<” literally
[a-z]         # Match a single character in the range between “a” and “z”
[\w:-]        # Match a single character present in the list below
              #    A word character (letters, digits, and underscores)
              #    The character “:”
              #    The character “-”
   +          #    Between one and unlimited times, as many times as possible, giving back as needed (greedy)
(?:           # Match the regular expression below (non-capturing group)
   \          # Match a space character “ ” literally (written unescaped in the regex above)
   [a-z]      # Match a single character in the range between “a” and “z”
   [\w:-]     # Match a single character present in the list below
              #    A word character (letters, digits, and underscores)
              #    The character “:”
              #    The character “-”
      +       #    Between one and unlimited times, as many times as possible, giving back as needed (greedy)
   ="         # Match the characters “="” literally
   [^"]       # Match any character that is NOT a “"”
      *       #    Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
   "          # Match the character “"” literally
)*            # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
/             # Match the character “/” literally
   ?          #    Between zero and one times, as many times as possible, giving back as needed (greedy)
>             # Match the character “>” literally
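As a minimal sketch of how the pattern explained above behaves in Java, the following applies it to a shortened version of the question's XML. The class name AttributeAwareTagDemo and the helper firstTag are hypothetical names introduced for illustration. The regex is the answer's, with backslashes and quotes escaped for a Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AttributeAwareTagDemo {
    // The answer's pattern: attributes are consumed as name="value" units,
    // so a '>' inside double quotes can never end the tag.
    static final Pattern TAG = Pattern.compile(
        "(?i)<[a-z][\\w:-]+(?: [a-z][\\w:-]+=\"[^\"]*\")*/?>");

    // Hypothetical helper: returns the first tag found, or null if none.
    static String firstTag(String input) {
        Matcher m = TAG.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String xml = "<VALUE ABC=\"10000\" PROD_TYPE=\"COCOG EFI LWL P&amp;C >1Y-5Y\" SRC=\"BASE\"/>";
        System.out.println(firstTag(xml));      // prints the complete tag
        System.out.println(firstTag("<a>"));    // null: names need at least two characters
        System.out.println(firstTag("</ABC>")); // null: closing tags are not covered
    }
}
```

The last two calls show limits that follow directly from the pattern as written: `[a-z][\w:-]+` requires a name of two or more characters, and nothing in it allows a `/` right after `<`. It also expects exactly one space before each attribute and double-quoted values only, so input that departs from that shape would probably need a real XML parser instead.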

If you want to match a complete element, either an open tag with its content and matching close tag or a self-closing tag, try the regex below:

(?i)(?:<([a-z][\w:-]+)(?: [a-z][\w:-]+="[^"]*")*>.+?</\1>|<([a-z][\w:-]+)(?: [a-z][\w:-]+="[^"]*")*/>)

A Java code snippet that does the same:

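import java.util.regex.PatternSyntaxException;

boolean foundMatch = false;
try {
    // String.matches() returns true only if the ENTIRE subjectString matches the pattern
    foundMatch = subjectString.matches("(?i)(?:<([a-z][\\w:-]+)(?: [a-z][\\w:-]+=\"[^\"]*\")*>.+?</\\1>|<([a-z][\\w:-]+)(?: [a-z][\\w:-]+=\"[^\"]*\")*/>)");
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}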
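Because String.matches() tests the whole input, extracting elements from a longer string would more likely use Matcher.find(). The sketch below illustrates the difference. The class name ElementFinderDemo, the helper findElements, and the sample input are assumptions made for illustration and are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ElementFinderDemo {
    // The answer's second pattern: either <name ...>content</name> (close tag
    // tied to the open tag via backreference \1) or a self-closing <name .../>.
    static final Pattern ELEMENT = Pattern.compile(
        "(?i)(?:<([a-z][\\w:-]+)(?: [a-z][\\w:-]+=\"[^\"]*\")*>.+?</\\1>"
        + "|<([a-z][\\w:-]+)(?: [a-z][\\w:-]+=\"[^\"]*\")*/>)");

    // Hypothetical helper: returns every element found anywhere in input.
    static List<String> findElements(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = ELEMENT.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "text <ABC X=\"a>b\">hello</ABC> and <DEF Y=\"1\"/>";
        // Whole-string test: false, since the input has text outside the tags.
        System.out.println(ELEMENT.matcher(input).matches());
        // Scanning test: finds both elements.
        findElements(input).forEach(System.out::println);
    }
}
```

Two caveats apply under these assumptions: `.` does not match line breaks unless Pattern.DOTALL is set, so multi-line element content is skipped, and the lazy `.+?` stops at the first matching close tag, so nested elements with the same name are cut short.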

Hope this helps...

Regarding "java - Regex for XML tags that contain angle brackets inside", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/36476312/
