
java - How to remove unbalanced/unpaired double quotes (in Java)

Reposted. Author: 搜寻专家. Updated: 2023-11-01 03:11:08

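I thought I would share this rather tricky problem with everyone here. I am trying to remove unbalanced/unpaired double quotes from a string.

My attempt is still in progress and may be close to a solution, but it does not work yet: it fails to remove every unpaired double quote from the string.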

Sample input

string1=injunct! alter ego."
string2=successor "alter ego" single employer" "proceeding "citation assets"

The output should be

string1=injunct! alter ego.
string2=successor "alter ego" single employer proceeding "citation assets"

This problem sounds similar to "Using Java remove unbalanced/unpartnered parenthesis".

Here is my code so far (it does not remove all of the unpaired double quotes):

private String removeUnattachedDoubleQuotes(String stringWithDoubleQuotes) {
    String firstPass = "";

    String openingQuotePattern = "\\\"[a-z0-9\\p{Punct}]";
    String closingQuotePattern = "[a-z0-9\\p{Punct}]\\\"";

    int doubleQuoteLevel = 0;
    for (int i = 0; i < stringWithDoubleQuotes.length() - 3; i++) {
        String c = stringWithDoubleQuotes.substring(i, i + 2);
        if (c.matches(openingQuotePattern)) {
            doubleQuoteLevel++;
            firstPass += c;
        }
        else if (c.matches(closingQuotePattern)) {
            if (doubleQuoteLevel > 0) {
                doubleQuoteLevel--;
                firstPass += c;
            }
        }
        else {
            firstPass += c;
        }
    }

    String secondPass = "";
    doubleQuoteLevel = 0;
    for (int i = firstPass.length() - 1; i >= 0; i--) {
        String c = stringWithDoubleQuotes.substring(i, i + 2);
        if (c.matches(closingQuotePattern)) {
            doubleQuoteLevel++;
            secondPass = c + secondPass;
        }
        else if (c.matches(openingQuotePattern)) {
            if (doubleQuoteLevel > 0) {
                doubleQuoteLevel--;
                secondPass = c + secondPass;
            }
        }
        else {
            secondPass = c + secondPass;
        }
    }

    String result = secondPass;

    return result;
}

Best answer

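If there is no nesting, it can probably be done in a single regex.
The regex uses a rough definition of what counts as a delimiter, and those
rules can be "biased" to get a better result.
It all depends on which rules are laid down. This regex tries
three possible cases, in order: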

  1. Valid pair
  2. Invalid pair (with bias)
  3. Invalid single

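It also does not pair quotes across the end of a line, although it does accept
several lines joined into one string. To change that, remove the \n wherever it appears in the regex.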
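To illustrate the line-scoping remark, here is a minimal Java sketch of the answer's regex, built once with `\n` excluded between quotes and once without. The class name `QuoteLineScope` and the helpers `build` and `clean` are hypothetical names introduced for this illustration; they are not part of the original answer.

```java
import java.util.regex.Pattern;

class QuoteLineScope {
    // Characters the answer treats as "attached" to a quote.
    private static final String C = "[a-zA-Z0-9\\p{Punct}]";

    // Builds the answer's three-way alternation; `inner` is the
    // character class allowed between an opening and a closing quote.
    static Pattern build(String inner) {
        return Pattern.compile(
            "(\"" + C + inner + "*(?<=" + C + ")\")"             // valid pair, kept whole
            + "|(?<!" + C + ")\"(" + inner + "*)\"(?!" + C + ")" // invalid pair, content kept
            + "|\"");                                            // invalid single, dropped
    }

    static final Pattern PER_LINE = build("[^\"\\n]");   // as written in the answer
    static final Pattern ACROSS_LINES = build("[^\"]");  // with the \n removed

    static String clean(Pattern p, String s) {
        return p.matcher(s).replaceAll("$1$2");
    }

    public static void main(String[] args) {
        String s = "say \"hello\nworld\" now";
        // Both quotes are on different lines, so each is dropped as a single.
        System.out.println(clean(PER_LINE, s));
        // Without \n in the class, the two quotes form a valid pair and stay.
        System.out.println(clean(ACROSS_LINES, s));
    }
}
```

With the `\n` in place, the two quotes in the sample never pair up and are both removed; without it, they are kept as a valid pair spanning the line break.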


Global context - raw find regex
(shortened)

(?:("[a-zA-Z0-9\p{Punct}][^"\n]*(?<=[a-zA-Z0-9\p{Punct}])")|(?<![a-zA-Z0-9\p{Punct}])"([^"\n]*)"(?![a-zA-Z0-9\p{Punct}])|")

Replacement groups

$1$2 or \1\2

Expanded raw regex:

(?:                                // Grouping
                                   // Try to line up a valid pair
  (                                // Capt grp (1) start
    "                              // "
    [a-zA-Z0-9\p{Punct}]           // 1 of [a-zA-Z0-9\p{Punct}]
    [^"\n]*                        // 0 or more characters other than " and newline
    (?<=[a-zA-Z0-9\p{Punct}])      // 1 of [a-zA-Z0-9\p{Punct}] behind us
    "                              // "
  )                                // End capt grp (1)

  |                                // OR, try to line up an invalid pair
    (?<![a-zA-Z0-9\p{Punct}])      // Bias, not 1 of [a-zA-Z0-9\p{Punct}] behind us
    "                              // "
    ( [^"\n]* )                    // Capt grp (2) - 0 or more characters other than " and newline
    "                              // "
    (?![a-zA-Z0-9\p{Punct}])       // Bias, not 1 of [a-zA-Z0-9\p{Punct}] ahead of us

  |                                // OR, this single " is considered invalid
    "                              // "
)                                  // End Grouping
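The three alternatives can be told apart from Java by checking which capture group took part in each match. The following is only a sketch under that assumption: the class name `QuoteCases` and the method `classify` are hypothetical, and the sample string is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteCases {
    private static final String C = "[a-zA-Z0-9\\p{Punct}]";

    // Same alternation as the expanded regex, written as a Java string.
    private static final Pattern QUOTES = Pattern.compile(
        "(\"" + C + "[^\"\\n]*(?<=" + C + ")\")"
        + "|(?<!" + C + ")\"([^\"\\n]*)\"(?!" + C + ")"
        + "|\"");

    // Labels every match with the alternative that produced it.
    static List<String> classify(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = QUOTES.matcher(s);
        while (m.find()) {
            if (m.group(1) != null) {
                out.add("valid pair: " + m.group(1));     // group 1 set: kept as is
            } else if (m.group(2) != null) {
                out.add("invalid pair: " + m.group());    // group 2 set: quotes stripped
            } else {
                out.add("invalid single at " + m.start()); // no group: quote dropped
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : classify("\"ok\" and \" loose \" then end\"")) {
            System.out.println(line);
        }
        // prints:
        // valid pair: "ok"
        // invalid pair: " loose "
        // invalid single at 27
    }
}
```

Because the replacement is `$1$2`, a valid pair is written back unchanged, an invalid pair keeps only its inner text, and an invalid single is replaced by nothing.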

Perl test case (I don't have Java available)

$str = '
string1=injunct! alter ego."
string2=successor "alter ego" single employer "a" free" proceeding "citation assets"
';

print "\n'$str'\n";

$str =~ s
/
(?:
(
"[a-zA-Z0-9\p{Punct}]
[^"\n]*
(?<=[a-zA-Z0-9\p{Punct}])
"
)
|
(?<![a-zA-Z0-9\p{Punct}])
"
( [^"\n]* )
" (?![a-zA-Z0-9\p{Punct}])
|
"
)
/$1$2/xg;

print "\n'$str'\n";

Output

'
string1=injunct! alter ego."
string2=successor "alter ego" single employer "a" free" proceeding "citation assets"
'

'
string1=injunct! alter ego.
string2=successor "alter ego" single employer "a" free proceeding "citation assets"
'
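Since the answer only tests in Perl, here is a hedged Java port of the same substitution using `java.util.regex`. The class name `UnpairedQuotes` and the method `removeUnpairedQuotes` are hypothetical. One caveat: without the `UNICODE_CHARACTER_CLASS` flag, Java's `\p{Punct}` is the ASCII POSIX punctuation class, so results may differ from Perl's `\p{Punct}` on input with non-ASCII punctuation or ASCII symbols such as `$` and `+`.

```java
import java.util.regex.Pattern;

class UnpairedQuotes {
    private static final String C = "[a-zA-Z0-9\\p{Punct}]";

    // The shortened regex from the answer, escaped for a Java string literal.
    private static final Pattern QUOTES = Pattern.compile(
        "(\"" + C + "[^\"\\n]*(?<=" + C + ")\")"
        + "|(?<!" + C + ")\"([^\"\\n]*)\"(?!" + C + ")"
        + "|\"");

    // Groups that did not take part in a match expand to the empty string,
    // so "$1$2" keeps valid pairs and the content of invalid pairs only.
    static String removeUnpairedQuotes(String s) {
        return QUOTES.matcher(s).replaceAll("$1$2");
    }

    public static void main(String[] args) {
        String str =
            "string1=injunct! alter ego.\"\n"
            + "string2=successor \"alter ego\" single employer \"a\" free\""
            + " proceeding \"citation assets\"\n";
        System.out.print(removeUnpairedQuotes(str));
        // prints:
        // string1=injunct! alter ego.
        // string2=successor "alter ego" single employer "a" free proceeding "citation assets"
    }
}
```

Dropping this method in place of `removeUnattachedDoubleQuotes` from the question also gives the expected output for the question's own two sample strings.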

Regarding "java - How to remove unbalanced/unpaired double quotes (in Java)", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/9929168/
