
java - Not getting the * quantifier right in regular expressions?

Reposted. Author: 行者123. Updated: 2023-12-03 18:32:54

I'm new to regular expressions and am working through the regex quantifier section. I have a question about the * quantifier. Here is how the * quantifier is defined:

  • X* - finds zero or more occurrences of the letter X
  • .* - any sequence of characters

Based on the definition above, I wrote a small program:

public static void testQuantifier() {
    String testStr = "axbx";
    System.out.println(testStr.replaceAll("x*", "M"));
    //my expected output is MMMM but actual output is MaMMbMM
    /*
    Logic behind my expected output is:
    1. it encounters a which means 0 x is found. It should replace a with M.
    2. it encounters x which means 1 x is found. It should replace x with M.
    3. it encounters b which means 0 x is found. It should replace b with M.
    4. it encounters x which means 1 x is found. It should replace x with M.
    so output should be MMMM but why it is MaMMbMM?
    */

    System.out.println(testStr.replaceAll(".*", "M"));
    //my expected output is M but actual output is MM

    /*
    Logic behind my expected output is:
    It encounters axbx, which is any character sequence, it should
    replace complete sequence with M.
    So output should be M but why it is MM?
    */
}

Update:

Based on my revised understanding, I expected the output to be MaMMbM rather than MaMMbMM. So I don't understand why I get an extra M at the end.

My revised understanding of the first regex is:

1. it encounters a which means 0 x is found. It should replace a with Ma.
2. it encounters x which means 1 x is found. It should replace x with M.
3. it encounters b which means 0 x is found. It should replace b with Mb.
4. it encounters x which means 1 x is found. It should replace x with M.
5. Lastly it encounters end of string at index 4. So it replaces 0x at end of String with M.

(Although I also find it odd that the index at the end of the string is considered.)

So the first part is clear now.

It would also be helpful if someone could clarify the second regex.

Best answer

This is where you're going wrong:

first it encounters a which means 0 x is found. So it should replace a with M.

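No - it means that 0 x's were found and then an a was found. You didn't say that a should be replaced by M... you said that any number of x's (including 0) should be replaced by M.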

If you want every character to be replaced by M, you should just use:

System.out.println(testStr.replaceAll(".", "M"));

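(I'd personally have expected a result of MaMbM - I'm looking into why you get MaMMbMM instead - I suspect it's because there's a sequence of 0 x's between the x and the b, but it still seems a little odd to me.)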

EDIT: It becomes clearer if you look at where the pattern matches. Here's code to show that:

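import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern pattern = Pattern.compile("x*");
Matcher matcher = pattern.matcher("axbx");
// Print the start (inclusive) and end (exclusive) index of every match
while (matcher.find()) {
    System.out.println(matcher.start() + "-" + matcher.end());
}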

Results (bearing in mind that the end index is exclusive), with a bit of explanation:

0-0 (index 0 = 'a', doesn't match)
1-2 (index 1 = 'x', matches)
2-2 (index 2 = 'b', doesn't match)
3-4 (index 3 = 'x', matches)
4-4 (index 4 is the end of the string)

If you replace each of those matches with "M", you end up with the output you're actually getting.
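The same approach seems to explain the second regex from the question. The sketch below lists the spans matched by .* on "axbx"; the class name DotStarSpans and the helper spans are made up for illustration and are not part of the original answer. The greedy .* first consumes the whole string (0-4), and then an empty match is still found at the end of the input (4-4), so two replacements happen.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DotStarSpans {
    // Hypothetical helper: returns every match of regex in input
    // as "start-end", where end is exclusive.
    static List<String> spans(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            result.add(matcher.start() + "-" + matcher.end());
        }
        return result;
    }

    public static void main(String[] args) {
        // First match is the whole string, second is the empty string at the end.
        System.out.println(spans(".*", "axbx"));           // [0-4, 4-4]
        // Each of the two matches is replaced with one M.
        System.out.println("axbx".replaceAll(".*", "M"));  // MM
    }
}
```

Each of the two matches is replaced by one M, which is where the MM comes from.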

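I think the fundamental problem is that if you have a pattern which can match (in its entirety) the empty string, you could argue that the pattern occurs an infinite number of times between any two characters of the input. I would probably try to avoid such patterns wherever possible - make sure that any match has to include at least one character.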
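One way to follow that advice, sketched here on the assumption that the goal is to replace runs of one or more characters, is to swap * for +, which requires at least one character per match. The class and method names below are illustrative only.

```java
public class NonEmptyQuantifier {
    // Replaces each run of one or more x characters with a single M.
    static String replaceRunsOfX(String input) {
        return input.replaceAll("x+", "M");
    }

    // Replaces each non-empty run of characters (excluding line terminators) with a single M.
    static String replaceWholeInput(String input) {
        return input.replaceAll(".+", "M");
    }

    public static void main(String[] args) {
        System.out.println(replaceRunsOfX("axbx"));     // aMbM
        System.out.println(replaceWholeInput("axbx"));  // M
    }
}
```

Because neither pattern can match the empty string, there are no zero-length matches before a, before b, or at the end of the input, and so no extra M appears.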

Regarding "java - Not getting the * quantifier right in regular expressions?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/17335593/
