
Java string split on alphanumeric and new lines?

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 05:28:19

I have a test.txt file that contains several lines, for example:

"h3llo, @my name is, bob! (how are you?)"

"i am fine@@@@@"

I want to split out all the alphanumeric runs, treating newlines as separators too, into an ArrayList, so that the output would be

output = ["h", "llo", "my", "name", "is", "bob", "how", "are", "you", "i", "am", "fine"]

Right now, I am trying to split my text with

output.split("\\P{Alpha}+")

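But for some reason this seems to add an empty entry (it shows up as a stray leading comma) in the first position of the array list, and it replaces the newline with an empty string: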

output = ["", "h", "llo", "my", "name", "is", "bob", "how", "are", "you", "", "i", "am", "fine"]

Is there another way to solve this? Thanks!

--

Edit: how can I make sure it ignores new lines?

Best answer

The behavior of Java's String.split() is quite confusing. A better splitting utility is Guava's Splitter. Its documentation goes into detail about the problems with String.split():

The built in Java utilities for splitting strings can have some quirky behaviors. For example, String.split silently discards trailing separators, and StringTokenizer respects exactly five whitespace characters and nothing else.

Quiz: ",a,,b,".split(",") returns...

  1. "", "a", "", "b", ""
  2. null, "a", null, "b", null
  3. "a", null, "b"
  4. "a", "b"
  5. None of the above

The correct answer is none of the above: "", "a", "", "b". Only trailing empty strings are skipped. What is this I don't even.
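
The quiz is easy to check with the JDK alone. The following is a minimal sketch (the class name SplitQuiz and the helper quiz() are made up for illustration); it also shows that passing a negative limit keeps the trailing empty string:

```java
import java.util.Arrays;

class SplitQuiz {
    // Hypothetical helper: returns what String.split produces for the quiz input.
    static String[] quiz() {
        return ",a,,b,".split(",");
    }

    public static void main(String[] args) {
        // Leading and interior empty strings are kept; trailing ones are dropped.
        System.out.println(Arrays.toString(quiz()));                  // [, a, , b]
        // With a negative limit, the trailing empty string is kept as well.
        System.out.println(Arrays.toString(",a,,b,".split(",", -1))); // [, a, , b, ]
    }
}
```

This is also the likely reason for the empty first entry in the question: whenever the input starts with a separator, split() reports an empty string before it.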

In your case, this should work:

Splitter.onPattern("\\P{Alpha}+").omitEmptyStrings().splitToList(output);
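
If adding Guava is not an option, a similar result can probably be had with the JDK alone by matching the runs of letters instead of splitting on everything else. This is only a sketch, not part of the original answer: the class AlphaWords and the method words() are hypothetical names, and it assumes the whole file content has been read into one string, so that newlines are just more non-letter characters between matches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AlphaWords {
    // \p{Alpha} matches ASCII letters by default, so digits, punctuation
    // and line breaks all act as separators.
    private static final Pattern WORD = Pattern.compile("\\p{Alpha}+");

    // Hypothetical helper: collects every run of letters found in the text.
    static List<String> words(String text) {
        List<String> out = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // split() yields a leading empty string when the input starts with a separator.
        System.out.println(Arrays.toString("\"h3llo, @my".split("\\P{Alpha}+"))); // [, h, llo, my]

        // Matching the words directly never produces empty entries.
        String text = "\"h3llo, @my name is, bob! (how are you?)\"\n\n\"i am fine@@@@@\"";
        System.out.println(words(text));
        // [h, llo, my, name, is, bob, how, are, you, i, am, fine]
    }
}
```

Because the matcher only ever reports non-empty runs of letters, there is nothing to filter out afterwards, and blank lines in the input are skipped along with the other separators.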

Regarding "Java string split on alphanumeric and new lines?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/34771561/
