
java - Remove alphanumeric words from a string

Reposted. Author: 行者123. Updated: 2023-11-30 07:08:07

I am trying to remove alphanumeric words (words that contain digits) from a string.

String[] sentenceArray = {"India123156 hel12lo 10000 cricket 21355 sport news 000Fifa"};
for (String str : sentenceArray) {
    System.out.println("before regex : " + str);
    String regex = "(\\d?[,/%]?\\d|^[a-zA-Z0-9_]*)";
    // Replace every match with a space, then collapse runs of spaces.
    String finalResult1 = str.replaceAll(regex, " ");
    String finalResult = finalResult1.trim().replaceAll(" +", " ");
    System.out.println("after regex : " + finalResult);
}

Output: hello cricket sport news Fifa

But the output I need is: cricket sport news

Could someone please help? Thanks in advance.

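Best answer

To match the words you want to exclude, together with the whitespace that follows them, you can use the following regex in case-insensitive mode: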

\b(?=[a-z]*\d+)\w+\s*\b

In Java, you can do the replacement like this:

String replaced = your_original_string.replaceAll("(?i)\\b(?=[a-z]*\\d+[a-z]*)\\w+\\s*\\b", "");
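To see the replacement end to end, here is a minimal sketch that applies the answer's pattern to the sample sentence from the question. The class name `AlphanumericWordRemover` and the method `removeAlphanumericWords` are hypothetical names chosen for illustration, and the final `trim()` is an addition that is not part of the answer: it is there because a kept word followed by a removed word leaves a trailing space behind.

```java
import java.util.regex.Pattern;

public class AlphanumericWordRemover {
    // The pattern from the answer; (?i) turns on case-insensitive matching.
    private static final Pattern DIGIT_WORD =
            Pattern.compile("(?i)\\b(?=[a-z]*\\d+[a-z]*)\\w+\\s*\\b");

    // Hypothetical helper: removes every matching word plus the whitespace
    // after it, then trims what is left (the trim is an assumption added here).
    static String removeAlphanumericWords(String input) {
        return DIGIT_WORD.matcher(input).replaceAll("").trim();
    }

    public static void main(String[] args) {
        String input = "India123156 hel12lo 10000 cricket 21355 sport news 000Fifa";
        System.out.println("before regex : " + input);
        System.out.println("after regex : " + removeAlphanumericWords(input));
        // prints "after regex : cricket sport news"
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the regex on every call, which `String.replaceAll` would otherwise do.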

Token-by-token explanation

\b         # the boundary between a word char (\w) and something that is not a word char
(?=        # look ahead to see if there is:
  [a-z]*   #   any character of: 'a' to 'z' (0 or more times, matching the most amount possible)
  \d+      #   digits (0-9) (1 or more times, matching the most amount possible)
)          # end of look-ahead
\w+        # word characters (a-z, A-Z, 0-9, _) (1 or more times, matching the most amount possible)
\s*        # whitespace (\n, \r, \t, \f, and " ") (0 or more times, matching the most amount possible)
\b         # the boundary between a word char (\w) and something that is not a word char
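
The breakdown above can be checked by listing which words the pattern actually selects. The following is a small sketch, not part of the original answer; `TokenMatchDemo` and `matchedWords` are hypothetical names, and `Pattern.CASE_INSENSITIVE` is used in place of the inline `(?i)` flag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenMatchDemo {
    // Same pattern as in the explanation, compiled in case-insensitive mode.
    private static final Pattern DIGIT_WORD =
            Pattern.compile("\\b(?=[a-z]*\\d+)\\w+\\s*\\b", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: returns each matched word, with the trailing
    // whitespace that \s* consumed stripped off for display.
    static List<String> matchedWords(String input) {
        List<String> hits = new ArrayList<>();
        Matcher m = DIGIT_WORD.matcher(input);
        while (m.find()) {
            hits.add(m.group().trim());
        }
        return hits;
    }

    public static void main(String[] args) {
        String input = "India123156 hel12lo 10000 cricket 21355 sport news 000Fifa";
        System.out.println(matchedWords(input));
        // prints [India123156, hel12lo, 10000, 21355, 000Fifa]
    }
}
```

One consequence of the look-ahead is worth noting: it only accepts letters before the first digit, so a word such as `a_1`, where an underscore precedes the digit, is not matched and would be kept.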

Regarding java - Remove alphanumeric words from a string, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/24256196/
