java - How to find a compound string in a text

Reposted. Author: 太空宇宙. Updated: 2023-11-04 10:42:43

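I have been looking for a way to find a string like howareyou in a sentence and remove it from that sentence. For example:

We have a sentence - Hello there, how are you?

And the compound - howareyou. As a result, I want to get this string - Hello there, ? with the compound removed.

My current solution splits the string into words and checks whether the compound contains each word. It does not work well, because any other word that happens to occur inside the compound is removed too. For example:

If we look for foreseenfuture in this string - I have foreseen the future for all of you, then with my solution for is also removed, because it occurs inside the compound.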
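To make the false positive concrete, here is a minimal sketch that only lists the words a plain `contains` check would flag. The class and method names (`ContainsCheckDemo`, `wordsInsideCompound`) are hypothetical, and the English wording of the sample sentence is assumed from the example above.

```java
import java.util.ArrayList;
import java.util.List;

class ContainsCheckDemo {
    // Returns every word of the text that occurs somewhere inside the compound.
    // (Hypothetical helper, for illustration only.)
    static List<String> wordsInsideCompound(String text, String compound) {
        List<String> hits = new ArrayList<>();
        for (String word : text.split("[^a-zA-Z]+")) {
            // Skip empty tokens: "".contains-checks are always true.
            if (!word.isEmpty() && compound.contains(word)) {
                hits.add(word);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Assumed wording of the sample sentence.
        String text = "I have foreseen the future for all of you";
        // prints [foreseen, future, for]
        System.out.println(wordsInsideCompound(text, "foreseenfuture"));
    }
}
```

The standalone word for is flagged even though it is not part of the compound's occurrence in the sentence, because a substring check ignores where each word sits in the text.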

Code:

    String[] words = text.split("[^a-zA-Z]");
    String compound = "foreseenfuture";

    int startIndex = -1;
    int endIndex = -1;

    for (String word : words) {
        if (compound.contains(word)) {
            if (startIndex == -1) {
                startIndex = text.indexOf(word);
            }

            endIndex = text.indexOf(word) + word.length() - 1;
        }
    }

    if (startIndex != -1 && endIndex != -1) {
        text = text.substring(0, startIndex) + "" + text.substring(endIndex + 1, text.length() - 1);
    }

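So, is there any other way to solve this?

Best answer

I am assuming that when you build the compound you only remove whitespace. With that assumption, "for,seen future.for seen future" would become "for,seen future.", since the comma breaks up the other compound. In that case, this should work: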

    String example1 = "how are you?";
    String example2 = "how, are you... here?";
    String example3 = "Madam, how are you finding the accommodations?";
    String example4 = "how are you how are you how are you taco";

    String compound = "howareyou";

    StringBuilder compoundRegexBuilder = new StringBuilder();

    // Matches a word boundary before the first word
    compoundRegexBuilder.append("\\b");

    // Inserts each character into the regex
    for (int i = 0; i < compound.length(); i++) {
        compoundRegexBuilder.append(compound.charAt(i));

        // Between each pair of letters there can be any amount of whitespace
        if (i < compound.length() - 1) {
            compoundRegexBuilder.append("\\s*");
        }
    }

    // Makes sure the last word isn't part of a larger word
    compoundRegexBuilder.append("\\b");

    String compoundRegex = compoundRegexBuilder.toString();
    System.out.println(compoundRegex);
    System.out.println("Example 1:\n" + example1 + "\n" + example1.replaceAll(compoundRegex, ""));
    System.out.println("\nExample 2:\n" + example2 + "\n" + example2.replaceAll(compoundRegex, ""));
    System.out.println("\nExample 3:\n" + example3 + "\n" + example3.replaceAll(compoundRegex, ""));
    System.out.println("\nExample 4:\n" + example4 + "\n" + example4.replaceAll(compoundRegex, ""));

The output is as follows:

    \bh\s*o\s*w\s*a\s*r\s*e\s*y\s*o\s*u\b
    Example 1:
    how are you?
    ?

    Example 2:
    how, are you... here?
    how, are you... here?

    Example 3:
    Madam, how are you finding the accommodations?
    Madam,  finding the accommodations?

    Example 4:
    how are you how are you how are you taco
       taco

You can also use this to match any other alphanumeric combination.
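The same idea can be wrapped in a reusable method. The following is a minimal sketch, not part of the original answer: the names `CompoundRemover`, `compoundPattern`, and `removeCompound` are hypothetical, and quoting each character with `Pattern.quote` is an added precaution so that a regex metacharacter in the compound is treated literally. It still assumes the compound starts and ends with a word character, because `\b` only behaves as intended there.

```java
import java.util.regex.Pattern;

class CompoundRemover {
    // Builds \b c1 \s* c2 \s* ... cn \b, with every character quoted literally.
    static Pattern compoundPattern(String compound) {
        StringBuilder regex = new StringBuilder("\\b");
        for (int i = 0; i < compound.length(); i++) {
            if (i > 0) {
                // Any amount of whitespace may separate consecutive characters.
                regex.append("\\s*");
            }
            regex.append(Pattern.quote(String.valueOf(compound.charAt(i))));
        }
        regex.append("\\b");
        return Pattern.compile(regex.toString());
    }

    // Removes every whitespace-separated occurrence of the compound from the text.
    static String removeCompound(String text, String compound) {
        return compoundPattern(compound).matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        // prints "?"
        System.out.println(removeCompound("how are you?", "howareyou"));
        // prints "Madam,  finding the accommodations?" (both surrounding spaces remain)
        System.out.println(removeCompound("Madam, how are you finding the accommodations?", "howareyou"));
    }
}
```

Only the matched words are removed, so the whitespace on either side of a match stays in the result. Whether to collapse it afterwards depends on the use case.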

Regarding java - How to find a compound string in a text, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48776470/
