Java: Misspell reporter using regex (How to parse misspellings)

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 06:23:53

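I am creating a program that parses a string and reports instances of misspellings. I want it to report on several misspelled variants rather than on a single search string. For example, it reads the user input: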

GOOGGOUGGUIG

and should report that "GO" was misspelled 4 times, because the input above contains "OG", "UG", "GU", and "IG".

So my result should be

Y was spelled incorrectly x/count times.

I am not concerned about the pattern-reversal part. I only used it to find instances when I was searching with a single string.

import java.util.*;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class misspellReporter
{
    public static void main(String[] args)
    {
        Scanner keyboard = new Scanner(System.in);
        String singleString = "";
        System.out.println("Enter text here");
        singleString = keyboard.nextLine();

        String str = singleString;
        //String strToSearch = "OG"; //I used this at first
        String[] strToSearch = {"GU", "UG", "IG", "GI"}; //I want to use this array instead
        // Does not compile with the array: StringBuffer has no constructor that takes a String[]
        String strToSearchReversed = new StringBuffer(strToSearch).reverse().toString();
        Pattern strPattern = Pattern.compile(strToSearchReversed);
        Matcher matcher = strPattern.matcher(str);
        int counter = 0;
        while(matcher.find()) {
            ++counter;
        }

        System.out.println(strToSearch+" was spelt as "+strToSearchReversed+" "+counter+" times");
    }
}

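Thanks in advance! This question is different for me because I have not seen anyone else on the forum use Matcher and Pattern for parsing. I have used other approaches, but I am specifically interested in how to do it with this one.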

Best answer

You can search for several substrings at once with a regular expression composed as follows:

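import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchPairs {
    private static final String[] strs = {"GU", "UG", "IG", "GI"};

    // Counts the non-overlapping occurrences of any of the pairs in str.
    public static int matches( String str ){
        // Join the pairs into one alternation: "GU|UG|IG|GI"
        String strToSearch = String.join( "|", strs );
        Pattern strPattern = Pattern.compile(strToSearch);
        Matcher matcher = strPattern.matcher(str);
        int counter = 0;
        while(matcher.find()) {
            ++counter;
        }
        return counter;
    }
}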

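You can spare yourself the hassle of adding the reversed substrings by hand: reverse the composed string and append it after another |. Output: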

 GOOGGOUGGUIG was spelt as GU|UG|IG|GI 3 times
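
The reversal trick described above can be sketched as follows. This is a minimal illustration, not the answerer's code: the class name `ReversedAlternation` and the shortened `BASE` list are assumptions, and it only works when the pairs contain no regex metacharacters, because the joined string is reversed as plain text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReversedAlternation {
    // Hypothetical base list: only one orientation of each pair is written out.
    private static final String[] BASE = {"GU", "IG"};

    // Joins the pairs with '|' and appends the reversed joined string.
    // Reversing "GU|IG" as a whole yields "GI|UG", i.e. every pair reversed.
    public static String buildPattern() {
        String joined = String.join("|", BASE);
        String reversed = new StringBuilder(joined).reverse().toString();
        return joined + "|" + reversed;
    }

    // Counts non-overlapping occurrences of any pair, in either orientation.
    public static int count(String input) {
        Matcher matcher = Pattern.compile(buildPattern()).matcher(input);
        int counter = 0;
        while (matcher.find()) {
            counter++;
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(buildPattern());        // GU|IG|GI|UG
        System.out.println(count("GOOGGOUGGUIG")); // 3
    }
}
```

StringBuilder is used instead of StringBuffer because no synchronization is needed here; both offer the same reverse() method.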

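To avoid overlapping matches, set the start offset explicitly so that each search resumes just past the previous two-character match: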

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchNoOverlap {
    private static final String[] strs = {"GU", "UG", "IG", "GI"};

    public static int matches( String str ){
        String strToSearch = String.join( "|", strs );
        Pattern strPattern = Pattern.compile(strToSearch);
        Matcher matcher = strPattern.matcher(str);
        int counter = 0;
        int start = 0;
        // find(start) resets the matcher and searches from index start
        while(matcher.find(start)) {
            ++counter;
            // every pair is 2 characters long, so skip past the whole match
            start = matcher.start() + 2;
        }
        return counter;
    }

    public static void main( String[] args ){
        System.out.println( matches( "GOOGGOUGGUGIGI" ) );
    }
}
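
To see what the start offset changes, the step can be made a parameter. The sketch below is an illustration under assumptions: the class `OverlapDemo` and the `step` parameter are not part of the answer. A step of 2 skips past each match (no overlaps), while a step of 1 also counts pairs that share a character.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OverlapDemo {
    private static final Pattern PAIRS = Pattern.compile("GU|UG|IG|GI");

    // step = 2: resume after the matched pair (non-overlapping count).
    // step = 1: resume one character later (overlapping count).
    public static int count(String input, int step) {
        Matcher matcher = PAIRS.matcher(input);
        int counter = 0;
        int start = 0;
        // Matcher.find(int) throws IndexOutOfBoundsException when start > length,
        // so the bound is checked first.
        while (start <= input.length() && matcher.find(start)) {
            counter++;
            start = matcher.start() + step;
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(count("GUG", 2)); // 1: only "GU"
        System.out.println(count("GUG", 1)); // 2: "GU" and "UG" share the U
    }
}
```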

Later

/* Counts the number of contiguous stretches of non-valid pairs between
 * contiguous stretches of valid pairs.
 * Note: when no valid pair sits at an even distance from the search start,
 * the regex engine retries one character later, so a match can land on an
 * odd offset (e.g. "XATX" matches "AT" at index 1).
 */
private static final String[] valids =
    {"AT", "TA", "AA", "TT", "CG", "GC", "CC", "GG"};

public static int mismatches( String str ){
    // Lazily skip whole pairs, then capture a run of valid pairs in group 1
    String strToSearch = "(?:(?:..)*?)((?:" + String.join( "|", valids) + ")+)";
    Pattern strPattern = Pattern.compile( strToSearch);
    Matcher matcher = strPattern.matcher(str);
    int counter = 0;
    int start = 0;
    int end = 0;
    while(matcher.find( start )){
        int s = matcher.start(1);
        end = matcher.end(1);
        if( s > start ){
            // something was skipped before the valid run: one bad stretch
            ++counter;
            // System.out.println( "s>Start " + s );
        }
        // System.out.println( "match:" + matcher.group() + " s=" + s );
        start = matcher.end();
    }
    if( end < str.length() ){
        // trailing characters after the last valid run
        ++counter;
        // System.out.println( "end< length" );
    }
    return counter;
}

Or, counting every "bad pair":

public static int badPairs( String str ){
    String strToSearch = "(?:(?:..)*?)((?:" + String.join( "|", valids) + ")+)";
    Pattern strPattern = Pattern.compile( strToSearch);
    Matcher matcher = strPattern.matcher(str);
    int counter = 0;
    int start = 0;
    int end = 0;
    while(matcher.find( start )){
        int s = matcher.start(1);
        end = matcher.end(1);
        counter += s - start;
        start = matcher.end();
    }
    counter += str.length() - end;
    return counter/2;
}
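
As a cross-check for the two regex methods, the same two quantities can be computed with a plain loop that only looks at pairs on even offsets. This is a sketch under assumptions, not the answerer's code: the class name `AlignedPairs` is hypothetical, and a trailing single character of an odd-length input is simply ignored. On input made of well-aligned pairs the results should agree with the regex versions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class AlignedPairs {
    // Same valid pairs as in the answer, held in a set for fast lookup.
    private static final Set<String> VALID = new HashSet<>(Arrays.asList(
            "AT", "TA", "AA", "TT", "CG", "GC", "CC", "GG"));

    // Number of pairs at even offsets that are not valid.
    public static int badPairs(String str) {
        int count = 0;
        for (int i = 0; i + 2 <= str.length(); i += 2) {
            if (!VALID.contains(str.substring(i, i + 2))) {
                count++;
            }
        }
        return count;
    }

    // Number of contiguous stretches of non-valid pairs.
    public static int mismatches(String str) {
        int runs = 0;
        boolean inRun = false;
        for (int i = 0; i + 2 <= str.length(); i += 2) {
            boolean bad = !VALID.contains(str.substring(i, i + 2));
            if (bad && !inRun) {
                runs++; // a new stretch of bad pairs begins here
            }
            inRun = bad;
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(mismatches("ATXXYYCG")); // 1 stretch: "XXYY"
        System.out.println(badPairs("ATXXYYCG"));   // 2 bad pairs: "XX", "YY"
    }
}
```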

Without regex

// Requires java.util.HashSet and java.util.Set
public static int valid( String str ){
    Set<String> valset = new HashSet<>();
    for( String s: valids ) valset.add( s );
    int validCount = 0;
    // i + 2 <= length keeps substring() in bounds for odd-length input
    for( int i = 0; i + 2 <= str.length(); i += 2 ){
        if( valset.contains( str.substring( i, i+2 ) ) ) validCount++;
    }
    return validCount;
}
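
The same pair-by-pair walk can also produce the report the question asks for. The following is a minimal sketch, assuming the input is read as consecutive two-character pairs and that any pair other than the correct spelling counts as a misspelling; the class `PairTally` and its method are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PairTally {
    // Walks the input two characters at a time and tallies every pair
    // that differs from the correct spelling, in order of first appearance.
    public static Map<String, Integer> misspellings(String input, String correct) {
        Map<String, Integer> tally = new LinkedHashMap<>();
        for (int i = 0; i + 2 <= input.length(); i += 2) {
            String pair = input.substring(i, i + 2);
            if (!pair.equals(correct)) {
                tally.merge(pair, 1, Integer::sum);
            }
        }
        return tally;
    }

    public static void main(String[] args) {
        Map<String, Integer> tally = misspellings("GOOGGOUGGUIG", "GO");
        int total = 0;
        for (int n : tally.values()) {
            total += n;
        }
        // The pairs are GO, OG, GO, UG, GU, IG, so four of them are misspelled.
        System.out.println("GO was spelled incorrectly " + total + " times: " + tally);
    }
}
```

With the question's sample input this counts "OG", "UG", "GU", and "IG" once each, which matches the 4 misspellings the asker expected.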

Regarding "Java: Misspell reporter using regex (How to parse misspellings)", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/28791937/
