gpt4 book ai didi

java - 用于字符串搜索的 KMP 算法?

转载 作者:塔克拉玛干 更新时间:2023-11-03 04:04:58 24 4
gpt4 key购买 nike

我在网上发现了这个非常具有挑战性的编码问题,我想尝试一下。

一般的想法是给定字符串T和模式P,找到这个模式的出现,总结它的对应值并返回最大值和最小值。如果您想更详细地阅读问题,请参阅 this

然而,下面是我提供的代码,它适用于一个简单的测试用例,但是当运行在多个复杂的测试用例上时它非常慢,我不确定我的代码在哪里需要优化。

任何人都可以帮助我在逻辑错误的地方。

public class DeterminingDNAHealth {


private DeterminingDNAHealth() {
/*
* Fixme:
* Each DNA contains number of genes
* - some of them are beneficial and increase DNA's total health
* - Each Gene has a health value
* ======
* - Total health of DNA = sum of all health values of beneficial genes
*/
}

int checking(int start, int end, String pattern) {
String[] genesChar = new String[] {
"a",
"b",
"c",
"aa",
"d",
"b"
};
String numbers = "123456";

int total = 0;

for (int i = start; i <= end; i++) {
total += KMPAlgorithm.initiateAlgorithm(pattern, genesChar[i]) * (i + 1);
}

return total;
}

public static void main(String[] args) {

String[] genesChar = new String[] {
"a",
"b",
"c",
"aa",
"d",
"b"
};
Gene[] genes = new Gene[genesChar.length];

for (int i = 0; i < 6; i++) {
genes[i] = new Gene(genesChar[i], i + 1);
}

String[] checking = "15caaab 04xyz 24bcdybc".split(" ");


DeterminingDNAHealth DNA = new DeterminingDNAHealth();
int i, mostHealthiest, mostUnhealthiest;

mostHealthiest = Integer.MIN_VALUE;
mostUnhealthiest = Integer.MAX_VALUE;

for (i = 0; i < checking.length; i++) {
int start = Character.getNumericValue(checking[i].charAt(0));
int end = Character.getNumericValue(checking[i].charAt(1));
String pattern = checking[i].substring(2, checking[i].length());

int check = DNA.checking(start, end, pattern);

if (check > mostHealthiest)
mostHealthiest = check;
else
if (check < mostUnhealthiest)
mostUnhealthiest = check;
}

System.out.println(mostHealthiest + " " + mostUnhealthiest);

// DNA.checking(1,5, "caaab");
}
}

KMP算法

public class KMPAlgorithm {

KMPAlgorithm() {}


public static int initiateAlgorithm(String text, String pattern) {

// let us generate our LPC table from the pattern
int[] partialMatchTable = partialMatchTable(pattern);

int matchedOccurrences = 0;

// initially we don't have anything matched, so 0
int partialMatchLength = 0;

// we then start to loop through the text, !note, not the pattern. The text that we are testing the pattern on
for (int i = 0; i < text.length(); i++) {

// if there is a mismatch and there's no previous match, then we've hit the base-case, hence break from while{...}
while (partialMatchLength > 0 && text.charAt(i) != pattern.charAt(partialMatchLength)) {

/*
* otherwise, based on the number of chars matched, we decrement it by 1.
* In fact, this is the unique part of this algorithm. It is this part that we plan to skip partialMatchLength
* iterations. So if our partialMatchLength was 5, then we are going to skip (5 - 1) iteration.
*/
partialMatchLength = partialMatchTable[partialMatchLength - 1];

}


// if however we have a char that matches the current text[i]
if (text.charAt(i) == pattern.charAt(partialMatchLength)) {

// then increment position, so hence we check the next char of the pattern against the next char in text
partialMatchLength++;

// we will know that we're at the end of the pattern matching, if the matched length is same as the pattern length
if (partialMatchLength == pattern.length()) {
// to get the starting index of the matched pattern in text, apply this formula (i - (partialMatchLength - 1))

// this line increments when a match string occurs multiple times;
matchedOccurrences++;

// just before when we have a full matched pattern, we want to test for multiple occurrences, so we make
// our match length incomplete, and let it run longer.
partialMatchLength = partialMatchTable[partialMatchLength - 1];

}
}

}

return matchedOccurrences;


}


private static int[] partialMatchTable(String pattern) {
/*
* TODO
* Note:
* => Proper prefix: All the characters in a string, with one or more cut off the end.
* => proper suffix: All the characters in a string, with one or more cut off the beginning.
*
* 1.) Take the pattern and construct a partial match table
*
* To construct partial match table {
* 1. Loop through the String(pattern)
* 2. Create a table of size String(pattern).length
* 3. For each character c[i], get The length of the longest proper prefix in the (sub)pattern
* that matches a proper suffix in the same (sub)pattern
* }
*/

// we will need two incremental variables
int i, j;

// an LSP table also known as “longest suffix-prefix”
int[] LSP = new int[pattern.length()];


// our initial case is that the first element is set to 0
LSP[0] = 0;

// loop through the pattern...
for (i = 1; i < pattern.length(); i++) {

// set our j as previous elements data (not the index)
j = LSP[i - 1];


// we will be comparing previous and current elements data. ei char
char current = pattern.charAt(i), previous = pattern.charAt(j);

// we will have a case when we're somewhere in loop and two chars will not match, and j is not in base case.
while (j > 0 && current != previous)
// we decrement our j
j = LSP[j - 1];

// simply put, if two characters are same, then we update our LSP to say that at that point, we hold the j's value
if (current == previous)
// increment our j
j++;

// update the table
LSP[i] = j;


}

return LSP;

}
}

来源代码归功于 Github

最佳答案

您可以尝试这个 KMP 实现。它是 O(m+n),正如 KMP 的意图。它应该快得多:

private static int[] failureFunction(char[] pattern) {
int m = pattern.length;

int[] f = new int[pattern.length];
f[0] = 0;

int i = 1;
int j = 0;

while (i < m) {
if (pattern[i] == pattern[j]) {
f[i] = j + 1;
i++;
j++;
} else if (j > 0) {
j = f[j - 1];
} else {
f[i] = 0;
i++;
}
}
return f;
}

private static int kmpMatch(char[] text, char[] pattern) {
int[] f = failureFunction(pattern);

int m = pattern.length;
int n = text.length;

int i = 0;
int j = 0;

while (i < n) {
if (pattern[j] == text[i]) {
if (j == m - 1){
return i - (m - 1);
} else {
i++;
j++;
}
} else if (j > 0) {
j = f[j - 1];
} else {
i++;
}
}
return -1;
}

关于java - 用于字符串搜索的 KMP 算法?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/46821536/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com