
Java Bioinformatics - Getting all indices of multiple specific words in a string

Reposted. Author: 塔克拉玛干. Updated: 2023-11-02 19:19:46

I have a project for my bioinformatics class at university, and one part of the project is gene prediction.

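My question today is how to get all indices of multiple specific words in a string. In my case, for example, I want to find every occurrence of the start codon ("AUG") and the stop codons ("UAA", "UAG", "UGA") and use them to predict genes, simply by trying to build open reading frames (ORFs).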

Here is my initial code:

private void jButton3ActionPerformed(java.awt.event.ActionEvent evt) {
    // TODO add your handling code here:
    // textArea1.setText(null);
    String str = jTextField1.getText(), y = "", gene = "", dnax = "", text = "";
    SymbolList dna = null;
    int start_codon_index = -1, stop_codon_index = -1;
    if ("".equals(str)) {
        jTextArea1.setText("No DNA strand entered.. ");
    } else {
        if (checksum(str) == 100) {
            try {
                dna = DNATools.createDNA(str);
            } catch (IllegalSymbolException ex) {
                Logger.getLogger(m.class.getName()).log(Level.SEVERE, null, ex);
            }
            try {
                dna = DNATools.toRNA(dna);
            } catch (IllegalAlphabetException ex) {
                Logger.getLogger(m.class.getName()).log(Level.SEVERE, null, ex);
            }
            dnax = dna.seqString().toUpperCase();
            if (dnax.length() % 3 != 0) {
                if (dnax.length() % 3 == 1) {
                    dnax += "-";
                }
                if (dnax.length() % 3 == 2) {
                    dnax += "-";
                }
            }
            // System.out.println(dnax);
            for (int g = 0; g < dnax.length(); g += 3) {
                y = dnax.substring(g, g + 3);
                if ("AUG".equals(y)) {
                    start_codon_index = g;
                } else if (start_codon_index != -1 && ("UGA".equals(y) || "UAG".equals(y) || "UAA".equals(y))) {
                    stop_codon_index = g + 3;
                }
            }

            if (stop_codon_index != -1 && start_codon_index != -1) {
                String k = "";
                int a = 0;
                for (a = start_codon_index; a < stop_codon_index; a++) {
                    gene += dnax.charAt(a);
                }
                text += "\nGene start position: " + start_codon_index + "\nGene end position: " + a + "\n Gene: " + gene;
                jTextArea1.setText(text);
            } else {
                jTextArea1.setText("No genes found in Seq: " + dnax);
            }
        } else {
            jTextArea1.setText("Text entered is not a DNA strand..");
        }
    }
}

Here is the checksum() method:

private static int checksum(String x) {
    int i = 0, checks = 0, count = 0;
    char c;
    x = x.toUpperCase();
    while (i < x.length()) {
        c = x.charAt(i);
        if (c == 'A' || c == 'T' || c == 'G' || c == 'C' || c == '-') {
            count++;
        }
        i++;
    }
    try {
        checks = (count / x.length()) * 100;
    } catch (Exception e) {
        e.printStackTrace();
    }

    return checks;
}

I have tried other solutions, but nothing has worked for me. Any help or suggestions are welcome.

Best answer

I think you are asking how to find the indices of those specific codons? Is dnax the string you want to search?

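You can call indexOf(String str, int fromIndex) on the string you are searching. It returns the index of the first occurrence of str at or after fromIndex, or -1 if the substring is not found.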
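To make the fromIndex argument and the -1 return value concrete, here is a small sketch. The class name IndexOfDemo, the helper next, and the sample sequence are invented for illustration and do not come from the original post.

```java
// Hypothetical demo class; the sample sequence is made up for illustration.
class IndexOfDemo {
    // Index of the first occurrence of codon at or after fromIndex, or -1 if none.
    static int next(String sequence, String codon, int fromIndex) {
        return sequence.indexOf(codon, fromIndex);
    }

    public static void main(String[] args) {
        String rna = "AUGCCAUGUAA";
        System.out.println(next(rna, "AUG", 0)); // prints 0: match at the very start
        System.out.println(next(rna, "AUG", 1)); // prints 5: search resumes past the first match
        System.out.println(next(rna, "AUG", 6)); // prints -1: no further occurrence
    }
}
```

Advancing fromIndex past the previous hit is what lets a loop visit every occurrence exactly once.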

So something like this may help:

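// Needs java.util.ArrayList and java.util.List.
List<Integer> startCodonIndices = new ArrayList<Integer>();
// indexOf must be called on the string being searched (dnax here).
int index = dnax.indexOf("AUG");
while (index != -1) {
    startCodonIndices.add(index);
    // Resume just past the last match; a result of -1 ends the loop.
    index = dnax.indexOf("AUG", index + 1);
}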
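The question also asks about the stop codons and about pairing them with start codons into ORFs. The following is a minimal sketch of one way to extend the indexOf idea, not a definitive gene-prediction method. The class CodonScan, the methods findAllIndices and findOrfs, and the sample sequence are all hypothetical names introduced here. It assumes an ORF is a start codon followed, in the same reading frame, by the first stop codon, and it considers all three forward frames rather than only frame 0.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class; names are illustrative, not from the original post.
class CodonScan {
    static final Set<String> STOP_CODONS =
            new HashSet<>(Arrays.asList("UAA", "UAG", "UGA"));

    // Collects every index at which each word occurs, in any reading frame.
    static Map<String, List<Integer>> findAllIndices(String sequence, String... words) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        for (String word : words) {
            List<Integer> hits = new ArrayList<>();
            for (int i = sequence.indexOf(word); i != -1; i = sequence.indexOf(word, i + 1)) {
                hits.add(i);
            }
            result.put(word, hits);
        }
        return result;
    }

    // Assumption: an ORF runs from a start codon to the first in-frame stop codon.
    // Each entry is {start, endExclusive}; starts with no in-frame stop are skipped.
    static List<int[]> findOrfs(String rna) {
        List<int[]> orfs = new ArrayList<>();
        for (int start : findAllIndices(rna, "AUG").get("AUG")) {
            // Step three characters at a time to stay in the start codon's frame.
            for (int i = start + 3; i + 3 <= rna.length(); i += 3) {
                if (STOP_CODONS.contains(rna.substring(i, i + 3))) {
                    orfs.add(new int[] {start, i + 3});
                    break;
                }
            }
        }
        return orfs;
    }

    public static void main(String[] args) {
        String rna = "CCAUGGCCUAAGAUGA"; // made-up sample sequence
        System.out.println(findAllIndices(rna, "AUG", "UAA", "UAG", "UGA"));
        for (int[] orf : findOrfs(rna)) {
            System.out.println(orf[0] + ".." + orf[1] + " " + rna.substring(orf[0], orf[1]));
        }
    }
}
```

For the sample sequence this reports AUG at 2 and 12, UAA at 8, and UGA at 13, and a single ORF covering indices 2 to 11 (end exclusive), AUGGCCUAA. The second AUG yields no ORF because the UGA at index 13 is not in its reading frame and no complete in-frame codon follows it.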

Regarding "Java Bioinformatics - Getting all indices of multiple specific words in a string", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/30283118/
