
java - Finding a supersequence of DNA strings in Java

Reposted. Author: 塔克拉玛干. Updated: 2023-11-02 19:27:40

I am struggling with a "find the supersequence" algorithm.

The input is a set of strings:

String A = "caagccacctacatca";
String B = "cgagccatccgtaaagttg";
String C = "agaacctgctaaatgctaga";

The result should be a set of properly aligned strings (the next step would be to merge them):

String E = "ca ag cca  cc ta    cat  c a";
String F = "c gag ccat ccgtaaa g tt g";
String G = " aga acc tgc taaatgc t a ga";

Thanks for any suggestions (I have been stuck on this task for more than a day).

After merging, the superstring would be:

cagagaccatgccgtaaatgcattacga
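
Once a merged string like the one above is known, an aligned row for each input can be derived from it. The following is only a minimal sketch under stated assumptions, not the asker's method: the class name `AlignmentSketch` and the helper `align` are hypothetical, and the merged string is assumed to be a supersequence of the input. Each input character goes into the leftmost column where it still matches, so the rows may differ from the hand-made rows E, F and G.

```java
import java.util.Arrays;

final class AlignmentSketch {

    /**
     * Pads input with spaces so that every character of input sits in a
     * column where merged holds the same character (hypothetical helper).
     * Throws if merged is not a supersequence of input.
     */
    static String align(String input, String merged) {
        char[] row = new char[merged.length()];
        Arrays.fill(row, ' ');
        int next = 0; // index of the next unplaced character of input
        for (int col = 0; col < merged.length() && next < input.length(); col++) {
            if (merged.charAt(col) == input.charAt(next)) {
                row[col] = input.charAt(next);
                next++;
            }
        }
        if (next < input.length()) {
            throw new IllegalArgumentException("merged is not a supersequence of input");
        }
        return new String(row);
    }

    public static void main(String[] args) {
        String merged = "cagagaccatgccgtaaatgcattacga";
        String[] inputs = {
            "caagccacctacatca",
            "cgagccatccgtaaagttg",
            "agaacctgctaaatgctaga"
        };
        System.out.println(merged);
        for (String s : inputs) {
            System.out.println(align(s, merged));
        }
    }
}
```

Reading the printed rows column by column gives the merged string back, which is the "merge" step described above.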

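The definition of a supersequence "in this case" is roughly:

A string R is contained in a supersequence S if and only if all characters of R appear in S in the same order in which they occur in the input sequence R.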
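
The definition above can be turned into a small check. This is a sketch, with the class name `SubsequenceCheck` and method `isSubsequence` chosen only for illustration: it scans S once from left to right and consumes the characters of R in order.

```java
final class SubsequenceCheck {

    /** Returns true if every character of r occurs in s in the same order. */
    static boolean isSubsequence(String r, String s) {
        int matched = 0; // number of characters of r found so far
        for (int i = 0; i < s.length() && matched < r.length(); i++) {
            if (s.charAt(i) == r.charAt(matched)) {
                matched++;
            }
        }
        return matched == r.length();
    }

    public static void main(String[] args) {
        String merged = "cagagaccatgccgtaaatgcattacga";
        String[] inputs = {
            "caagccacctacatca",
            "cgagccatccgtaaagttg",
            "agaacctgctaaatgctaga"
        };
        for (String r : inputs) {
            System.out.println(r + " contained: " + isSubsequence(r, merged));
        }
    }
}
```

A candidate result is a common supersequence exactly when this check succeeds for every input string, which makes it a convenient test for any merging approach.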


The "solution" I tried (which is, again, the wrong approach) is:

import java.util.HashSet;
import java.util.Stack;

public class Solution4
{
    // map[c][p] is true once letter ('a' + c) has claimed column p
    static boolean[][] map = null;
    static int size = 0;

    @SuppressWarnings("unchecked") // Stack.clone() returns Object
    public static void main(String[] args)
    {
        String A = "caagccacctacatca";
        String B = "cgagccatccgtaaagttg";
        String C = "agaacctgctaaatgctaga";

        Stack<String> data = new Stack<>();
        data.push(A);
        data.push(B);
        data.push(C);

        // Clone before calling max_size, which empties the stack it is given
        Stack<String> clone1 = (Stack<String>) data.clone();
        Stack<String> clone2 = (Stack<String>) data.clone();

        int length = 26;
        size = max_size(data);

        System.out.println(size+" "+length);
        map = new boolean[26][size];

        char[] result = new char[size];

        HashSet<String> chunks = new HashSet<String>();
        while(!clone1.isEmpty())
        {
            String a = clone1.pop();

            char[] residue = make_residue(a);

            System.out.println("---");
            System.out.println("OLD : "+a);
            System.out.println("RESIDUE : "+String.valueOf(residue));

            String[] r = String.valueOf(residue).split(" ");

            for(int i=0; i<r.length; i++)
            {
                // split(" ") yields empty strings between consecutive spaces
                if(r[i].isEmpty()) continue;
                //chunks.add(spaces.substring(0,i)+r[i]);
                chunks.add(r[i]);
            }
        }

        for(String chunk : chunks)
        {
            System.out.println("CHUNK : "+chunk);
        }
    }

    static char[] make_residue(String candidate)
    {
        char[] result = new char[size];
        for(int i=0; i<candidate.length(); i++)
        {
            // first column at or after i not yet claimed by this letter
            int pos = find_position_for(candidate.charAt(i),i);
            for(int j=i; j<pos; j++) result[j]=' ';
            if(pos==-1) result[candidate.length()-1] = candidate.charAt(i);
            else result[pos] = candidate.charAt(i);
        }
        return result;
    }

    static int find_position_for(char character, int offset)
    {
        // turn the letter into a row index: 'a' -> 0, 'b' -> 1, ...
        character-=((int)'a');

        for(int i=offset; i<size; i++)
        {
            // System.out.println("checking "+String.valueOf((char)(character+((int)'a')))+" at "+i);
            if(!map[character][i])
            {
                map[character][i]=true;
                return i;
            }
        }
        return -1;
    }

    static String move_right(String a, int from)
    {
        return a.substring(0, from)+" "+a.substring(from);
    }

    static boolean taken(int character, int position)
    { return map[character][position]; }

    static void take(char character, int position)
    {
        //System.out.println("taking "+String.valueOf(character)+" at "+position+" (char_index-"+(character-((int)'a'))+")");
        map[character-((int)'a')][position]=true;
    }

    // Returns the length of the longest string; note that it empties the stack
    static int max_size(Stack<String> stack)
    {
        int max=0;
        while(!stack.isEmpty())
        {
            String s = stack.pop();
            if(s.length()>max) max=s.length();
        }

        return max;
    }

}

Best answer

Finding some common supersequence is not difficult.

For your example, a possible solution looks like this:

public class SuperSequenceTest {

    public static void main(String[] args) {
        String A = "caagccacctacatca";
        String B = "cgagccatccgtaaagttg";
        String C = "agaacctgctaaatgctaga";

        // Read positions in A, B and C
        int iA = 0;
        int iB = 0;
        int iC = 0;

        char[] a = A.toCharArray();
        char[] b = B.toCharArray();
        char[] c = C.toCharArray();

        StringBuilder sb = new StringBuilder();

        while (iA < a.length || iB < b.length || iC < c.length) {
            if (iA < a.length && iB < b.length && iC < c.length && (a[iA] == b[iB]) && (a[iA] == c[iC])) {
                // All three strings share the next character
                sb.append(a[iA]);
                iA++;
                iB++;
                iC++;
            }
            else if (iA < a.length && iB < b.length && a[iA] == b[iB]) {
                sb.append(a[iA]);
                iA++;
                iB++;
            }
            else if (iA < a.length && iC < c.length && a[iA] == c[iC]) {
                sb.append(a[iA]);
                iA++;
                iC++;
            }
            else if (iB < b.length && iC < c.length && b[iB] == c[iC]) {
                sb.append(b[iB]);
                iB++;
                iC++;
            } else {
                // No two strings agree: emit one character from C, else B, else A
                if (iC < c.length) {
                    sb.append(c[iC]);
                    iC++;
                }
                else if (iB < b.length) {
                    sb.append(b[iB]);
                    iB++;
                } else if (iA < a.length) {
                    sb.append(a[iA]);
                    iA++;
                }
            }
        }
        System.out.println("SUPERSEQUENCE " + sb.toString());
    }
}
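
The loop above is written for exactly three strings. As a sketch of how the same greedy idea might be extended to any number of inputs, the following uses a "majority merge" rule. This is a swapped-in heuristic, not the answer's exact rule: at each step it emits the character that is next in the largest number of inputs and advances those inputs. The names `GreedySupersequence`, `merge` and `isSubsequence` are illustrative. Like the code above, it returns a valid common supersequence, but not necessarily a shortest one.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GreedySupersequence {

    /** Builds a common supersequence of all inputs with a greedy majority vote. */
    static String merge(List<String> inputs) {
        int[] pos = new int[inputs.size()]; // read position in each input
        StringBuilder out = new StringBuilder();
        while (true) {
            // Count how many inputs are currently waiting for each character.
            Map<Character, Integer> votes = new LinkedHashMap<>();
            for (int i = 0; i < pos.length; i++) {
                String s = inputs.get(i);
                if (pos[i] < s.length()) {
                    votes.merge(s.charAt(pos[i]), 1, Integer::sum);
                }
            }
            if (votes.isEmpty()) {
                break; // every input has been consumed
            }
            char best = 0;
            int bestVotes = 0;
            for (Map.Entry<Character, Integer> e : votes.entrySet()) {
                if (e.getValue() > bestVotes) { // ties keep the first seen
                    best = e.getKey();
                    bestVotes = e.getValue();
                }
            }
            out.append(best);
            // Advance every input whose next character was just emitted.
            for (int i = 0; i < pos.length; i++) {
                String s = inputs.get(i);
                if (pos[i] < s.length() && s.charAt(pos[i]) == best) {
                    pos[i]++;
                }
            }
        }
        return out.toString();
    }

    /** Returns true if r can be read from s left to right, skipping characters. */
    static boolean isSubsequence(String r, String s) {
        int matched = 0;
        for (int i = 0; i < s.length() && matched < r.length(); i++) {
            if (s.charAt(i) == r.charAt(matched)) matched++;
        }
        return matched == r.length();
    }

    public static void main(String[] args) {
        List<String> inputs = Arrays.asList(
            "caagccacctacatca", "cgagccatccgtaaagttg", "agaacctgctaaatgctaga");
        String result = merge(inputs);
        System.out.println("SUPERSEQUENCE " + result + " (" + result.length() + ")");
        for (String s : inputs) {
            System.out.println(s + " contained: " + isSubsequence(s, result));
        }
    }
}
```

Each emitted character advances at least one input, so the loop terminates, and every input is consumed in order, so the result contains all of them.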

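However, the real problem to solve is the well-known shortest common supersequence problem (http://en.wikipedia.org/wiki/Shortest_common_supersequence), which is not easy.

A lot of research has been done on this topic.

For example: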
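
For just three short strings, a shortest common supersequence can still be computed exactly with dynamic programming. The sketch below is an illustration only, not the answerer's code; `ShortestSupersequence3` and its methods are hypothetical names. It fills a table indexed by a position in each string, so time and memory grow with the product of the three lengths. That is fine for inputs of about 20 characters, but it does not scale, and for an arbitrary number of strings the problem is known to be NP-hard.

```java
final class ShortestSupersequence3 {

    /** Returns one shortest common supersequence of a, b and c. */
    static String shortest(String a, String b, String c) {
        int la = a.length(), lb = b.length(), lc = c.length();
        // len[i][j][k] = length of a shortest common supersequence of the
        // suffixes a[i..], b[j..] and c[k..]; len[la][lb][lc] stays 0.
        int[][][] len = new int[la + 1][lb + 1][lc + 1];
        for (int i = la; i >= 0; i--) {
            for (int j = lb; j >= 0; j--) {
                for (int k = lc; k >= 0; k--) {
                    if (i == la && j == lb && k == lc) continue;
                    int best = Integer.MAX_VALUE;
                    for (char ch : candidates(a, b, c, i, j, k)) {
                        int rest = len[step(a, i, ch)][step(b, j, ch)][step(c, k, ch)];
                        best = Math.min(best, 1 + rest);
                    }
                    len[i][j][k] = best;
                }
            }
        }
        // Walk the table from the start, always taking an optimal character.
        StringBuilder out = new StringBuilder();
        int i = 0, j = 0, k = 0;
        while (i < la || j < lb || k < lc) {
            for (char ch : candidates(a, b, c, i, j, k)) {
                int ni = step(a, i, ch), nj = step(b, j, ch), nk = step(c, k, ch);
                if (len[ni][nj][nk] + 1 == len[i][j][k]) {
                    out.append(ch);
                    i = ni; j = nj; k = nk;
                    break;
                }
            }
        }
        return out.toString();
    }

    /** Advances pos by one if s has ch at pos, otherwise leaves it unchanged. */
    static int step(String s, int pos, char ch) {
        return (pos < s.length() && s.charAt(pos) == ch) ? pos + 1 : pos;
    }

    /** The next unread character of each string that is not yet exhausted. */
    static char[] candidates(String a, String b, String c, int i, int j, int k) {
        StringBuilder sb = new StringBuilder(3);
        if (i < a.length()) sb.append(a.charAt(i));
        if (j < b.length()) sb.append(b.charAt(j));
        if (k < c.length()) sb.append(c.charAt(k));
        return sb.toString().toCharArray();
    }

    /** Returns true if r can be read from s left to right, skipping characters. */
    static boolean isSubsequence(String r, String s) {
        int matched = 0;
        for (int i = 0; i < s.length() && matched < r.length(); i++) {
            if (s.charAt(i) == r.charAt(matched)) matched++;
        }
        return matched == r.length();
    }

    public static void main(String[] args) {
        String result = shortest(
            "caagccacctacatca", "cgagccatccgtaaagttg", "agaacctgctaaatgctaga");
        System.out.println("SHORTEST " + result + " (" + result.length() + ")");
    }
}
```

The transition relies on the fact that the first character of a shortest supersequence can always be taken from the front of one of the remaining strings, and that advancing every string that starts with that character never makes the rest longer.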

http://www.csd.uwo.ca/~lila/pdfs/Towards%20a%20DNA%20solution%20to%20the%20Shortest%20Common%20Superstring%20Problem.pdf

http://www.ncbi.nlm.nih.gov/pubmed/14534185

Regarding "java - Finding a supersequence of DNA strings in Java", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/19346536/
