
c++ - Online pattern search through a StringSet

Reposted. Author: 行者123. Updated: 2023-11-30 05:38:03

The SeqAn tutorial for Pattern Matching mentions that a StringSet can be used as either the haystack or the needle. When I try to use a StringSet as the haystack,

StringSet<Dna5String> seqs;

/* do stuff to load sequences into seqs */

Finder<StringSet<Dna5String> > finder(seqs);
Pattern<Dna5String, Simple> pattern(Dna5String("GAATTC"));

if (find(finder, pattern))
{
    std::cout << '[' << beginPosition(finder) << ',' << endPosition(finder)
              << ")\t" << infix(finder) << std::endl;
}
else
{
    std::cout << "No match!";
}

I get the error:

error: use of overloaded operator '==' is ambiguous (with operand types 'const const seqan::String, seqan::Alloc >' and 'const seqan::SimpleType')

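Does anyone know how this should be done correctly?

Using a single Dna5String in the Finder works fine. The tutorial does show how to do an offline search (i.e., using an index), but that is not what I want. If the Finder-Pattern facility in SeqAn already handles this case, I would rather not iterate over the StringSet by hand.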

Best answer

You can try this:

#include <iostream>
#include <seqan/sequence.h> // CharString, ...
#include <seqan/find.h>
#include <seqan/stream.h>

using namespace seqan;

typedef Iterator<StringSet<Dna5String> >::Type TStringSetIterator;

int main(int, char const **)
{
StringSet<Dna5String> seqs;
Dna5String seq1 =
"TAGGTTTTCCGAAAAGGTAGCAACTTTACGTGATCAAACCTCTGACGGGGTTTTCCCCGTCGAAATTGGGTG"
"TTTCTTGTCTTGTTCTCACTTGGGGCATCTCCGTCAAGCCAAGAAAGTGCTCCCTGGATTCTGTTGCTAACG"
"AGTCTCCTCTGCATTCCTGCTTGACTGATTGGGCGGACGGGGTGTCCACCTGACGCTGAGTATCGCCGTCAC"
"GGTGCCACATGTCTTATCTATTCAGGGATCAGAATTCATTCAGGAAATCAGGAGATGCTACACTTGGGTTAT"
"CGAAGCTCCTTCCAAGGCGTAGCAAGGGCGACTGAGCGCGTAAGCTCTAGATCTCCTCGTGTTGCAACTACA"
"CGCGCGGGTCACTCGAAACACATAGTATGAACTTAACGACTGCTCGTACTGAACAATGCTGAGGCAGAAGAT"
"CGCAGACCAGGCATCCCACTGCTTGAAAAAACTATNNNNCTACCCGCCTTTTTATTATCTCATCAGATCAAG";
Dna5String seq2 =
"ACCGACGATTAGCTTTGTCCGAGTTACAACGGTTCAATAATACAAAGGATGGCATAAACCCATTTGTGTGAA"
"AGTGCCCATCACATTATGATTCTGTCTACTATGGTTAATTCCCAATATACTCTCGAAAAGAGGGTATGCTCC"
"CACGGCCATTTACGTCACTAAAAGATAAGATTGCTCAAANNNNNNNNNACTGCCAACTTGCTGGTAGCTTCA"
"GGGGTTGTCCACAGCGGGGGGTCGTATGCCTTTGTGGTATACCTTACTAGCCGCGCCATGGTGCCTAAGAAT"
"GAAGTAAAACAATTGATGTGAGACTCGACAGCCAGGCTTCGCGCTAAGGACGCAAAGAAATTCCCTACATCA"
"GACGGCCGCGNNNAACGATGCTATCGGTTAGGACATTGTGCCCTAGTATGTACATGCCTAATACAATTGGAT"
"CAAACGTTATTCCCACACACGGGTAGAAGAACNNNNATTACCCGTAGGCACTCCCCGATTCAAGTAGCCGCG";

clear(seqs);
appendValue(seqs, seq1);
appendValue(seqs, seq2);

Pattern<Dna5String, Simple> pattern(Dna5String("GAATTC"));

// Visit each sequence in seqs in turn.
for (TStringSetIterator it = begin(seqs); it != end(seqs); ++it)
{
    std::cout << *it << std::endl;
    // Create a fresh Finder for the current sequence; the Pattern is reused.
    Finder<Dna5String> finder(*it);
    if (find(finder, pattern))
    {
        std::cout << '[' << beginPosition(finder) << ',' << endPosition(finder)
                  << ")\t" << infix(finder) << std::endl;
    }
    else
    {
        std::cout << "No match!" << std::endl;
    }
}
return 0;
}

You get:

TAGGTTTTCCGAAAAGGTAGCAACTTTACGTGATCAAACCTCTGACGGGGTTTTCCCCGTCGAAATTGGGTGTTTCTTGTCTTGTTCTCACTTGGGGCATCTCCGTCAAGCCAAGAAAGTGCTCCCTGGATTCTGTTGCTAACGAGTCTCCTCTGCATTCCTGCTTGACTGATTGGGCGGACGGGGTGTCCACCTGACGCTGAGTATCGCCGTCACGGTGCCACATGTCTTATCTATTCAGGGATCAGAATTCATTCAGGAAATCAGGAGATGCTACACTTGGGTTATCGAAGCTCCTTCCAAGGCGTAGCAAGGGCGACTGAGCGCGTAAGCTCTAGATCTCCTCGTGTTGCAACTACACGCGCGGGTCACTCGAAACACATAGTATGAACTTAACGACTGCTCGTACTGAACAATGCTGAGGCAGAAGATCGCAGACCAGGCATCCCACTGCTTGAAAAAACTATNNNNCTACCCGCCTTTTTATTATCTCATCAGATCAAG
[247,253)   GAATTC
ACCGACGATTAGCTTTGTCCGAGTTACAACGGTTCAATAATACAAAGGATGGCATAAACCCATTTGTGTGAAAGTGCCCATCACATTATGATTCTGTCTACTATGGTTAATTCCCAATATACTCTCGAAAAGAGGGTATGCTCCCACGGCCATTTACGTCACTAAAAGATAAGATTGCTCAAANNNNNNNNNACTGCCAACTTGCTGGTAGCTTCAGGGGTTGTCCACAGCGGGGGGTCGTATGCCTTTGTGGTATACCTTACTAGCCGCGCCATGGTGCCTAAGAATGAAGTAAAACAATTGATGTGAGACTCGACAGCCAGGCTTCGCGCTAAGGACGCAAAGAAATTCCCTACATCAGACGGCCGCGNNNAACGATGCTATCGGTTAGGACATTGTGCCCTAGTATGTACATGCCTAATACAATTGGATCAAACGTTATTCCCACACACGGGTAGAAGAACNNNNATTACCCGTAGGCACTCCCCGATTCAAGTAGCCGCG
No match!
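
Note that the loop above calls find only once per sequence, so it reports at most the first hit in each one. As a rough illustration of what "iterate over the set and search each sequence online" amounts to when every hit is wanted, here is a minimal sketch using only the C++ standard library. It is not SeqAn code: the names Hit and findAll are made up for this example, and std::string stands in for Dna5String.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper type: one hit, given as the sequence it lies in
// and the half-open range [begin, end) inside that sequence.
struct Hit
{
    std::size_t seqNo;
    std::size_t begin;
    std::size_t end;
};

// Hypothetical helper: scan every sequence in turn (no index is built)
// and collect all occurrences of needle, including overlapping ones.
std::vector<Hit> findAll(std::vector<std::string> const & haystacks,
                         std::string const & needle)
{
    assert(!needle.empty());
    std::vector<Hit> hits;
    for (std::size_t i = 0; i < haystacks.size(); ++i)
    {
        std::size_t pos = haystacks[i].find(needle);
        while (pos != std::string::npos)
        {
            hits.push_back(Hit{i, pos, pos + needle.size()});
            // Restart one character later so overlapping hits are kept.
            pos = haystacks[i].find(needle, pos + 1);
        }
    }
    return hits;
}
```

Recording the sequence number next to the range is what lets the caller tell hits in different sequences apart, which the per-sequence output above only does implicitly through print order.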

EDIT: I hope this helps.

....
#include <seqan/index.h>
....

Pattern<Dna5String> pattern(Dna5String("GAATTC"));
Index< StringSet<Dna5String > > myIndex(seqs);
Finder< Index<StringSet<Dna5String > > > finder(myIndex);
while (find(finder, pattern))
{
    std::cout << '[' << beginPosition(finder) << ',' << endPosition(finder)
              << ")\t" << infix(finder) << std::endl;
}
....

You get:

[< 0 , 247 >,< 0 , 253 >)   GAATTC
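
Each position in that output is a pair of (sequence number, offset within that sequence), so < 0 , 247 > means offset 247 in sequence 0. One way to picture such pairs is as a position in the concatenation of all sequences, translated back through a table of sequence start offsets. The sketch below shows that translation with the standard library only; buildLimits and toLocal are hypothetical names for this example, and nothing here is meant to describe how SeqAn stores positions internally.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: limits[i] is the offset at which sequence i starts
// in the concatenation; the final entry is the total length.
std::vector<std::size_t> buildLimits(std::vector<std::string> const & seqs)
{
    std::vector<std::size_t> limits(1, 0);
    for (std::string const & s : seqs)
        limits.push_back(limits.back() + s.size());
    return limits;
}

// Hypothetical helper: convert a position in the concatenation into
// a (sequence number, offset within that sequence) pair.
std::pair<std::size_t, std::size_t>
toLocal(std::vector<std::size_t> const & limits, std::size_t globalPos)
{
    assert(limits.size() >= 2 && globalPos < limits.back());
    // Find the first boundary strictly greater than globalPos;
    // the position belongs to the sequence just before that boundary.
    auto it = std::upper_bound(limits.begin(), limits.end(), globalPos);
    std::size_t seqNo = static_cast<std::size_t>(it - limits.begin()) - 1;
    return std::make_pair(seqNo, globalPos - limits[seqNo]);
}
```

With sequences of lengths 4, 6 and 2, the table is {0, 4, 10, 12}, so concatenated position 9 maps to sequence 1, offset 5.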

Regarding "c++ - Online pattern search through a StringSet", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/32957614/
