
python - Reading multiple BLAST files (Biopython)

Reposted. Author: 行者123. Updated: 2023-11-30 23:37:34

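I'm trying to read a list of XML files generated by submitting multiple sequences to the NCBI BLAST website. From each file I want to print certain lines of information. The files I want to read all have the suffix "_recombination.xml".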

# Python 2 syntax, as in the original script
import glob
from Bio.Blast import NCBIXML

for file in glob.glob("*_recombination.xml"):
    result_handle = open(file)
    blast_record = NCBIXML.read(result_handle)
    for alignment in blast_record.alignments:
        for hsp in alignment.hsps:
            print "*****Alignment****"
            print "sequence:", alignment.title
            print "length:", alignment.length
            print "e-value:", hsp.expect
            print hsp.query
            print hsp.match
            print hsp.sbjct

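The script first finds all files with the "_recombination.xml" suffix. I then want it to read each file and print certain lines (this is almost a direct copy of the Biopython Cookbook), which it seems to do. But I get the following error: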

Traceback (most recent call last):
  File "Scripts/blast_test.py", line 202, in <module>
    blast_record=NCBIXML.read(result_handle)
  File "/Library/Python/2.7/site-packages/Bio/Blast/NCBIXML.py", line 576, in read
    first = iterator.next()
  File "/Library/Python/2.7/site-packages/Bio/Blast/NCBIXML.py", line 643, in parse
    expat_parser.Parse("", True) # End of XML record
xml.parsers.expat.ExpatError: no element found: line 3106, column 7594

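I'm not quite sure what the problem is. I don't know whether it is trying to loop over files it has already read. For example, closing the file seems to help: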

for file in glob.glob("*_recombination.xml"):
    result_handle = open(file)
    blast_record = NCBIXML.read(result_handle)
    for alignment in blast_record.alignments:
        for hsp in alignment.hsps:
            print "*****Alignment****"
            print "sequence:", alignment.title
            print "length:", alignment.length
            print "e-value:", hsp.expect
            print hsp.query
            print hsp.match
            print hsp.sbjct
    result_handle.close()
    blast_record.close()

But that gives me a different error:

Traceback (most recent call last):
  File "Scripts/blast_test.py", line 213, in <module>
    blast_record.close()
AttributeError: 'Blast' object has no attribute 'close'

Best answer

I usually use the parse method rather than the read method; maybe it will help you:

for blast_record in NCBIXML.parse(open(input_xml)):
    for alignment in blast_record.alignments:
        for hsp in alignment.hsps:
            print "*****Alignment****"
            print "sequence:", alignment.title
            print "length:", alignment.length
            print "e-value:", hsp.expect
            print hsp.query
            print hsp.match
            print hsp.sbjct
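
For Python 3, the same loop over every matching file might look like the sketch below. It assumes Biopython's `NCBIXML` module is installed. `iter_hits` and `print_hits` are hypothetical helper names, and the optional `parse` argument is an addition for illustration only, so that a different parser can be swapped in. The `with` block closes each file handle, which is the only object that needs closing: the record returned by the parser has no `close()` method, as the second traceback in the question shows.

```python
import glob


def iter_hits(pattern, parse=None):
    """Yield (filename, alignment, hsp) for every HSP in every matching file."""
    if parse is None:
        # Imported lazily: Biopython is only needed when no parser is passed in.
        from Bio.Blast import NCBIXML
        parse = NCBIXML.parse
    for path in sorted(glob.glob(pattern)):
        # The with-block closes the handle once the file has been consumed.
        with open(path) as handle:
            # parse() yields one record per query, so files holding
            # several BLAST results are handled as well.
            for record in parse(handle):
                for alignment in record.alignments:
                    for hsp in alignment.hsps:
                        yield path, alignment, hsp


def print_hits(pattern, parse=None):
    """Print the same fields as the loop in the question, one block per HSP."""
    for path, alignment, hsp in iter_hits(pattern, parse):
        print("*****Alignment****")
        print("file:", path)
        print("sequence:", alignment.title)
        print("length:", alignment.length)
        print("e-value:", hsp.expect)
        print(hsp.query)
        print(hsp.match)
        print(hsp.sbjct)
```

Calling `print_hits("*_recombination.xml")` from the directory that holds the files prints one block per HSP.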

Also make sure your XML was generated with -outfmt 5 in your BLAST query.
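
Separately from the output format, the "no element found" message in the first traceback is what expat reports when the input ends before the XML document is complete, so one of the files may be empty or truncated. That is an assumption; the thread does not confirm it. A minimal sketch using only the standard library to list which files fail to parse; `find_broken_xml` is a hypothetical helper name:

```python
import glob
import xml.etree.ElementTree as ET


def find_broken_xml(pattern):
    """Return {filename: error message} for files that are not well-formed XML."""
    broken = {}
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path, "rb") as handle:
                # iterparse streams the file; clearing each element as it
                # completes keeps memory use low for large BLAST outputs.
                for _event, element in ET.iterparse(handle):
                    element.clear()
        except ET.ParseError as err:
            broken[path] = str(err)
    return broken
```

A file listed here is simply not a single well-formed XML document. This check looks only at XML syntax, so a file that passes may still not be valid BLAST output.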

Regarding python - Reading multiple BLAST files (Biopython), a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/15406046/
