
python - Copying fastq information into a new fastq file

Reposted. Author: 太空宇宙. Updated: 2023-11-03 18:17:58

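I am trying to write a script that opens a fasta file and, for every read found in it, extracts the read name (title), sequence (seq), and quality scores (qual) from a separate fastq file, then writes that fastq information to a new fastq file. However, I am stuck on how to write the last part (the line I am having trouble with is marked with a comment in the code). Does anyone know how to write this part, or where I can find information on how to do it in python?

So far I have: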

from sys import argv
from Bio.SeqIO.QualityIO import FastqGeneralIterator

script, merged_seqs, raw_seqs = argv
merged_from_raw = "merged_only.fastq"

# Collect the read names from the fasta header lines
merged_names = set()
for line in open(merged_seqs):
    if line[0] == ">":
        read_name = line.split()[0][1:]
        merged_names.add(read_name)

raw_fastq = raw_seqs

temp_handle = open(merged_from_raw, "w")
for title, seq, qual in FastqGeneralIterator(open(raw_fastq)):
    if title in merged_names:
        temp_handle.write()  # this is where I don't know how to write what I need in python

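Best answer

Unless you have a specific reason to implement the file parsing yourself, it is better to let the SeqIO parsers handle both the input and the output files. Perhaps something like the following (warning: I have never used Bio before, and I have not tested this code):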

from sys import argv
from Bio import SeqIO

output_filename = 'merged_only.fastq'
merged_seqs, raw_seqs = argv[1:3]  # two arguments: the fasta file, then the raw fastq file

# Get fasta iterator, and read source fastq file into a dict-like object
merged_names = SeqIO.parse(merged_seqs, 'fasta')
source_seqs = SeqIO.index(raw_seqs, 'fastq')

filtered_seqs = (source_seqs[record.id] for record in merged_names if record.id in source_seqs)
SeqIO.write(filtered_seqs, output_filename, 'fastq')
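
If you would rather keep the FastqGeneralIterator loop from the question, the missing write step could be sketched roughly as below. This is only a sketch under a few assumptions: the helper names read_fasta_names and filter_fastq are made up for illustration, each record is written in the plain four-line form with a bare "+" separator, and the read name is taken to be the first whitespace-separated word of the title. That last point matters because FastqGeneralIterator yields the whole title line, so comparing the full title against the fasta names can miss reads whose header carries a description.

```python
import sys


def read_fasta_names(fasta_handle):
    """Collect the first word of every '>' header line (hypothetical helper)."""
    names = set()
    for line in fasta_handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def filter_fastq(records, wanted_names, out_handle):
    """Write the (title, seq, qual) records whose read name is wanted.

    Returns the number of records written (hypothetical helper).
    """
    written = 0
    for title, seq, qual in records:
        # The title may carry a description after the read name,
        # so compare only its first whitespace-separated word.
        fields = title.split(None, 1)
        if fields and fields[0] in wanted_names:
            # Four-line FASTQ record: @title, sequence, +, quality string.
            out_handle.write("@%s\n%s\n+\n%s\n" % (title, seq, qual))
            written += 1
    return written


def main(argv):
    # Imported here so the helpers above can be used without Biopython.
    from Bio.SeqIO.QualityIO import FastqGeneralIterator

    merged_seqs, raw_seqs = argv[1:3]
    with open(merged_seqs) as fasta_handle:
        merged_names = read_fasta_names(fasta_handle)
    with open(raw_seqs) as raw_handle, open("merged_only.fastq", "w") as out_handle:
        filter_fastq(FastqGeneralIterator(raw_handle), merged_names, out_handle)


# Only run as a script when both file arguments are given.
if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv)
```

Compared with SeqIO.index, this streams the fastq file once and never builds record objects, which may be preferable for very large files. The output follows the order of the raw fastq file rather than the order of the fasta file.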

Regarding python - Copying fastq information into a new fastq file, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/24688885/
