python - How can I get this output from a FASTA file without using Biopython?

Reposted. Author: 太空宇宙. Updated: 2023-11-03 21:24:36

I need to produce the output shown below from a FASTA file, but without using BioPython. Does anyone have any ideas?

Here is the code that uses BioPython:

from Bio import SeqIO
records = SeqIO.parse("data/assembledSeqs.fa", "fasta")
for i, seq_record in enumerate(records):
    print("Sequence %d:" % i)
    print("Number of A's: %d" % seq_record.seq.count("A"))
    print("Number of C's: %d" % seq_record.seq.count("C"))
    print("Number of G's: %d" % seq_record.seq.count("G"))
    print("Number of T's: %d" % seq_record.seq.count("T"))
    print()

The FASTA file looks like this:

>chr12_9180206_+:chr12_118582391_+:a1;2 total_counts: 115 Seed: 4 K:    20 length: 79
TTGGTTTCGTGGTTTTGCAAAGTATTGGCCTCCACCGCTATGTCTGGCTGGTTTACGAGC
AGGACAGGCCGCTAAAGTG
>chr12_9180206_+:chr12_118582391_+:a2;2 total_counts: 135 Seed: 4 K: 20 length: 80
CTAACCCCCTACTTCCCAGACAGCTGCTCGTACAGTTTGGGCACATAGTCATCCCACTCG
GCCTGGTAACACGTGCCAGC
>chr1_8969882_-:chr1_568670_-:a1;113 total_counts: 7600 Seed: 225 K: 20 length: 86
CACTCATGAGCTGTCCCCACATTAGGCTTAAAAACAGATGCAATTCCCGGACGTCTAAAC
CAAACCACTTTCACCGCCACACGACC
>chr1_8969882_-:chr1_568670_-:a2;69 total_counts: 6987 Seed: 197 K: 20 length: 120
TGAACCTACGACTACACCGACTACGGCGGACTAATCTTCAACTCCTACATACTTCCCCCA
TTATTCCTAGAACCAGGCGACCTGCGACTCCTTGACGTTGACAATCGAGTAGTACTCCCG

I need to get the following output:

Sequence 0:
Number of A's: 14
Number of C's: 17
Number of G's: 24
Number of T's: 24

Sequence 1:
Number of A's: 17
Number of C's: 30
Number of G's: 16
Number of T's: 17

Sequence 2:
Number of A's: 27
Number of C's: 31
Number of G's: 12
Number of T's: 16

Sequence 3:
Number of A's: 31
Number of C's: 41
Number of G's: 20
Number of T's: 28

I have tried the following, but I cannot get the same output.

def count_bases (fasta_file_name):
    with open(fasta_file_name) as file_content:
        for seqs in file_content:
            if seqs.startswith('>'):
                for i, seq in enumerate('>'):
                    print("Sequence %d:" % i)
            else:
                print("Number of A's: %d" % seqs.count("A"))
                print("Number of C's: %d" % seqs.count("C"))
                print("Number of G's: %d" % seqs.count("G"))
                print("Number of T's: %d" % seqs.count("T"))
                print()
    return bases

result = count_bases('data/assembledSeqs.fa')

Best answer

This code works:

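def count_bases(fasta_file_name):
    sequence = ''

    def count():
        # Print the base counts for the record collected so far, if any.
        if len(sequence):
            print("Number of A's: %d" % sequence.count("A"))
            print("Number of C's: %d" % sequence.count("C"))
            print("Number of G's: %d" % sequence.count("G"))
            print("Number of T's: %d" % sequence.count("T"))
            print()

    with open(fasta_file_name) as file_content:
        i = 0
        for seqs in file_content:
            if seqs.startswith('>'):
                # New header: report the previous record, then start a new one.
                count()
                print("Sequence %d:" % i)
                i = i + 1
                sequence = ''
            else:
                # A sequence can span several lines, so join them.
                sequence = sequence + seqs.strip()
    # Report the last record, which has no header after it.
    count()

# The function prints its report and returns nothing.
count_bases('data/assembledSeqs.fa')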
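For comparison, the attempt in the question fails for three reasons: it counts bases per line rather than per record, `enumerate('>')` iterates over a one-character string so `i` is always 0, and `return bases` refers to a name that is never defined. The following is a minimal alternative sketch, not part of the original answer, that separates parsing from reporting. The names `read_fasta`, `base_counts`, and `print_report` are hypothetical helpers chosen for illustration, and the sketch assumes uppercase sequences, since the counting is case-sensitive like the original.

```python
from collections import Counter


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)  # last record has no following header


def base_counts(lines, bases="ACGT"):
    """Return one {base: count} dict per record, in file order."""
    result = []
    for _header, sequence in read_fasta(lines):
        counts = Counter(sequence)
        result.append({base: counts[base] for base in bases})
    return result


def print_report(lines):
    """Print the counts in the same layout as the BioPython version."""
    for i, counts in enumerate(base_counts(lines)):
        print("Sequence %d:" % i)
        for base, n in counts.items():
            print("Number of %s's: %d" % (base, n))
        print()
```

To run it on the file from the question, open the file and pass the handle, for example `with open("data/assembledSeqs.fa") as handle: print_report(handle)`. Because the helpers accept any iterable of lines, they can also be exercised with a plain list of strings.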

Regarding "python - How can I get this output from a FASTA file without using Biopython?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/53942135/
