
bioinformatics - How do I get the taxonomy-specific IDs for kingdom, phylum, class, order, family, genus and species from a taxid?

Reposted. Author: 行者123. Updated: 2023-12-04 06:56:44

I have a list of taxids that looks like this:

1204725
2162
1300163
420247

From the taxids above, I would like to get a file with the taxonomic IDs in this order:
kingdom_id      phylum_id       class_id        order_id        family_id       genus_id        species_id   

I am using the package "ete3". The tool I use is ete-ncbiquery, which reports the lineage for the IDs above. (I run it from my Linux laptop with the following command.)
ete3 ncbiquery --search 1204725 2162 13000163 420247 --info 

The result looks like this:
# Taxid Sci.Name    Rank    Named Lineage   Taxid Lineage
2162 Methanobacterium formicicum species root,cellular organisms,Archaea,Euryarchaeota,Methanobacteria,Methanobacteriales,Methanobacteriaceae,Methanobacterium,Methanobacterium formicicum 1,131567,2157,28890,183925,2158,2159,2160,2162
1204725 Methanobacterium formicicum DSM 3637 no rank root,cellular organisms,Archaea,Euryarchaeota,Methanobacteria,Methanobacteriales,Methanobacteriaceae,Methanobacterium,Methanobacterium formicicum,Methanobacterium formicicum DSM 3637 1,131567,2157,28890,183925,2158,2159,2160,2162,1204725
420247 Methanobrevibacter smithii ATCC 35061 no rank root,cellular organisms,Archaea,Euryarchaeota,Methanobacteria,Methanobacteriales,Methanobacteriaceae,Methanobrevibacter,Methanobrevibacter smithii,Methanobrevibacter smithii ATCC 35061 1,131567,2157,28890,183925,2158,2159,2172,2173,420247

I don't know which items (IDs) correspond to what I am looking for, if any do.
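
One way to see which ID belongs to which name: in the output above, the Named Lineage and Taxid Lineage columns appear to list the same nodes in the same order, so they can be paired position by position. Below is a minimal sketch in plain Python, using the 2162 row copied from that output. pair_lineage is a hypothetical helper, not part of ete3, and it assumes that no scientific name contains a comma.

```python
# Lineage columns copied from the 2162 row of the ncbiquery output above.
named_lineage = ("root,cellular organisms,Archaea,Euryarchaeota,"
                 "Methanobacteria,Methanobacteriales,Methanobacteriaceae,"
                 "Methanobacterium,Methanobacterium formicicum")
taxid_lineage = "1,131567,2157,28890,183925,2158,2159,2160,2162"


def pair_lineage(names, taxids):
    """Pair each scientific name with the taxid at the same position.

    Hypothetical helper; assumes names never contain a comma.
    """
    name_list = names.split(",")
    id_list = [int(t) for t in taxids.split(",")]
    if len(name_list) != len(id_list):
        raise ValueError("lineage columns differ in length")
    return list(zip(name_list, id_list))


pairs = pair_lineage(named_lineage, taxid_lineage)
for name, taxid in pairs:
    print("{}\t{}".format(taxid, name))
```

This shows which name goes with which ID, but not the rank of each node. To pick out the phylum, class and so on, the rank of every taxid in the lineage is still needed.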

Best answer

The following code:

import csv
from ete3 import NCBITaxa

# Handle to ete3's local NCBI taxonomy database
ncbi = NCBITaxa()

def get_desired_ranks(taxid, desired_ranks):
    # Taxids of every node on the path from the root down to this taxid
    lineage = ncbi.get_lineage(taxid)
    # Maps each taxid in the lineage to its rank
    lineage2ranks = ncbi.get_rank(lineage)
    # Invert the mapping so a taxid can be looked up by rank
    ranks2lineage = dict((rank, taxid) for (taxid, rank) in lineage2ranks.items())
    # Ranks missing from the lineage get a placeholder value
    return {'{}_id'.format(rank): ranks2lineage.get(rank, '<not present>') for rank in desired_ranks}

def main(taxids, desired_ranks, path):
    with open(path, 'w') as csvfile:
        fieldnames = ['{}_id'.format(rank) for rank in desired_ranks]
        # Tab-delimited output with one column per requested rank
        writer = csv.DictWriter(csvfile, delimiter='\t', fieldnames=fieldnames)
        writer.writeheader()
        for taxid in taxids:
            writer.writerow(get_desired_ranks(taxid, desired_ranks))

if __name__ == '__main__':
    taxids = [1204725, 2162, 1300163, 420247]
    desired_ranks = ['kingdom', 'phylum', 'class', 'order', 'family', 'genus', 'species']
    path = 'taxids.csv'
    main(taxids, desired_ranks, path)

produces a file that looks like this:
kingdom_id  phylum_id   class_id    order_id    family_id   genus_id    species_id
<not present> 28890 183925 2158 2159 2160 2162
<not present> 28890 183925 2158 2159 2160 2162
<not present> 28890 183925 2158 2159 2160 2162
<not present> 28890 183925 2158 2159 2172 2173
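
The kingdom_id column is filled with the placeholder because no node in these lineages carries the rank 'kingdom', so the rank lookup falls back to its default. The sketch below reproduces that step without ete3. The hand-written dict is a stand-in for what ncbi.get_rank returns for taxid 1204725 and contains only the ranks visible in the outputs above; the entries for 1, 131567 and 2157 are left out because the source does not show their ranks.

```python
# Stand-in for ncbi.get_rank(lineage) for taxid 1204725 (assumption: only the
# ranks shown in the question and answer outputs are included here).
lineage2ranks = {
    28890: 'phylum',
    183925: 'class',
    2158: 'order',
    2159: 'family',
    2160: 'genus',
    2162: 'species',
    1204725: 'no rank',
}
desired_ranks = ['kingdom', 'phylum', 'class', 'order', 'family', 'genus', 'species']

# Invert taxid -> rank into rank -> taxid, as the answer's code does.
ranks2lineage = {rank: taxid for taxid, rank in lineage2ranks.items()}

# dict.get returns the placeholder for any rank absent from the lineage.
row = {'{}_id'.format(rank): ranks2lineage.get(rank, '<not present>')
       for rank in desired_ranks}

print('\t'.join(str(row['{}_id'.format(rank)]) for rank in desired_ranks))
```

Note that inverting the dict keeps only one taxid per rank. That is harmless for the seven requested ranks, but several nodes in a lineage can share the label 'no rank', so that key is not a reliable lookup.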

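Regarding "bioinformatics - How do I get the taxonomy-specific IDs for kingdom, phylum, class, order, family, genus and species from a taxid?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/36503042/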
