gpt4 book ai didi

python - pandas read_csv 函数读取一列作为核苷酸序列的 NaN

转载 作者:太空宇宙 更新时间:2023-11-04 07:51:30 25 4
gpt4 key购买 nike

我正在尝试读取包含 ID 和序列的核苷酸序列文件。默认情况下,序列在 70 位核苷酸序列后用新行分隔。

输入文件 (seq.txt) 如下所示。

seqgb_AY741213_Organism_Influenza_A_virus__A_blackbird_Hunan_1_2004_H5N1___Strain_Name_A_blackbird_Hunan_1_2004_Segment_4_Subtype_H5N1_Host_Blackbird,
ATGGAGAAAATAGTGCTTCTTCTTGCAATAGTCAGTCTTGTTAAAAGTGATCAGATTTGCATTGGTTACC
ATGCAAACAACTCGACAGAGCAGGTTGACACAATAATGGAAAAGAACGTTACTGTTACACATGCTCAAGA
CGTACTGGACAAGACACACAACGGGAACACTCAGTTTGAGGCCGTTGGAAGGGAATTTAATAACTTAGAA
AGGAGAATAGAAAATTTAAACAAGAAGATGGAGGACGGATTCCTAGATGTCTGGACTTATAATGCTGAAC
TTCTGGTTCTCATGGAAAATGAGAGAACTCTAGACTTTCATGACTCAAATGTCAAGAACCTTTACGAAAA
GGTCCGACTACAACTTAGGGATAATGCAAAGGAGCTGGGTAACGGTTGTTTCGAGTTCTATCACAAATGT
GATAATGAATGTATGGAAAGTGTAAGAAACGGAACGTATGACTACCCGCAGTATTCAGAAGAAGCAAGAC
TAAACAGAGAGGAAATAAGTGGAGTAAAATTGGAATCAATAGGAACTTACCAAATACTGTCAATTTATTC
AACAGTGGCGAGTTCCCTAGCACTGGCAATCATGGTAGCTGGTCTATCTTTATGGATGTGCTCCAATGGA
TCGTTACAATGCAGAATTTGCATTTGA


seqgb_EU676325_Organism_Influenza_A_virus__A_brown-head_gull_Thailand_vsmu-4_2008_H5N1___Strain_Name_A_brown-head_gull_Thailand_vsmu-4_2008_Segment_4_Subtype_H5N1_Host_Brown-Headed_Gull,
TTTAGCAAAAGGCAGGGGTATATCTGTCAAAATGGAGAAAATAGTGCTTCTTTTTGCAATAGTCAGTCTT
GTTAAAAGTGATCAGATTTGCATTGGTTACCATGCAAACAACTCGACAGAGCAGGTTGACACAATAATGG
AAAAGAACGTTACTGTTACACATGCCCAAGACATACTGGAAAAGACACACAACGGGAAGCTCTGCGATCT
AGATGGAGTGAAGCCTCTAATTTTGAGAGATTGTAGTGTAGCTGGATGGCTCCTCGGAAACCCAATGTGT
GACGAATCTCCAATGGGGGCGATAAACTCTAGTATGCCATTCCACAATATACACCCTCTCACCATCGGGG
AATGCCCCAAATATGTGAAATCAAACAGATTAGTCCTTGCGACTGGGCTCAGAAATAGCCCTCAAAGAGA
GAGAAGAAGAAAAAAGAGAGGATTATTTGGAGCTATAGCAGGTTTTATAGAGGGAGGATGGCAGGGAATG
GTAGATGGTTGGTATGGGTACCACCATGAACTTCTGGTTCTCATGGAAAATGAGAGAACTCTAGACTTTC
ATGACTCAAATGTCAAGAACCTTTACGACAAGGTCCGACTACAGCTTAGGGATAATGCAAAGGAGCTGGG
TAACGGTTGTTTCGAGTTCTATCATAAATGTGATAATGAATGTATGGAAAGTGTAAGAAACGGAACGTAT
GACTACCCACAGTATTCAGAAGAAGCAAGACTAAAAAGAGAGGAAATAAGTGGAGTAAAATTGGAATCAA
TAGGAATTTACCAAATACTGTCAATTTATTCTACAGTGGCGAGTTCCCTAGCACTGGCAATCATGGTAGC
TGGTCTATCCTTATGGATGTGCTCCAATGGGTCGTTACAATGCAGAATTTGCATTTAAATTTGTGAGTTC
AGATTGAG


seqgb_EF178528_Organism_Influenza_A_virus__A_brown-headed_gull_Thailand_VSMU-28-SPK_2005_H5N1___Strain_Name_A_brown-headed_gull_Thailand_VSMU-28-SPK_2005_Segment_4_Subtype_H5N1_Host_Brown-Headed_Gull,
AGCAAAAGCAGGGGTATAATCTGTCAAAATGGAGAAAATAGTGCTTCTTTTTGCAATAGTCAGTCTTGTT
AAAAGTGATCAGATTTGCATTGGTTACCATGCAAACAACTCGACAGAGCAGGTTGACACAATAATGGAAA
AGAACGTTACGAATGATGCAATCAACTTCGAGAGTAATGGAAATTTCATTGCTCCAGAGTATGCATACAA
AATTGTCAAGAAAGGGGACTCAACAATTATGAAAAGTGAATTGGAATATGGTAACTGCAACACCAAGTGT
CAAACTCCAATGGGGGCGATAAACTCAAGGTCAACTCGATCATTGACAAAATGAACACTCAGTTTGAGGC
CGTTGGAAGGGAATTTAACAACTTAGAAAGGAGAATAGAGAATTTAAACAAGAAGATGGAAGACGGGTTC
CTAGATGTCTGGACTTATAATGCTGAACTTCTGGTTCTCCTGGAAAATGAGAGAACTCTAGACTTTCATG
ACTCAAATGTCAAGAACCTTTACGACAAGGTCCGACTACAGCTTAGGGATAATGCAAAGGAGCTGGGTAA
CGGTTGTTTCGAGTTCTATCATAAATGTGATAATGAATGTATGGAAAGTGTAAGAAACGGAACGTATGAC
TACCCACAGTATTCAGAAGAAGCAAGACTAAAAAGAGAGGAAATAAGTGGAGTAAAATTGGAATCAATAG
GAATTTACCAAATACTGTCAATTTATTCTACAGTGGCGAGTTCCCTAGCACTGGCAATCATGGTAGCTGG
TCTATCCTTATGGATGTGCTCCAATGGGTCGTTACAATGCAGAATTTGCATTTAAATTTGTGAGTTCAGA
T


seqgb_CY091790_Organism_Influenza_A_virus__A_chicken_Ampenan_BBVD-282_2007_H5N1___Strain_Name_A_chicken_Ampenan_BBVD-282_2007_Segment_4_Subtype_H5N1_Host_Chicken,
TCAATCTGTCAAAATGGAGAAAATAGTGCTTCTTCTTGCAATAGTCAGTCTTGTTAAAAGTGATCAGATT
TGCATTGGTTACCATGCAAACAATTCAACAGAGCAGGTTGACACAATAATGGAAAAGAACGTTACTGTTA
CACATGCCCAAGACATACTGGAAAAGGGAAAATGAGAGAACTCTAGACTTTCATGACTCAAATGTTAAGA
ACCTCTACGACAAGGTCCGACTACAGCTTAGGGATAATGCAAAGGAGCTGGGTAACGGTTGTTTCGAGTT
CTATCACAAATGTGATAATGAATGTATGGAAAGTATAAGAAACGGAACGTATAACTACCCGCAGTATTCA
GAAGAAGCAAGATTAAAAAGAGAGGAAATAAGTGGAGTAAAATTGGAATCAATAGGAACTTACCAAATAC
TGTCGATTTATTCAACAGTGGCGAGTTCCCTAGCACTGGCAATCATGATGGCTGGTCTATCTTTATGGAT
GTGCTCCAATGGATCGTTACAATGCAGAATTTGCATTTAAATTTGTGAGTTCAGATTGTAGTTAAA


seqgb_KT216634_Organism_Influenza_A_virus__A_chicken_Anhui_MG08_2008_H9N2___Strain_Name_A_chicken_Anhui_MG08_2008_Segment_4_Subtype_H9N2_Host_Chicken,
AGCAAAAGCAGGGGAATTTCACAACCACTCAAGATGGAGACAGTATCACTAATAAATATACTACTAGTAG
TAACAGTAAGCAATGCAGATAAAATCTGCATCGGCTATCAATCAACAAATTCCACAGAAACTGTAGACAC
ACTAACAGAAAACAATGTCCCTGTGATTGTAATTGCAATGGGGTTTGCTGCCTTCTTGTTCTGGGCCATG
TCCAATGGGTCTTGCAGATGCAACATTTGTATATAATTGGCAAAAACACCCTTGTTTCTACT


seqgb_KY005855_Organism_Influenza_A_virus__A_chicken_Anhui_MZ33_2016_H5N6___Strain_Name_A_chicken_Anhui_MZ33_2016_Segment_4_Subtype_H5N6_Host_Chicken,
ATGGAGAAAATAGTGCTTCTTCTTGCAGTGGTTAGCCTTGTTAAAGGTGATCAGATTTGCATTGGTTACC
ATGCAAACAACTCGACTGAGCAGGTTGACACGATAATGGAAAAAAACGTCACTGTTACACATGCTCAAGA
CATACTAGAAAGGAATATGGCAATTGCAACACCAAATGTCAAACTCCAATAGGGGCGATAAACTCTAGTA
TGCCATTCCACAATATACACCCTCTCACTATCGGGGAGTGCCCCAAATATGTGAAATCAAACAAATTAGT
CCTTGCGACTGGGCTCAGAAATAGTCGAATCCACCCAAAAGGCAATAGATGGAGTTACCAATAAGGTCAA
CTCGATAATTGACAAAATGAACACTCAGACGGATTCCTAGATGTCTGGACTTATAATGCTGAACTTTTAG
TTCTCATGGAAAATGAGAGAACTCTAGATTTCCATGACTCAAATGTCAAGAACCTTTATGACAAAGTCCG
ACTACAGCTTAGGGATAATGCAAAGGAGCTGGGTAATGGTTGTTTCGAGTTCTATCACAAATGTGATAAT
GAATGTATGGAAAGTGTGAGGAATGGGACGTATGACTACCCCCAGTATTCAGAAGAAGCAAGATTAAAAA
GGGAAGAAATAAGCGGAGTGAAATTGGAATCAATAGGAACTTACCAAATACTGTCAATTTATTCAACAGT
GGCGGGTTCCCTAGCACTGGCAATCATTGTGGCTGGTCTATCTTTATGGATGTGCTCCAATGGGTCGTTA
CAATGCAGAATTTGCATTTAA


seqgb_KY005863_Organism_Influenza_A_virus__A_chicken_Anhui_MZ34_2016_H5N6___Strain_Name_A_chicken_Anhui_MZ34_2016_Segment_4_Subtype_H5N6_Host_Chicken,
ATGGAGAAAAGAAGAACGATGCATACCCAACAATAAAAATGAGCTACAATAACACCAATAGGGAAGATCT
TTTGATACTGTGGGGGATTCATCATTCCAATAATGCAGAAGAGCAGACAAATCTCTATAAAAACCCAACC
ACCTATGTTTCCGTTGGGACATCAACATTAAACCAGAGAGTGGTGCCAAAAATAGCTACTAGATCCCAAG
TAAACGGGCAAAGTGGAAGAATGGATTTCTTCTGGACAATTTTAAAACCGGATGATGCAATCCACTTCGA
GAGTAATGGAAATTTTATTGCTCCAGACTATCGGGGAGTGCCCCAAATATGTGAAATCAAACAAATTAGT
CCTTGCGACTGGGCTCAGAAATAGTCCTCTAAGAGAAAGAAGAAGAAAAAGAGGATTATTTGGAGCCATA
GCAGGGTTTATAGAGGGAGGATGGCAAGGAATGGTAGATGGTTGGTATGGGTACCACCATAGCAATGCAC
AAGGGAGTGGGTATGCTGCAGACAGAGAATCCACCCAAAAGGCAATAGATGGAGTTACCAATAAGGTCAA
CTCGATAATTGACAAAATGAACACTCAATTTGAGGCCGTTGGAAGGGAATTTAATAACTTAGAACGGAGA
ATAGAGAATTTAAATAAGAAAATGGAAGACGGATTCCTAGATGTCTGGACTTATAATGCTGAACTTTTAG
TTCTCATGGAAAATGAGAGAACTCTAGATTTCCATGACTCAAATGTCAAGAACCTTTATGACAAAGTCCG
ACTACAGCTTAGGGATAATGCAAAGGAGCTGGGTAATGGTTGTTTCGAGTTCTATCACAAATGTGATAAT
GAATGTATGGAAAGTGTGAGGAATGGGACGTATGACTACCCCCAGTATTCAGAAGAAGCAAGATTAAAAA
GGGAAGAAATAAGCGGAGTGAAATTGGAATCAATAGGAACTTACCAAATACTGTCAATTTATTCAACAGT
GGCGGGTTCCCTAGCACTGGCAATCATTGTGGCTGGTCTATCTTTATGGATGTGCTCCAATGGGTCGTTA
CAATGCAGAATTTGCATTTAA


seqgb_CY091815_Organism_Influenza_A_virus__A_chicken_Badung_BBVD-277_2007_H5N1___Strain_Name_A_chicken_Badung_BBVD-277_2007_Segment_4_Subtype_H5N1_Host_Chicken,
TCAATCTGTCAAAATGGAGAAAATAGTGCTTCTTCTTGCAATAGTCAGTCTTGTTAAAAGTGATCAGATT
TGCATTGGTTACCATGCAAACAATTCAACAGAGCAGGTTGACACAATAATGGAAAAGAACGTTACTGTTA
CACATGCCCAAGACATACTGGAAAAGACACACAACGGGAAGCTCTGTGATCTAGATGGAGTGAAGCCTCT
AATTTTAAGAGATTGTAGTGTAGCTGGATGGCTCCTCGGGAACCCAATGTGTGATGAATTCATCAATGTA
CCGGAATGGTCTTACATAGTGGAGAACAGGGGTGAGCTCAGCATGTCCATACCTGGGAACGCCCTCCTTT
TTTAGAAATGTGGTATGGCTTATCAAAAAGAACAGTACATACCCAACAATAAAAAGAAGCTACAATAATA
CCAACCAAGAAGATCTTTTGGTACTGTGGGGGATTCACCATCCTAATGATGCGGCAGAGCAAACGAGGCT
ATATCAAAATCCAATCACCTATATTTCCGTTGGGACATCAACACTGAACCAGAGATTGGTACCAAAAATA
GCTACCAGAACAAGGTCCGACTACAGCTTAGGGATAATGCAAAGGAGCTGGGTAACGGTTGTTTCGAGTT
CTATCACAAATGTGATGATGAATGTATGGAAAGTGTAAGAAATGGGACGTATAACTACCCGCAGTATTCA
GAAGAAGCAAGATTAAAAAGAGGGGAAATAAGTGGGGTAAAATTGGAATCAATAGGAATTTACCAAATAC
TGTCAATTTATTCAACAGTAGCGAGTTCCCTAGCACTGGCAATCATGATGGCTGGTCTATCTTTATGGAT
GTGCTCCAATGGATCGTTACAATGCAGAATTTGCATTTAAATTTGTGAGTTTAGATTGTAGTTAAA


seqgb_CY091816_Organism_Influenza_A_virus__A_chicken_Badung_BBVD-288_2007_H5N1___Strain_Name_A_chicken_Badung_BBVD-288_2007_Segment_4_Subtype_H5N1_Host_Chicken,
TCAATCCGTCAAAATGGAGAAAATAGTGCTTCTTCTTGCAATAGCCAGTCTTGTTAAAGGTGATCAGATT
TGCATTGGTTACCATGCAAACAATTCAACAGAGCAGGTTGACACAATAATGGAAAAGAACGTTACTGTTA
CACATGCCCAAGACATACTGGAAAAGGCACACAACGGGAAGCTCTGTGATCTAGATGGAGTGAAGCCTCT
AATTTTAAGAGATTGTAGTGTAGCCGGATGGCTCCTCGGGAACCCAATGTGTGACGAATTCATCAATGTA
CCGGAATGGTCTTACATAGTGGAGAACAGGGGTGAGCTCAGCATGTCCATACCTGGGAACGCCCTCCTTT
TTTAGAAATGTGGTATGGCTTATCAAAAAGAACAGTACATACCCAACAATAAAAAGAAGCTACAATAATA
CCAACCAGGAAGATCTTTTGGTACTGTGGGGGATTCACCATCCTAATGATGCGGCTGAGCAAACGAAGCT
ATATCAAAATCCAACCACCTATATTTCCGTTGGGACATCAACACTAAATCAGAGATTGGTACCAAAAATA
GCTACTAGATCCAAAGTAAACGGACAAAGTGGAAGGATGGAGTTCTTCTGGACAATTTTAAAACCCAATG
ATGCAATCAACTTCGAGAGTAATGGAAATTTCATTGCTCCAGAATATGCCTACAAAATTGTCAAGAAAGG
GGACTCAGCAATTATGAAAAGTGAATTGGAATATGGCAACTGCAACACCAAATGTCAAACTCCAATGGGG
GCGATAAACTTGTGATGATGAATGTATGGAAAGTGTAAGAAATGGGACGTATAACTACCCGCAGTATTCA
GAAGAAGCAAGATTAAAAAGAGAGGAAATAAGTGGGGTAAAATTGGAATCAATAGGAATTTACCAAATAC
TGTCAATTTATTCAACAGTGGCGAGTTCCCTAGCACTGGCAATCATGATGGCTGGTCTATCTTTATGGAT
GTGCTCCAATGGATCATTACAATGCAGAATTTGCATTTAAATTTGTGAGTTTAGATTGTAGTTAAA


seqgb_CY091819_Organism_Influenza_A_virus__A_chicken_Badung_BBVD-328_2007_H5N1___Strain_Name_A_chicken_Badung_BBVD-328_2007_Segment_4_Subtype_H5N1_Host_Chicken,
TCAATCTGTCAAAATGGAGAAAATAGTGCTTCTTCTTGCAATAGCCAGTCTTGTTAAAGGTGATCAGATT
TGCATTGGTTACCATGCAAACAATTCAACAGAGCAGGTTGACACAATAATGGAAAAGAACGTTACTGTTA
CACATGCCCAAGACATACTAGAAAAGGCACACAACGGGAAGCTCTGTGATCTAGATGGAGTGAAGCCTCT
AATTTTAAGAGATTGTAGTGTAGCCGAGCAGAATAAACCATTTTGAGAAAATTCAGATCATCCCCAAAAG
TTCTTGGTCCGACCATGAAGCCTCGTCAGGGGTGAGCTCAGCATGTCCATACCTGGGAACGCCCTCCTTT
TTTAGAAATGTGGTATGGCTTATCAAAAAGAACAGTACATACCCAACAATAAAAAGAAGCTACAATAATA
CCAACCAGGAAGATCTTTTGGTACTGTGGGGGATCCACCATCCTAATGATGCGGCTGAGCAAACGAAGCT
ATATCAAAATCCAACCACCTATATTTCCGTTGGGACATCAACACTAAATCAGAGATTGGTACCAAAAATA
GCTACTAGATCCAAAGTAAACGGACAAAGTGGAAGGATGGAGTTCTTCTGGACAATTTTAAAACCCAATG
ATGCAATCAACTTCGAGAGTAATGGAAATTTCATTGCTCCAGAATATGCCTACAAAATTGTCAAGAAAGG
GGACTCAGCAATTATGAAAAGTGAATTGGAATATGGCAACTGCAACACCAAATGTCAAACTCCAATGGGG
GCGATAAACTCTAGTATGCCATTCCACAACATACACCCTCTCACCATCGGGGAATGCCCCAAATATGTGA
AATCAAACAGATTAGTCCTTGCGACTGGGCTCAGAAATAGCCCCCAAAGAGAGAGAAGAAGAAAAAAGAG
AGGACTATTTGGAGCTATAGCAGGTTTTATAGAGGGTGGATGGCAGGGAATGGTAGATGGTTGGTATGGG
TACCACCATAGCAATGAGCAAGGGAGTGGGTACGCTGCAGACAAAGAATCCACTCAAAAGGCAATAGATG
GAGTCACCAATAAGGTCAATTCGATCATTGACAAAATGAACACTCAGTTTGAGGCCGTTGGAAGGGAATT
TAATAACTTAGAAAGGAGAATAGAGACTTAGGGATAATGCAAAGGAGCTGGGTAACGGTTGTTTCGAGTT
CTATCACAAATGTGATGATGAATGTATGGAAAGTGTAAGAAATGGGACGTATAACTACCCGCAGTATTCA
GAAGAAGCAAGATTAAAAAGAGAGGAGATAAGTGGGGTAAAATTGGAATCAATAGGAATTTACCAAATAC
TGTCAATTTATTCAACAGTGGCGAGTTCCCTAGCACTGGCAATCATGATGGCTGGTCTATTTTTATGGAT
GTGCTCCAATGGATCATTACAATGCAGAATTTGCATTTAAATTTGTGAGTTTAGATTGTAGTTAAA


seqgb_CY091820_Organism_Influenza_A_virus__A_chicken_Badung_BBVD-342_2007_H5N1___Strain_Name_A_chicken_Badung_BBVD-342_2007_Segment_4_Subtype_H5N1_Host_Chicken,
TCAATCCGTCAAAATGGAGAAAATAGTGCTTCTTCTTGCAATAGCCAGTCTTGTTAAAGGTGATCAGATT
TGCATTGGTTACCATGCAAACAATTCAACAGAGCAGGTTGACACAATAATGGAAAAGAACGTTACTGTTA
CACATGCCCAAGACATACTGGAAAAGGCACACAACGGGAAGCTCTGTGATCTAGATGGGGTGAAGCCTCT
AATTTTAAGAGATTGTAGTGTAGCCGTTATAGAGGGTGGATGGCAGGGAATGGTAGATGGTTGGTATGGG
TACCACCATAGCAATGAGCAAGGGAGTGGGTACGCTGCAGACAAAGAATCCACTCAAAAGGCAATAGATG
GAGTCACCAATAAGGTCAACTCGATTATTGACAAAATGAACACTCAGTTTGAGGCCGTTGGAAGGGAATT
TAATAACTTAGAAAGGAGAATAGAGAATTTAAACAAGAAGATGGAAGACGGATTCCTAGATGTCTGGACT
TATAATGCTGAACTTCTGGTTCTCATGGAAAATGAGAGAACTTTAGACTTTCATGACTCAAATGTTAAGA
ACCTCTACGACAAAGTCCGACTACAGCTTAGGGATAATGCAAAGGAGCTGGGTAACGGTTGTTTCGAGTT
CTATCACAAATGTGATGATGAATGTATGGAAAGTGTAAGAAATGGGACGTATAACTACCCGCAGTATTCA
GAAGAAGCAAGATTAAAAAGAGAGGAAATAAGTGGGGTAAAATTGGAATCAATAGGAATTTACCAAATAC
TGTCAATTTATTCAACAGTGGCGAGTTCCCTAGCACTGGCAATCATGATGGCTGGTCTATCTTTATGGAT
GTGCTCCAATGGATCATTACAATGCAGAATTTGCATTTAAATTTGTGAGTTTAGATTGTAGTTAAA


seqgb_GQ122391_Organism_Influenza_A_virus__A_chicken_Bali_UT2091_2005_H5N1___Strain_Name_A_chicken_Bali_UT2091_2005_Segment_4_Subtype_H5N1_Host_Chicken,
ATGGAGAAAATAGTGCTTCTTCTTGCAACAGTCAGTCTTGTTAAAAGTGATCAGATTTGCATTGGTTACC
ATGCAAACAATTCAACAGAGCAGGTTGACACAATAATGGAAAAGAACGTTACTGTTACACATGCCCAAGA
CATACTGGAAAAAACACACAACGGGAATGGCAGGGAATGGTAGATGGTTGGTATGGGTACCACCATAGCA
ATGAGCAGGGGAGTGGGTACGCTGCAGACAAAGAATCCACTCAAAAGGCAATAGATGGAGTCACCAATAA
GGTCAACTCAATCATTGACAAAATGAACACTCAGTTTGAGGCCGTTGGAAGGGAATTTAATAACTTAGAA
AGGAGAATAGAGAATTTAAACAAGAAGATGGAAGACGGATTTCTAGATGTCTGGACTTATAATGCCGAAC
TTCTGGTTCTCATGGAAAATGAGAGAACTCTAGACTTTCATGACTCAAATGTTAAGAACCTCTACGACAA
GGTCCGACTACAGCTTAGGGATAATGCAAAGGAGCTGGGTAACGGTTGTTTCGAGTTCTATCACAAATGT
GATAATGAATGTATGGAAAGTATAAGAAACGGAACGTATAACTACCCGCAGTATTCAGAAGAAGCAAGAT
TAAAAAGAGAGGAAATAAGTGGAGTAAAATTGGAATCAATAGGAACTTACCAAATACTGTCAATTTATTC
AACAGTGGCGAGTTCCCTAGCACTGGCAATCATGATGGCTGGTCTATCTTTATGGATGTGCTCCAATGGA
TCGTTACAATGCAGAATTTGCATTTAA


seqgb_GQ122392_Organism_Influenza_A_virus__A_chicken_Bali_UT2092_2005_H5N1___Strain_Name_A_chicken_Bali_UT2092_2005_Segment_4_Subtype_H5N1_Host_Chicken,
ATGGAGAAAATAGTGCTTCTTCTTGCAACAGTCAGTCTTGTTAAAAGTGATCAGATTTGCATTGGTTACC
ATGCAAACAATTCAACAGAGCAGGTTGCCCTCAAAGAGAGAGAAGAAGAAAAAAGAGAGGACTATTTGGA
GCTATAGCAGGTTTTATAGAGGGAGGATGGCAGGGAATGGTAGATGGTTGGTATGGGTATCACCATAGCA
ATGAGCAGGGGAGTGGGTACGCTGCAGACAAAGAATCCACTCAAAAGGCAATAGATGGAGTCACCAATAA
GGTCAACTCAATCATTGACAAAATGAACACTCAGTTTGAGGCCGTTGGAAGGGAATTTAATAACTTAGAA
AGGAGAATAGAATGGAAAATGAGAGAACTCTAGACTTTCATGACTCAAATGTTAAGAACCTCTACGACAA
GGTCCGACTACAGCTTAGGGATAATGCAAAGGAGCTGGGTAACGGTTGTTTCGAGTTCTATCACAAATGT
GATAATGAATGTATGGAAAGTATAAGAAACGGAACGTATAACTACCCGCAGTATTCAGAAGAAGCAAGAT
TAAAAAGAGAGGAAATAAGTGGAGTAAAATTGGAATCAATAGGAACTTACCAAATACTGTCAATTTATTC
AACAGTGGCGAGTTCCCTAGCACTGGCAATCATGATGGCTGGTCTATCTTTATGGATGTGCTCCAATGGA
TCGTTACAATGCAGAATTTGCATTTAA


seqgb_DQ083551_Organism_Influenza_A_virus__A_chicken_Bangkok_Thailand_CU-3_04_H5N1___Strain_Name_A_chicken_Bangkok_Thailand_CU-3_04_Segment_4_Subtype_H5N1_Host_Chicken,
ATGGAGAAAATAGTGCTTCTTTTTGCAATAGTCAGTCTTGTTAAAAGTGATCAGATTTGCATTGGTTACC
ATGCAAACAACTCGACAGAGCAGGTTGACACAATAATGGAAAAGAACGTTACTGTTACACATGCCCAAGA
CATACTGGAAAAGACTTTCATTGCTCCAGAATATGCATACAAAATTGTCAAGAAAGGGGACTCAACAATT
ATGAAAAGTGAATTGGAATATGGTAAATGGCAGGGAATGGTAGATGGTTGGTATGGGTACCACCATAGCA
ATGAGCAGGGGAGTGGGTACGCTGCAGACAAAGAATCCACTCAAAAGGCAATAGATGGAGTCACCAATAA
GGTCAACTCGATCATTGACAAAATGAACACTCAGTTTGAGGCCGTTGGAAGGGAATTTAACAACTTAGAA
AGGAGAATAGAAGCTTAGGGATAATGCAAAGGAGCTGGGTAACGGTTGTTTCGAGTTCTATCATAAATGT
GATAATGAATGTATGGAAAGTGTAAGAAACGGAACGTATGACTACCCGCAGTATTCAGAAGAAGCAAGAC
TAAAAAGAGAGGAAATAAGTGGAGTAAAATTGGAATCAATAGGAATTTACCAAATACTGTCAATTTATTC
TACAGTGGCGAGTTCCCTAGCACTGGCAATCATGGTAGCTGGTCTATCCTTATGGATGTGCTCCAATGGG
TCGTTACAATGCAGAATTTGCATTTAAATTTG


seqgb_CY091797_Organism_Influenza_A_virus__A_chicken_Bangli_BBVD-245_2007_H5N1___Strain_Name_A_chicken_Bangli_BBVD-245_2007_Segment_4_Subtype_H5N1_Host_Chicken,
TCAATCTGTCAAAATGGAGAAAATAGTGCTTCTTCTTGCAATAGCCAGTCTTGTTAAAGGTGATCAGATT
TGCATTGGTTACCATGCAAACAATTCAACAGAGCAGGTTGACACAATAATGGAAAAGAACGTTACTGTTA
CACATGCCCAATTAGTCCTTGCGACTATTGACAAAATGAACACTCAGTTTGAGGCCGTTGGAAGGGAATT
TAATAACTTAGAAAGGAGAATAGAGAATTTAAACAAGAAGATGGAAGACGGATTCCTAGATGTCTGGACT
TATAATGCTGAACTTCTGGTTCTCATGGAAAATGAGAGAACTTTAGACTTTCATGACTCAAATGTTAAGA
ACCTCTACGACAAAGTCCGACTACAGCTTAGGGATAATGCAAAGGAGTTGGGTAACGGTTGTTTCGAGTT
CTATCACAAATGTGATGATGAATGTATGGAAAGTGTAAGAAATGGGACGTATAACTACCCGCAGTATTCA
GAAGAAGCAAGATTAAAAAGAGAGGAAATAAGTGGGGTAAAATTGGAATCAATAGGAATTTACCAAATAC
TGTCAATTTATTCAACAGTGGCGAGTTCCCTAGCACTGGCAATCATGATGGCTGGTCTATCTTTATGGAT
GTGCTCCAATGGATCATTACAATGCAGAATTTGCATTTAAATTTGTGAGTTTAGATTGTAGTTAAA


seqgb_CY091801_Organism_Influenza_A_virus__A_chicken_Bangli_BBVD-562_2007_H5N1___Strain_Name_A_chicken_Bangli_BBVD-562_2007_Segment_4_Subtype_H5N1_Host_Chicken,
TCAATCTGTCATTCGAGAGTAATGGAGGGCTCAGAAATAGCCCCCAAAGAGAGAGAAGAAGAAAAAAGAG
AGGACTATTTGGAGCTATAGCAGGTTTTATAGAGGGTGGATGGCAGGGAATGGTAGATGGTTGGTATGGG
TACCACCATAGCAATGAGCAAGGGAGTGGGTACGCTGCAGACAAAGAATCCACTCAAAAGGCAATAAATG
GAGTCACCAATAAGGTCAACTCGATCATTGACAAAATGAACACTCAGTTTGAGGCCGTTGGAAGGGAATT
TAATAACTTAGAAAGGAGAATAGAGAATTTAAACAAGAAGATGGAAGACGGATTCCTAGATGTCTGGACT
TATAATGCTGAACTTCTGGTTCTCATGGAAAATGAGAGAACTTTAGACTTTCATGACTCAAATGTTAAGA
ACCTCTACGACAAGGTCCGACTACAGCTTAGGGATAATGCAAAGGAGCTGGGTAACGGTTGTTTCGAGTT
CTATCACAAATGTGATGATGAATGTATGGAAAGTGTAAGAAATGGGACGTATAACTACCCGCAGTATTCA
GAAGAAGCAAGATTAAAAAGAGAGGAGATAAGTGGGGTAAAATTGGAATCAATAGGAATTTACCAAATAC
TGTCAATTTATTCAACAGTGGCGAGTTCCCTAGCACTGGCAATCATGATGGCTGGTCTATTTTTATGGAT
GTGCTCCAATGGATCATTACAATGCAGAATTTGCATTTAAATTTGTGAGTTTAGATTGTAGTTAAA


seqgb_CY091803_Organism_Influenza_A_virus__A_chicken_Bangli_BBVD-575_2007_H5N1___Strain_Name_A_chicken_Bangli_BBVD-575_2007_Segment_4_Subtype_H5N1_Host_Chicken,
TCAATCCGTCAGAGCTATAGCAGGTTTTATAGAGGGTGGATGGCAGGGAATGGTAGATGGTTGGTATGGG
TACCACCATAGCAATGAGCAAGGGAGTGGGTACGCTGCAGACAAAGAATCCACTCAAAAGGCAATAGATG
GAGTCACCAATAAGGTCAACTCGATCATTGACAAAATGAACACTCAGTTTGAGGCCGTTGGAAGGGAATT
TAATAACTTAGAAAGGAGAATAGAGAATTTAAACAAGAAGATGGAAGACGGATTCTTAGATGTCTGGACT
TATAATGCTGAGCTTCTGGTTCTCATGGAAAATGAGAGAACTTTAGACTTTCATGACTCAAATGTTAAGA
ACCTCTACGACAAAGTCCGACTACAGCTTAGGGATAATGCAAAGGAGCTGGGTAACGGTTGTTTCGAGTT
CTATCACAAATGTGATGATGAATGTATGGAAAGTGTAAGAAATGGGACGTATAACTACCCGCAGTATTCA
GAAGAAGCAAGATTAAAAAGAGAGGAAATAAGTGGGGTAAAATTGGAATCAATAGGAATTTACCAAATAC
TGTCAATTTATTCAACAGTGGCGAGTTCCCTAGCACTGGCAATCATGATGGCTGGTCTATCTTTATGGAT
GTGCTCCAATGGATCATTACAGTGCAGAATTTGCATTTAAATTTGTGAGTTTAGATTGTAGTTAAA


seqgb_GQ122399_Organism_Influenza_A_virus__A_chicken_Banten_UT6025_2006_H5N1___Strain_Name_A_chicken_Banten_UT6025_2006_Segment_4_Subtype_H5N1_Host_Chicken,
ATGGAGAAAATAGTGCTTCTTCTTGCAATAGTCAGTCTTGTTAAAAGTGATCAGATTTGCATTGGTTACC
ATGCAAACAATCAGGGCTCAGAAAGGATGGCAGGGAATGGTAGATGGTTGGTATGGGTACCATCATAGCA
ATGAGCAGGGGAGTGGGTACGCTGCAGACAAAGAATCCACTCAAAAGGCAATAGATGGAGTCACCAATAA
GGTCAACTCAATCATTGACAAAATGAACACTCAGTTTGAGGCCGTTGGAAGGGAATTTAATAACTTAGAA
AGGAGAATAGAGAATTTAAACAAGAAGATGGAAGACGGATTTCTAGATGTCTGGACTTATAATGCCGAAC
TTCTGGTTCTCATGGAAAATGAGAGAACTCTAGACTTTCATGACTCAAATGTTAAGAACCTCTATGACAA
GGTCCGACTACAGCTTAGGGATAATGCAAAGGAGCTGGGTAACGGTTGTTTCGAGTTCTATCACAAATGT
GATAATGGATGTATGGAAAGTATAAGAAACGGAACGTATAACTACCCGCAGTATTCAGAAGAAGCAAGAT
TAAAAAGAGAGGAAATAAGTGGAGTAAAATTGGAATCAATAGGAACTTATCAAATACTGTCAATTTATTC
AACAGTGGCGAGTTCCCTAGCACTGGCAATCATGATGGCTGGTCTATCTTTATGGATGTGTTCCAATGGA
TCGTTACAATGCAGAATTTGCATTTAA


seqgb_CY091789_Organism_Influenza_A_virus__A_chicken_Buleleng_BBVD-545b_2007_H5N1___Strain_Name_A_chicken_Buleleng_BBVD-545b_2007_Segment_4_Subtype_H5N1_Host_Chicken,
TCAATCCGTCAAAATGGAGAAAATAGTGCTTCTTCTTGCAATAGCCAGTCTTGTTAAAGGTGATCAGATT
TGCATTGGTTACCATGAAAAGTGAATTGGAATATGGCAACTGCAACACCAAATGTCAAACTCCAATGGGG
GCGATAAACTCTAGTATGCCATTCCATGGGTACGCTGCAGACAAAGAATCCACTCAAAAGGCAATAGATG
GAGTCACCAATAAGGTCAACTCGATCATTGACAAAATGAACACTCAGTTTGAGGCCGTTGGAAGGGAATT
TAATAACTTAGAAAGGAGAATAGAGAATTTAAACAAGAAGATGGAAGACGGATTCCTAGATGTCTGGACT
TATAATGCTGAACTTCTGGTTCTCATGGAAAATGAGAGAACTCTAGACTTTCATGACTCAAATGTTAAGA
ACCTCTACGACAAAGTCCGACTACAGCTTAGGGATAATGCAAAGGAGCTGGGTAACGGTTGTTTCGAGTT
CTATCACAAATGTGATGATGAATGTATGGAAAGTGTAAGAAATGGGACGTATAACTACCCGCAGTATTCA
GAAGAAGCAAGATTAAAAAGAGAGGAAATAAGTGGGGTAAAATTGGAATCAATAGGAATTTACCAAATAC
TGTCAATTTATTCAACAGTGGCGAGTTCCCTAGCACTGGCAATCATGATGGCTGGTCTATCTTTATGGAT
GTGCTCCAATGGATCATTACAATGCAGAATTTGCATTTAAATTTGTGAGTTTAGATTGTAGTTAAA


seqgb_HQ200590_Organism_Influenza_A_virus__A_chicken_Cambodia_047LC3_2005_H5N1___Strain_Name_A_chicken_Cambodia_047LC3_2005_Segment_4_Subtype_H5N1_Host_Chicken,
AGCAAAAGCAGGGGTTTAATCTGTCAAAATGGAGAAAATAGTGCTTCTTTTTGCGATAGTCAGTCTTGTT
AAAAGTGATCAGATGGGACTCAACAATTATGAAAAGTGAATTGGAATATGGTAACTGCAACACCAAGTGT
CAAACTCCAATGGGGGCGATAAACTCCAATGAGCAGGGGAGTGGGTACGCTGCAGACAAAGAATCCACTC
AAAAGGCTATAGATGGAGTCACCAATAAGGTCAACTCGATCATTGACAAAATGAACACTCAGTTTGAGGC
CGTTGGAAGGGAATTTAACAACTTAGAAAGGAGAATAGAGAATTTAAACAAGAAGATGGAAGACGGGTTC
CTAGATGTCTGGACTTATAATGCTGAACTTCTGGTTCTCATGGAAAATGAGAGAACTCTAGACTTCCATG
ACTCAAATGTCAAGAACCTTTACGACAAGGTCCGACTACAGCTTAGGGATAATGCAAAGGAGCTGGGTAA
CGGTTGTTTCGAGTTCTATCACAAATGTGATAATGAATGTATGGAAAGTGTGAGAAACGGAACGTATGAC
TACCCGCAGTATTCAGAAGAAGCAAGATTAAAAAGAGAGGAAATAAGTGGAGTAAAATTGGAATCAATAG
GAATTTACCAAATACTGTCAATTTATTCTACAGTGGCGAGTTCCCTAGCACTGGCAATCATGGTAGCTGG
TCTATCCTTATGGATGTGCTCCAATGGGTCGTTACAATGCAGAATTTGCATTTAAATTTGTGAGTTCAGA
TTGTAGTTAAAAACACCCTTGTTTCTACT


seqgb_HQ200554_Organism_Influenza_A_virus__A_chicken_Cambodia_047LC3b_2005_H5N1___Strain_Name_A_chicken_Cambodia_047LC3b_2005_Segment_4_Subtype_H5N1_Host_Chicken,
AGCAAAAGCAGGGGTTTAATCTGTCAAAATGGAGAAAATAGTGCTTCTTTTTGCGATAGTCAGTCTTGTT
AAAAGTGATCAGATTTGCATTGGTTACCATGCAAACAACTCAACAGAGCAGGTTGACACAATAATGGAAA
AGAACGTTACTGTTACACATGCCCAAGACATACTGGAAAAGACACATAACGGGAAGCTCTGCGATCTAGA
TGGAGTGAAGCCTCTAATTTTGAGAGATTGTAGTGTAGCTGGATGGCTCCTCGGAAACCCAATGTGTGAC
GAATTCATCAATGTGCCGGAATGGTCGAGCTATAGCAGGTTTTATAGAGGGAGGATGGCAGGGAATGGTA
GATGGTTGGTATGGGTACCACCATAGCAATGAGCAGGGGAGTGGGTACGCTGCAGACAAAGAATCCACTC
AAAAGGCTATAGATGGAGTCACCAATAAGGTCAACTCGATCATTGACAAAATGAACACTCAGTTTGAGGC
CGTTGGAAGGGAATTTAACAACTTAGAAAGGAGAATAGAGAATTTAAACAAGAAGATGGAAGACGGGTTC
CTAGATGTCTGGACTTATAATGCTGAACTTCTGGTTCTCATGGAAAATGAGAGAACTCTAGACTTCCATG
ACTCAAATGTCAAGAACCTTTACGACAAGGTCCGACTACAGCTTAGGGATAATGCAAAGGAGCTGGGTAA
CGGTTGTTTCGAGTTCTATCACAAATGTGATAATGAATGTATGGAAAGTGTGAGAAACGGAACGTATGAC
TACCCGCAGTATTCAGAAGAAGCAAGATTAAAAAGAGAGGAAATAAGTGGAGTAAAATTGGAATCAATAG
GAATTTACCAAATACTGTCAATTTATTCTACAGTGGCGAGTTCCCTAGCACTGGCAATCATGGTAGCTGG
TCTATCCTTATGGATGTGCTCCAATGGGTCGTTACAATGCAGAATTTGCATTTAAATTTGTGAGTTCAGA
TTGTAGTTAAAAACACCCTTGTTTCTACT

seqgb_EU620652_Organism_Influenza_A_virus__A_chicken_Thailand_NS-339_2008_H5N1___Strain_Name_A_chicken_Thailand_NS-339_2008_Segment_4_Subtype_H5N1_Host_Chicken,
AGCAAAAGCAGGGGTCTGATCTGTCAAAATGGAGAAAATAGTGCTTCTTTTTGCAATAGTCAGTCTTGTT
AAAAGTGATCAAATTTGCATTGGTATAAGGTCAACTCGATAATTGACAAAATGAACACTCAGTTTGAGGC
CGTTGGAAGGGAATTTAACAACTTAGAAAGGAGAATAGAGAATTTAAACAAGAAGATGGAAGACGGGTTC
CTGGATGTCTGGACTTATAATGCTGAACTTCTGGTTCTCATGGAAAATGAGAGAACTCTAGACTTTCATG
ACTCAAATGTCAAGAACCTTTACGACAAGGTCCGACTACAGCTTAGGGATAATGCAAAGGAGCTGGGTAA
CGGCTGTTTCGAGTTCTATCATAAATGTGATAATGAATGTATGGAAAGTGTGAGAAACGGAACGTATGAC
TACCCGCAGTATTCAGAAGAAGCAAAACTAAAAAGAGAGGAAATAAGTGGAGTAAAATTGGAATCAATAG
GAATTTACCAAATACTGTCAATTTATTCTACAGTGGCAAGTTCCCTAGCACTGGCAATCATGGTAGCTGG
TCTATCCTTATGGATGTGCTCCAATGGGTCATTACAATGCAGAATTTGCATTAAATTGGAGTCA


seqgb_EU850416_Organism_Influenza_A_virus__A_chicken_Thailand_NS-341_2008_H5N1___Strain_Name_A_chicken_Thailand_NS-341_2008_Segment_4_Subtype_H5N1_Host_Chicken,
ATGGAGAAAATAGTGCTTCTTTTTGCAATAGTCAGTCTTGTTAAAAGTGATCAGATTTGCATTGGTTACC
ATGCAAACAACTCGACAGAGCAGGTTCTCACCATCGGGGAATGCCCCAAATATGTGAAATCAAATAGATT
AGTCCTTGCGACTGGGCTCAGAAATAGCCCTCAAAGAGAGAGAAGAAGAAAAAAGAGAGGATTATTTGGA
GCTATAGCAGGTTTTATAGAGGGAGGATGGCAGGGAATGGTAGATGGTTGGTATGGGTACCACCATAGCA
ATGAGCAGGGGAGTGGGTACGCTGCAGACAAAGAATCCACTCAAAAGGCAATAGATGGAGTCACCAATAA
GGTCAACTCGATAATTGACAAAATGAACACTCAGTTTGAGGCCGTTGGAAGGGAATTTAACMACTTAGAA
AGGAGGATAGAGAATTTAAACAAGAAGATGGAAGACGGGTTCCTAGATGTCTGGACTTATAATGCTGAAC
TTCTGGTTCTCATGGAAAATGAGAGAACTCTAGACTTTCATGACTCAAATGTCAAGAACCTTTACGACAA
GGTCCGACTACAGCTTAGGGATAATGCAAAGGAGCTGGGTAACGGCTGTTTCGAGTTCTATCATAAATGT
GATAATGAATGTATGGAAAGTGTGAGAAACGGAACGTATGACTACCCGCAATATTCAGAAGAAGCAAAAC
TAAAAAGAGAGGAAATAAGTGGAGTAAAATTGGAATCAATAGGAATTTACCAAATACTGTCAATTTATTC
TACAGTGGCAAGTTCCCTAGCACTGGCAATCATGGTAGCTGGTCTATCCTTATGGATGTGCTCCAATGGG
TCATTACAATGCAGAATTTGCATTTAAATTG


seqgb_DQ999880_Organism_Influenza_A_virus__A_chicken_Thailand_PC-168_2006_H5N1___Strain_Name_A_chicken_Thailand_PC-168_2006_Segment_4_Subtype_H5N1_Host_Chicken,
ATGGAGAGAATAGTGCAGGGATAATGCAAAGGAGCTGGGTAACGGTTGTTTCGAGTTCTATCATAAGTGT
GATAATGAATGTATGGAAAGTGTGAGAAACGGAACGTATGACTACCCGCAGTATTCAGAAGAAGCAAAAC
TAAAAAGAGAGGAAATAAGTGGAGTAAAATTGGAATCAATAGGAATTTACCAAATACTGTCAATTTATTC
TACAGTGGCGAGTTCCCTAGCACTGGCAATCATGGTAGCTGGTCTATCCTTATGGATGTGCTCCAATGGG
TCGTTACAATGCAGAATTTGCATTAAATTG

我写了这段代码:

import pandas as pd
import numpy as np
data = pd.read_csv('seq.txt', sep=',',delim_whitespace = True, names=["id", "seq"], skip_blank_lines = True, index_col=False) # , dtype='unicode'
dataframe = pd.DataFrame(data)
print(dataframe)

输出是:

                                                    id  seq
0 seqgb_AY741213_Organism_Influenza_A_virus__A_b... NaN
1 ATGGAGAAAATAGTGCTTCTTCTTGCAATAGTCAGTCTTGTTAAAA... NaN
2 ATGCAAACAACTCGACAGAGCAGGTTGACACAATAATGGAAAAGAA... NaN
3 CGTACTGGACAAGACACACAACGGGAAGCTCTGCGAGCTAGATGGA... NaN
4 TGTAGTGTAGCTGGATGGCTCCTCGGAAACCCAATGTGTGACGAAT... NaN
5 ACATAGTAGAGAAGGCCAGTCCAGCCAATGACCTCTGTTACCCAGG... NaN
6 GAAACACCTATTGAGCAGAATAAACCATTTTGAGAAAATTCAGATC... NaN
7 CATGAAGCCTCATCAGGGGTGAGCTCAGCATGTCCATACCAGGGGA... NaN
8 TATGGCTTATCAAAAAGAACAGTGCATACCCAACAATAAAGAGGAG... NaN
9 TCTTTTGGTACTGTGGGGGATTCACCATCCTAATGATGCGGCAGAG... NaN
10 ACCACCTATATTTCCGTTGGAACATCAACACTAAACCAGAGATTGG... NaN
11 AAGTAAATGGGCAAAGTGGAAGAATGGAGTTCTTCTGGACAATTTT... NaN
12 CGAGAGTAATGGAAATTTCATTGCTCCAGAATATGCATACAAAATT... NaN
13 ATGAAAAGTGAATTGGAATATGGTAACTGCAACACCAAGTGTCAAA... NaN
14 GTATGCCATTCCACAACATACACCCTCTCACCATCGGGGAATGCCC... NaN
15 AGTCCTTGCGACAGGGCTCAGAAATAGCCCTCAAAGAGAGAGAAGA... NaN
16 GCTATAGCAGGGTTTATAGAGGGAGGATGGCAGGGAATGGTAGATG... NaN
17 ATGAGCAGGGGAGTGGATACGCTGCAGACAAAGAATCCACTCAAAA... NaN
18 GGTCAACTCGATCATTGACAAAATGAACACTCAGTTTGAGGCCGTT... NaN
19 AGGAGAATAGAAAATTTAAACAAGAAGATGGAGGACGGATTCCTAG... NaN
20 TTCTGGTTCTCATGGAAAATGAGAGAACTCTAGACTTTCATGACTC... NaN
21 GGTCCGACTACAACTTAGGGATAATGCAAAGGAGCTGGGTAACGGT... NaN
22 GATAATGAATGTATGGAAAGTGTAAGAAACGGAACGTATGACTACC... NaN
23 TAAACAGAGAGGAAATAAGTGGAGTAAAATTGGAATCAATAGGAAC... NaN
24 AACAGTGGCGAGTTCCCTAGCACTGGCAATCATGGTAGCTGGTCTA... NaN
25 TCGTTACAATGCAGAATTTGCATTTGA NaN
26 seqgb_EU676325_Organism_Influenza_A_virus__A_b... NaN
27 TTTAGCAAAAGGCAGGGGTATATCTGTCAAAATGGAGAAAATAGTG... NaN
28 GTTAAAAGTGATCAGATTTGCATTGGTTACCATGCAAACAACTCGA... NaN
29 AAAAGAACGTTACTGTTACACATGCCCAAGACATACTGGAAAAGAC... NaN
.. ... ...
598 GATAATGAATGTATGGAAAGTGTGAGAAACGGAACGTATGACTACC... NaN
599 TAAAAAGAGAGGAAATAAGTGGAGTAAAATTGGAATCAATAGGAAT... NaN
600 TACAGTGGCAAGTTCCCTAGCACTGGCAATCATGGTAGCTGGTCTA... NaN
601 TCATTACAATGCAGAATTTGCATTTAAATTG NaN
602 seqgb_DQ999880_Organism_Influenza_A_virus__A_c... NaN
603 ATGGAGAGAATAGTGCTTCTTTTTGCAATAGTCAGTCTTGTTAAAA... NaN
604 ATGCAAACAACTCGACAGAGCAGGTTGACACAATAATGGAAAGGAA... NaN
605 CATACTGGAAAAGACACACAACGGGAAGCTCTGCGATCTAGATGGA... NaN
606 TGTAGTGTAGCTGGATGGCTCCTCGGAAACCCAATGTGTGACGAAT... NaN
607 ACATAGTGGAGAAGGCCAATCCAGTCAATGACCTCTGTTACCCAGG... NaN
608 GAAACACCTATTGAGCAGAATAAACCATTTTGAGAAAATTCAGATC... NaN
609 CATGAAGCCTCATTAGGGGTGAGCTCAGCATGTCCATACCTGGGAA... NaN
610 TATGGCTTATCAAAAAGAACAGTACATACCCAACAATAAAGAGGAG... NaN
611 TCTTTTGGTACTGTGGGGGATTCACCATCCTAATGATGCGGCAGAG... NaN
612 ACCACCTATATTTCTGTTGGGACATCAACACTAAACCAGAGATTGG... NaN
613 AAGTAAACGGGCAAAGTGGAAGGATGGAGTTCTTCTGGACAATTTT... NaN
614 CGAGAGTAATGGAAATTTCATTGCTCCAGAATATGCATACAAAATT... NaN
615 ATGAAAAGTGAATTGGAATATGGTAACTGCAACACCAAGTGTCAAA... NaN
616 GTATGCCATTCCACAATATACACCCTCTCACTATCGGGGAATGCCC... NaN
617 AGTCCTTGCGACTGGGCTCAGAAATAGCCCTCAAAGAGAGAGAAGA... NaN
618 GCTATAGCAGGTTTTATAGAGGGGGGATGGCAGGGAATGGTAGATG... NaN
619 ATGAGCAGGGGAGTGGGTACGCTGCAGACAAAGAATCCACTCAAAA... NaN
620 GGTCAACTCGATAATTGACAAAATGAACACTCAGTTTGAGGCCGTT... NaN
621 AGGAGAATAGAGAATTTAAACAAGAAGATGGAAGACGGGTTCCTAG... NaN
622 TTCTGGTTCTCATGGAAAATGAGAGAACCCTAGACTTTCATGACTC... NaN
623 GGTCCGACTACAGCTTAGGGATAATGCAAAGGAGCTGGGTAACGGT... NaN
624 GATAATGAATGTATGGAAAGTGTGAGAAACGGAACGTATGACTACC... NaN
625 TAAAAAGAGAGGAAATAAGTGGAGTAAAATTGGAATCAATAGGAAT... NaN
626 TACAGTGGCGAGTTCCCTAGCACTGGCAATCATGGTAGCTGGTCTA... NaN
627 TCGTTACAATGCAGAATTTGCATTAAATTG NaN

[628 rows x 2 columns]

如何使用 pandas 去除单个序列之间的新行。提前致谢!!

最佳答案

几乎根据定义,换行符是 CSV 文件的重要组成部分,因此无法让 Pandas 的 read_csv 忽略它们。最好是手动删除换行符,如下所示:

import pandas as pd
import re

with open ("seq.txt", "r") as myfile:
data=myfile.readlines()

data = re.sub('\n', '', ''.join(data))
data = data.split(',')
df = pd.DataFrame([data], names=["id", "seq"])

关于python - pandas read_csv 函数读取一列作为核苷酸序列的 NaN,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/54040741/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com