python - How to group strings by one of their substrings?

Reposted. Author: 行者123. Updated: 2023-12-04 14:59:15

I have the following list, jargs:

jargs = ['10192393\t15\t26\tskin tumour\tDiseaseClass\tD012878', 
'10192393\t443\t449\tcancer\tDiseaseClass\tD009369',
'10192393\t483\t496\tcolon cancers\tDiseaseClass\tD003110',
'10194428\t30\t45\themochromatosis\tModifier\tD016399',
'10194428\t102\t117\themochromatosis\tSpecificDisease\tD006432',
'10194428\t119\t145\tHereditary hemochromatosis\tSpecificDisease\tD006432',
'10194428\t147\t149\tHH\tDiseaseClass\tD006432']

I want to write a program that produces the following output:

ents = 
[
'10192393', {"entities":[(15, 26,"DiseaseClass"), (443, 449, "DiseaseClass"), (483, 496, "DiseaseClass")]},
'10194428', {"entities": [(30, 45, "Modifier"), (102, 117, "SpecificDisease"), (119, 145, "SpecificDisease"), (147, 149, "DiseaseClass")]}
]

I tried the following:

ents = [list(set([jargs[i].split('\t')[0] for i in range(len(jargs))]))[0],\
{"entities": [(int(jargs[i].split('\t')[1]), int(jargs[i].split('\t')[2]),\
jargs[i].split('\t')[-2]) for i in range(len(jargs))]}]

Unfortunately, this code outputs the following:

['10194428',
{'entities': [('15', '26', 'DiseaseClass'),
('443', '449', 'DiseaseClass'),
('483', '496', 'DiseaseClass'),
('30', '45', 'Modifier'),
('102', '117', 'SpecificDisease'),
('119', '145', 'SpecificDisease'),
('147', '149', 'DiseaseClass')]}]

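This is not the expected output: every entity ends up under a single ID instead of being grouped by its own ID.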

Best answer

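from pprint import pprint

# Group the (start, end, label) fields of each row under its ID.
tmp = {}
for item in jargs:
    # Fields: id, start, end, text (ignored), label, then any remaining columns.
    id_, v1, v2, _, v3, *_ = item.split("\t")
    tmp.setdefault(id_, []).append((v1, v2, v3))

# Flatten into the alternating [id, {"entities": [...]}, ...] layout.
ents = []
for k, v in tmp.items():
    ents.append(k)
    ents.append({"entities": v})

pprint(ents)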

Prints:

['10192393',
{'entities': [('15', '26', 'DiseaseClass'),
('443', '449', 'DiseaseClass'),
('483', '496', 'DiseaseClass')]},
'10194428',
{'entities': [('30', '45', 'Modifier'),
('102', '117', 'SpecificDisease'),
('119', '145', 'SpecificDisease'),
('147', '149', 'DiseaseClass')]}]
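
Note that the output above keeps the start and end offsets as strings, while the expected output in the question shows integers. The following is a minimal sketch of one way to get integers, assuming every row has at least five tab-separated fields and that the second and third fields are always numeric. The function name group_entities and the shortened four-row sample are illustrative and not part of the original answer; collections.defaultdict is used here in place of dict.setdefault.

```python
from collections import defaultdict

# Shortened sample: two rows per ID, copied from the question's `jargs` list.
jargs = [
    "10192393\t15\t26\tskin tumour\tDiseaseClass\tD012878",
    "10192393\t443\t449\tcancer\tDiseaseClass\tD009369",
    "10194428\t30\t45\themochromatosis\tModifier\tD016399",
    "10194428\t147\t149\tHH\tDiseaseClass\tD006432",
]


def group_entities(rows):
    """Group (start, end, label) tuples by the ID in the first field.

    Hypothetical helper: assumes each row has at least five
    tab-separated fields and that start/end parse as integers.
    """
    grouped = defaultdict(list)
    for row in rows:
        doc_id, start, end, _text, label, *_rest = row.split("\t")
        # Convert the offsets so they match the integers in the expected output.
        grouped[doc_id].append((int(start), int(end), label))

    # Flatten into the alternating [id, {"entities": [...]}, ...] layout.
    ents = []
    for doc_id, spans in grouped.items():
        ents.extend([doc_id, {"entities": spans}])
    return ents


ents = group_entities(jargs)
print(ents)
```

Because dictionaries preserve insertion order in Python 3.7 and later, the IDs come out in the order they first appear in the input, so the rows do not need to be sorted beforehand.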

Regarding "python - How to group strings by one of their substrings?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/67269833/
