python - Dictionary from a text file in Python

Repost. Author: 塔克拉玛干. Updated: 2023-11-03 06:37:49

Question:

I have a txt file in this format:

Intestinal infectious diseases (001-003)  
001 Cholera
002 Fever
003 Salmonella
Zoonotic bacterial diseases (020-022)
020 Plague
021 Tularemia
022 Anthrax
External Cause Status (E000)
E000 External cause status
Activity (E001-E002)
E001 Activities involving x and y
E002 Other activities

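Each line that starts with a 3-digit code, "E" plus a 3-digit code, or "V" plus a 3-digit code is a value belonging to the preceding header line, and the header lines are the keys of my dictionary. In other questions I have seen, each line could be split on a column or a colon to produce a key/value pair, but the format of my txt file does not allow that.

Is there a way to turn a txt file like this into a dictionary where the keys are the group names and the values are the code plus the disease name?

I also need to parse the code and the disease name into a second dictionary, so that I end up with a dictionary whose keys are the group names and whose values are a second dictionary with the codes as keys and the disease names as values.

Script: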

    def process_file(filename):
        myDict={}
        f = open(filename, 'r')
        for line in f:
            if line[0] is not int:
                if line.startswith("E"):
                    if line[1] is int:
                        line = dictionary1_values
                    else:
                        break
                else:
                    line = dictionary1_key
            myDict[dictionary1_key].append[line]

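The desired output format is:
{"Intestinal infectious diseases (001-003)": {"001": "Cholera", "002": "Fever", "003": "Salmonella"}, "Zoonotic bacterial diseases (020-022)": {"020": "Plague", "021": "Tularemia", "022": "Anthrax"}, "External Cause Status (E000)": {"E000": "External cause status"}, "Activity (E001-E002)": {"E001": "Activities involving x and y", "E002": "Other activities"}}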

Best answer

    def process_file(filename):
        myDict = {}
        rootkey = None
        # "with" closes the file even if an error is raised while parsing
        with open(filename, 'r') as f:
            for line in f:
                # code lines look like "001 ..." or "E001 ...": their second and
                # third characters are ASCII digits, which str.isdigit() detects
                if line[1:3].isdigit():
                    # split once into the code (with or without a leading "E") and the name
                    subkey, data = line.rstrip().split(" ", 1)
                    myDict[rootkey][subkey] = data
                else:
                    # str.rstrip() removes the newline and any other trailing whitespace
                    rootkey = line.rstrip()
                    myDict[rootkey] = {}  # start a new empty inner dict for this group
        return myDict

Tested in the Python console:

>>> d = process_file('/tmp/file.txt')
>>>
>>> d['Intestinal infectious diseases (001-003)']
{'003': 'Salmonella', '002': 'Fever', '001': 'Cholera'}
>>> d['Intestinal infectious diseases (001-003)']['002']
'Fever'
>>> d['Activity (E001-E002)']
{'E001': 'Activities involving x and y', 'E002': 'Other activities'}
>>> d['Activity (E001-E002)']['E001']
'Activities involving x and y'
>>>
>>> d
{'Activity (E001-E002)': {'E001': 'Activities involving x and y', 'E002': 'Other activities'}, 'External Cause Status (E000)': {'E000': 'External cause status'}, 'Intestinal infectious diseases (001-003)': {'003': 'Salmonella', '002': 'Fever', '001': 'Cholera'}, 'Zoonotic bacterial diseases (020-022)': {'021': 'Tularemia', '020': 'Plague', '022': 'Anthrax'}}

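Warning: the first line of the file must be a "rootkey" (a group header), not a "subkey" or data line! Otherwise rootkey is still None when the first code line is stored, and an error is raised :-)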
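One way to make that failure explicit is sketched below. It is not part of the original answer: the function name parse_lines, the regular expression, and the error message are assumptions made for illustration, and the pattern only encodes the code format described in the question (three digits, optionally preceded by "E" or "V"). The sketch also skips blank lines instead of turning them into empty group names.

```python
import re

# Assumed code format, taken from the question: three digits, optionally
# preceded by "E" or "V", then whitespace and the description.
CODE_LINE = re.compile(r"^([EV]?\d{3})\s+(.*)$")


def parse_lines(lines):
    """Build {group: {code: name}} from an iterable of text lines."""
    result = {}
    current = None  # inner dict of the most recent group header
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        match = CODE_LINE.match(line)
        if match is None:
            # anything that is not a code line is treated as a group header
            current = result.setdefault(line, {})
        elif current is None:
            raise ValueError(
                f"line {number}: code {match.group(1)!r} appears before any group header"
            )
        else:
            code, name = match.groups()
            current[code] = name
    return result
```

Because a file object is itself an iterable of lines, this could be called as parse_lines(f) inside a "with open(filename) as f:" block.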

Note: maybe you should strip the leading "E" character. Or is that not possible? Do you need to keep the "E" somewhere?
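If the "E" does turn out to be unwanted, it could be removed after parsing. The following is only a sketch under that assumption: strip_e_prefix is a hypothetical helper, not something from the original answer, and it expects the nested {group: {code: name}} dictionary shown above. Note that if a single group contained both "E001" and "001", the two keys would collide after stripping.

```python
def strip_e_prefix(groups):
    """Return a copy of {group: {code: name}} with a leading "E" dropped from each code."""
    return {
        group: {
            # remove only one leading "E"; other codes are kept unchanged
            (code[1:] if code.startswith("E") else code): name
            for code, name in codes.items()
        }
        for group, codes in groups.items()
    }
```

The group names are left untouched, so "Activity (E001-E002)" still records which range the codes came from.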

Regarding "python - Dictionary from a text file in Python", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55068753/
