
python - Appending cleaned data to a dictionary using if and loop techniques

Reposted. Author: 行者123. Updated: 2023-11-30 22:35:24

I have a dataset that I need to clean and organize. Here is a link to the dataset:

https://github.com/irJERAD/Intro-to-Data-Science-in-Python/blob/master/MyNotebooks/university_towns.txt

What I want to do is clean this dataset into a dictionary of the form {State: Town}, for example {'Alabama': 'Auburn', 'Alabama': 'Florence', ..., 'Wyoming': 'Laramie'}.

Here is my code:

import re

univ_towns = open('university_towns.txt',encoding='utf-8').readlines()

state_list = []
d={}

for name in univ_towns:
    if "[ed" in name:
        statename = re.sub(r'\[edit]\n$', '', name)
        state_list.append(statename)
        len_state = len(state_list)
    elif "(" in name:
        sep = ' ('
        townname = name.split(sep, 1)[0]
        if "," in townname:
            sep = ','
            townname = townname.split(sep, 1)[0]
        d[state_list[len_state-1]] = townname

d

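However, the resulting dictionary contains only the last town for each state. I'm sure something is wrong with the loop logic, but I can't work out what. Here is the output of my code: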

{'Alabama': 'Tuskegee',
'Alaska': 'Fairbanks',
'Arizona': 'Tucson',
'Arkansas': 'Searcy',
'California': 'Whittier',
'Colorado': 'Pueblo',
'Connecticut': 'Willimantic',
'Delaware': 'Newark',
'Florida': 'Tampa',
'Georgia': 'Young Harris',
'Hawaii': 'Manoa',
'Idaho': 'Rexburg',
'Illinois': 'Peoria',
'Indiana': 'West Lafayette',
'Iowa': 'Waverly',
'Kansas': 'Pittsburg',
'Kentucky': 'Wilmore',
'Louisiana': 'Thibodaux',
'Maine': 'Waterville',
'Maryland': 'Westminster',
'Massachusetts': 'Framingham',
'Michigan': 'Ypsilanti',
'Minnesota': 'Winona',
'Mississippi': 'Starkville',
'Missouri': 'Warrensburg',
'Montana': 'Missoula',
'Nebraska': 'Wayne',
'Nevada': 'Reno',
'New Hampshire': 'Rindge',
'New Jersey': 'West Long Branch',
'New Mexico': 'Silver City',
'New York': 'West Point',
'North Carolina': 'Winston-Salem',
'North Dakota': 'Grand Forks',
'Ohio': 'Wilberforce',
'Oklahoma': 'Weatherford',
'Oregon': 'Newberg',
'Pennsylvania': 'Williamsport',
'Rhode Island': 'Providence',
'South Carolina': 'Spartanburg',
'South Dakota': 'Vermillion',
'Tennessee': 'Sewanee',
'Texas': 'Waco',
'Utah': 'Ephraim',
'Vermont': 'Northfield',
'Virginia': 'Chesapeake',
'Washington': 'University District',
'West Virginia': 'West Liberty',
'Wisconsin': 'Whitewater',
'Wyoming': 'Laramie'}

Best answer

Try using defaultdict:

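import re
from collections import defaultdict

univ_towns = open('university_towns.txt', encoding='utf-8').readlines()

state_list = []
d = defaultdict(list)  # a missing key starts out as an empty list

for name in univ_towns:
    if "[ed" in name:
        # State header line, e.g. "Alabama[edit]"
        statename = re.sub(r'\[edit]\n$', '', name)
        state_list.append(statename)
        len_state = len(state_list)
    elif "(" in name:
        # Town line: keep the text before " (" and before any comma
        sep = ' ('
        townname = name.split(sep, 1)[0]
        if "," in townname:
            sep = ','
            townname = townname.split(sep, 1)[0]
        d[state_list[len_state-1]].append(townname)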

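As you can see, the only major difference is using append at the end instead of =. Assigning with = replaces whatever value the state key already holds, so your original version keeps only one town per state. Appending to a list keeps all of them, which seems to be what you want, unless I've misunderstood.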
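To make the difference concrete, here is a minimal sketch that runs without the data file. The lines in SAMPLE_LINES are made up for illustration and only assume the same layout as university_towns.txt (a "State[edit]" header followed by "Town (University)" lines). The function name group_towns and the current_state variable are hypothetical simplifications, not part of the original answer, which tracks the state through state_list instead.

```python
import re
from collections import defaultdict

# Hypothetical sample lines in the same layout as university_towns.txt
SAMPLE_LINES = [
    "Alabama[edit]\n",
    "Auburn (Auburn University)[1]\n",
    "Florence (University of North Alabama)\n",
    "Wyoming[edit]\n",
    "Laramie (University of Wyoming)[5]\n",
]


def group_towns(lines):
    """Return {state: [town, ...]} for lines in the layout shown above."""
    towns_by_state = defaultdict(list)
    current_state = None
    for line in lines:
        if "[edit]" in line:
            # Header line: strip the "[edit]" marker and trailing newline
            current_state = re.sub(r"\[edit\]\s*$", "", line)
        elif "(" in line and current_state is not None:
            # Town line: keep the text before " (" and before any comma
            town = line.split(" (", 1)[0].split(",", 1)[0]
            towns_by_state[current_state].append(town)
    return dict(towns_by_state)


# Plain assignment: the second write replaces the first
overwritten = {}
overwritten["Alabama"] = "Auburn"
overwritten["Alabama"] = "Florence"

grouped = group_towns(SAMPLE_LINES)
print(overwritten)
print(grouped)
```

A dict holds exactly one value per key, so the {'Alabama': 'Auburn', 'Alabama': 'Florence'} shape from the question cannot exist as such. Mapping each state to a list of towns is one common way to keep every town.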

About python - Appending cleaned data to a dictionary using if and loop techniques: a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44580456/
