python - Duplicates when appending strings to a list from a dataframe with a common column value

Reposted. Author: 行者123. Updated: 2023-12-04 03:26:16
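Beginner here. I'm trying to isolate the neighborhood names from a dataframe of Toronto according to the cluster values I've assigned to them. I end up with a list 2363 items long instead of a list with 3 unique items.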

Neigh_List = []
for n in toronto_merged['Cluster Labels']:
    if n == 7:
        x = toronto_merged['Neighborhood']
        Neigh_List.append(x) if x not in Neigh_List else None

Neigh_List

[0 Parkwoods
1 Parkwoods
2 Victoria Village
3 Victoria Village
4 Victoria Village
...
2359 Mimico NW , The Queensway West , South of Bloor , Kingsway Park South West , Royal York South West
2360 Mimico NW , The Queensway West , South of Bloor , Kingsway Park South West , Royal York South West
2361 Mimico NW , The Queensway West , South of Bloor , Kingsway Park South West , Royal York South West
2362 Mimico NW , The Queensway West , South of Bloor , Kingsway Park South West , Royal York South West
2363 Mimico NW , The Queensway West , South of Bloor , Kingsway Park South West , Royal York South West
Name: Neighborhood, Length: 2364, dtype: object]

Best answer

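In general, for larger datasets (roughly 1000+ rows), you should avoid looping over a Pandas dataframe, because Pandas' built-in vectorized functions are usually faster (see this other Stack Overflow post).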

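You could try something like this, which filters on the cluster label and then takes the unique neighborhood names: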

neigh_list = list(toronto_merged.loc[toronto_merged['Cluster Labels'] == 7, 'Neighborhood'].unique())
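To try the vectorized filter without the original data, here is a minimal sketch on a small made-up frame. The rows are invented for illustration; only the column names 'Cluster Labels' and 'Neighborhood' and the label value 7 come from the question.

```python
import pandas as pd

# Hypothetical stand-in for toronto_merged; these rows are made up.
toronto_merged = pd.DataFrame({
    "Neighborhood": ["Parkwoods", "Parkwoods", "Victoria Village",
                     "Mimico NW", "Mimico NW"],
    "Cluster Labels": [7, 7, 2, 7, 7],
})

# Boolean mask: True for every row whose cluster label is 7.
in_cluster = toronto_merged["Cluster Labels"] == 7

# Pick the Neighborhood column for those rows, then drop repeats.
# unique() keeps values in order of first appearance.
neigh_list = list(toronto_merged.loc[in_cluster, "Neighborhood"].unique())

print(neigh_list)  # ['Parkwoods', 'Mimico NW']
```

Passing the row mask and the column name to a single .loc call selects both at once, so no Python-level loop is needed.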

Also, if you want to avoid duplicates in a list, you can use Python sets (see section 5.4 of the Python tutorial, at the time of writing).

unique_elements = set()
for x in some_iterable:
    unique_elements.add(x)

Or, using a set comprehension:

unique_elements = {unique_item for unique_item in some_iterable}
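Applied to the question's loop, the set approach might look like the sketch below. The question's loop assigned the whole toronto_merged['Neighborhood'] column to x instead of one row's name, which is why the list ended up holding a Series of length 2364; pairing each label with the name from the same row avoids that. The sample rows are invented for illustration.

```python
import pandas as pd

# Hypothetical stand-in for toronto_merged; these rows are made up.
toronto_merged = pd.DataFrame({
    "Neighborhood": ["Parkwoods", "Parkwoods", "Victoria Village",
                     "Mimico NW", "Mimico NW"],
    "Cluster Labels": [7, 7, 2, 7, 7],
})

unique_names = set()
# zip walks both columns together, so each label is paired with
# the neighborhood name from the same row.
for label, name in zip(toronto_merged["Cluster Labels"],
                       toronto_merged["Neighborhood"]):
    if label == 7:
        unique_names.add(name)  # adding a repeat is a no-op

# Sets are unordered, so sort for a stable display.
print(sorted(unique_names))  # ['Mimico NW', 'Parkwoods']
```

This still loops in Python, so for a large frame the vectorized filter is likely the better choice; the loop is mainly useful for seeing what the original code was meant to do.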

This post about "python - Duplicates when appending strings to a list from a dataframe with a common column value" is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/67524236/
