
python - Best way to sort 1M records in Python

Reposted. Author: 太空狗. Updated: 2023-10-29 21:20:30

I have a service running that takes a list of about 1,000,000 dictionaries and does the following:

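myHashTable = {}
myLists = { 'hits':{}, 'misses':{}, 'total':{} }
# Results dict; named sortedLists rather than `sorted` so the built-in sorted() stays callable
sortedLists = { 'hits':[], 'misses':[], 'total':[] }
for item in myList:
    id = item.pop('id')            # remove the id so only the data fields remain
    myHashTable[id] = item
    for k, v in item.items():      # iteritems() on Python 2
        myLists[k][id] = v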

So, if I have the following list of dictionaries:

[ {'id':'id1', 'hits':200, 'misses':300, 'total':400},
{'id':'id2', 'hits':300, 'misses':100, 'total':500},
{'id':'id3', 'hits':100, 'misses':400, 'total':600}
]

I end up with

myHashTable =
{
'id1': {'hits':200, 'misses':300, 'total':400},
'id2': {'hits':300, 'misses':100, 'total':500},
'id3': {'hits':100, 'misses':400, 'total':600}
}

myLists =
{
'hits': {'id1':200, 'id2':300, 'id3':100},
'misses': {'id1':300, 'id2':100, 'id3':400},
'total': {'id1':400, 'id2':500, 'id3':600}
}

I then need to sort all of the data in each of the myLists dictionaries.

What I am currently doing is the following:

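import operator
def doSort(key):
    # sortedLists is the results dict; it must not be named `sorted`, which would shadow the built-in called here
    sortedLists[key] = sorted(myLists[key].items(), key=operator.itemgetter(1), reverse=True)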

which would yield, in the case of misses:
[('id3', 400), ('id1', 300), ('id2', 100)]
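For reference, the steps so far can be put together as a small self-contained sketch using the three sample records. The names `records`, `build_columns` and `sort_column` are made up for this illustration and are not from the original service; the sketch also copies each record instead of popping 'id' in place.

```python
import operator

# The three sample records from the question.
records = [
    {'id': 'id1', 'hits': 200, 'misses': 300, 'total': 400},
    {'id': 'id2', 'hits': 300, 'misses': 100, 'total': 500},
    {'id': 'id3', 'hits': 100, 'misses': 400, 'total': 600},
]

def build_columns(records):
    """Split the records into one {id: value} dict per field (hypothetical helper)."""
    columns = {}
    for record in records:
        fields = dict(record)            # copy so the caller's dicts are not mutated
        record_id = fields.pop('id')
        for field, value in fields.items():
            columns.setdefault(field, {})[record_id] = value
    return columns

def sort_column(column):
    """Return (id, value) pairs ordered by value, largest first."""
    return sorted(column.items(), key=operator.itemgetter(1), reverse=True)

columns = build_columns(records)
sorted_by_misses = sort_column(columns['misses'])
print(sorted_by_misses)  # [('id3', 400), ('id1', 300), ('id2', 100)]
```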

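This works great when I have about 100,000 records, but with 1,000,000 it takes at least 5-10 minutes to sort each, with a total of 16 to sort (my original list of dictionaries actually has 17 fields including id, which is popped).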
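One rough way to check how much of that time is the sort call itself is to time the same `sorted(...)` expression on synthetic data of the same size. This is only a measurement sketch: `time_sort`, the random values and the 'id%d' key format are assumptions for illustration, not the real data, and the numbers will vary by machine.

```python
import operator
import random
import time

def time_sort(n, seed=0):
    """Sort n synthetic (id, value) pairs by value, descending.

    Returns (sorted_pairs, elapsed_seconds). The data is random, not the real service data.
    """
    rng = random.Random(seed)
    column = {'id%d' % i: rng.randrange(1000000) for i in range(n)}
    start = time.perf_counter()
    ordered = sorted(column.items(), key=operator.itemgetter(1), reverse=True)
    elapsed = time.perf_counter() - start
    return ordered, elapsed

if __name__ == '__main__':
    ordered, elapsed = time_sort(1000000)
    print('sorted %d pairs in %.2f s' % (len(ordered), elapsed))
```

Comparing this figure with the time observed in the service helps separate the cost of the sort from the cost of everything around it.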

* EDIT * This service is a ThreadingTCPServer which has a method allowing a client to connect and add new data. The new data may include new records (meaning dictionaries with unique 'id's to what is already in memory) or modified records (meaning the same 'id' with different data for the other key-value pairs).

So, once this is running I would pass in

[
{'id':'id1', 'hits':205, 'misses':305, 'total':480},
{'id':'id4', 'hits':30, 'misses':40, 'total':60},
{'id':'id5', 'hits':50, 'misses':90, 'total':20}
]

I have been using dictionaries to store the data so that I don't end up with duplicates. After the dictionaries are updated with the new/modified data, I re-sort each of them.

* END EDIT *
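The update step described in the edit might look roughly like the sketch below. `apply_updates` is a hypothetical helper, and tracking which fields changed (so that only those are re-sorted) is an assumption added here for illustration; the question itself re-sorts every dictionary after each update.

```python
import operator

def apply_updates(columns, updates):
    """Merge new or modified records into the per-field {id: value} dicts.

    Returns the set of field names whose data actually changed.
    """
    changed = set()
    for record in updates:
        fields = dict(record)              # copy so the incoming dicts are not mutated
        record_id = fields.pop('id')
        for field, value in fields.items():
            column = columns.setdefault(field, {})
            if record_id not in column or column[record_id] != value:
                column[record_id] = value  # same id overwrites, so no duplicates
                changed.add(field)
    return changed

# State after loading the three original sample records.
columns = {
    'hits':   {'id1': 200, 'id2': 300, 'id3': 100},
    'misses': {'id1': 300, 'id2': 100, 'id3': 400},
    'total':  {'id1': 400, 'id2': 500, 'id3': 600},
}
# The update batch from the edit: one modified record and two new ones.
updates = [
    {'id': 'id1', 'hits': 205, 'misses': 305, 'total': 480},
    {'id': 'id4', 'hits': 30, 'misses': 40, 'total': 60},
    {'id': 'id5', 'hits': 50, 'misses': 90, 'total': 20},
]

changed = apply_updates(columns, updates)
# Re-sort only the fields that were touched by this batch.
resorted = {
    field: sorted(columns[field].items(), key=operator.itemgetter(1), reverse=True)
    for field in changed
}
print(resorted['hits'])
# [('id2', 300), ('id1', 205), ('id3', 100), ('id5', 50), ('id4', 30)]
```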

So, what is the best way for me to sort these? Is there a better method?

Best answer

You may find this related answer from Guido: Sorting a million 32-bit integers in 2MB of RAM using Python
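As far as I recall, the linked post sorts the data in chunks and then merges the sorted chunks with `heapq.merge`. Below is a minimal in-memory sketch of that chunk-and-merge idea applied to (id, value) pairs. It is not Guido's code: `chunked_sort` and `chunk_size` are hypothetical names, and the `key` and `reverse` arguments of `heapq.merge` require Python 3.5 or later.

```python
import heapq
import operator

def chunked_sort(pairs, chunk_size=100000):
    """Sort a list of (id, value) pairs by value, descending.

    Each chunk is sorted on its own, then the sorted chunks are combined
    with a k-way merge.
    """
    by_value = operator.itemgetter(1)
    chunks = []
    for start in range(0, len(pairs), chunk_size):
        chunk = pairs[start:start + chunk_size]   # slicing copies, input stays untouched
        chunk.sort(key=by_value, reverse=True)
        chunks.append(chunk)
    # heapq.merge lazily merges inputs that are already sorted in the same order.
    return list(heapq.merge(*chunks, key=by_value, reverse=True))

misses = {'id1': 300, 'id2': 100, 'id3': 400}
result = chunked_sort(list(misses.items()), chunk_size=2)
print(result)  # [('id3', 400), ('id1', 300), ('id2', 100)]
```

When everything already fits in RAM, this is unlikely to beat a single `sorted()` call; the point of the technique is to bound memory use, for example by writing each sorted chunk to disk before merging.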

Regarding "python - Best way to sort 1M records in Python", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/1180240/
