
python - Why is testing containment with a dict faster than with a set?

Reposted. Author: 太空宇宙. Updated: 2023-11-03 15:57:38

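I know that, under the hood, Python sets and Python dicts are very similar. Reading their respective sources (dict, set), it is clear that their lookups are nearly identical. While reading this answer, I decided to test the author's claim that "set lookups are faster than dict lookups" by putting together the following mock-up: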

from timeit import timeit
import random

universe = range(1, 100000)
keys = random.sample(universe, 50000)
lookups = random.sample(universe, 50000)
dict_set = dict((k, True) for k in keys)
set_set = set(keys)

def dict_lookup():
    for l in lookups:
        l in dict_set

def set_lookup():
    for l in lookups:
        l in set_set

if __name__ == '__main__':
    set_victories = 0
    dict_victories = 0
    for i in range(100):
        dict_time = timeit('dict_lookup()', setup="from __main__ import dict_lookup", number=10000)
        set_time = timeit('set_lookup()', setup="from __main__ import set_lookup", number=10000)
        print("dict time: {}".format(dict_time))
        print("set time: {}".format(set_time))
        if set_time < dict_time:
            set_victories += 1
        else:
            dict_victories += 1
    print("Sets were faster in {} trials".format(set_victories))
    print("Dicts were faster in {} trials".format(dict_victories))

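Given how set and dict lookups are implemented, I expected no difference in performance between them. What I actually found was the following final result: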

$ python3 --version
Python 3.4.5
$ python3 sets-vs-dicts.py
<snip - see below for full output>
Sets were faster in 2 trials
Dicts were faster in 98 trials

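So dicts are in fact consistently faster than sets. Of course, I am not suggesting that we all drop sets in favor of faster dicts: sets make the programmer's intent clearer, and given the scale of the test the difference is tiny. Still, I find this result very odd. What is going on?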
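To put "tiny" in numbers, here is a small worked calculation using one trial from the output (dict 49.936 s, set 50.663 s). It assumes each timed run performs 10000 calls of 50000 lookups, as in the script above; the variable names are illustrative only.

```python
# Figures copied from one trial in the question's output.
dict_time = 49.936297399923205  # seconds for timeit(..., number=10000)
set_time = 50.66272980067879

calls = 10000             # the number= argument passed to timeit
lookups_per_call = 50000  # len(lookups) in the script
total_lookups = calls * lookups_per_call

gap = set_time - dict_time                 # absolute difference in seconds
relative_gap = gap / dict_time             # difference relative to the dict time
per_lookup_ns = gap / total_lookups * 1e9  # difference per membership test

print("gap: {:.3f} s ({:.2%} of the dict time)".format(gap, relative_gap))
print("per lookup: {:.2f} ns".format(per_lookup_ns))
```

On these figures the gap works out to roughly one and a half nanoseconds per membership test, which is why the result is a curiosity rather than a reason to change data structures.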

If you are curious, the full output follows:

$ python3 set-vs-dict.py
dict time: 57.754860900342464
set time: 56.8056653002277
dict time: 50.8890880998224
set time: 50.642351899296045
dict time: 49.936297399923205
set time: 50.66272980067879
dict time: 49.92973940074444
set time: 50.65518939960748
dict time: 49.949383799917996
set time: 50.66877659969032
dict time: 49.93578719999641
set time: 50.64872649963945
dict time: 49.96432110015303
set time: 50.676835800521076
dict time: 49.95099350064993
set time: 50.64867010060698
dict time: 49.98275039996952
set time: 50.648987299762666
dict time: 49.92164439987391
set time: 50.66931669972837
dict time: 49.98953749984503
set time: 50.652459900826216
dict time: 49.95234560035169
set time: 50.65124330017716
dict time: 49.98174169939011
set time: 50.6712632002309
dict time: 49.93824000004679
set time: 50.65437529981136
dict time: 49.95089349988848
set time: 50.65370349958539
dict time: 49.963413699530065
set time: 50.65550949983299
dict time: 49.955208600498736
set time: 50.66121090017259
dict time: 49.94347499962896
set time: 50.64449250046164
dict time: 49.95420549996197
set time: 50.66687630023807
dict time: 49.92143050022423
set time: 50.64667259994894
dict time: 50.05037229973823
set time: 50.67966340016574
dict time: 49.93846719991416
set time: 50.64651320036501
dict time: 49.921281000599265
set time: 50.67906459979713
dict time: 49.942994699813426
set time: 50.65166569966823
dict time: 49.94313340075314
set time: 50.656177499331534
dict time: 49.94610709976405
set time: 50.65122799947858
dict time: 49.93874369934201
set time: 50.661101600155234
dict time: 49.94996269978583
set time: 50.63938449975103
dict time: 49.9602530002594
set time: 50.65474760066718
dict time: 49.91891669947654
set time: 50.663624899461865
dict time: 49.959330099634826
set time: 50.653377699665725
dict time: 49.98555530048907
set time: 50.64655719976872
dict time: 49.945239200256765
set time: 50.65128379967064
dict time: 49.95342260040343
set time: 50.65899199992418
dict time: 49.92802210059017
set time: 50.67100259941071
dict time: 49.942902400158346
set time: 50.74889140017331
dict time: 49.994800799526274
set time: 50.731577299535275
dict time: 49.98310230020434
set time: 50.747778999619186
dict time: 49.99376400001347
set time: 50.73122859932482
dict time: 50.00640409998596
set time: 50.68737949989736
dict time: 49.94556000083685
set time: 50.722481600008905
dict time: 49.98192979954183
set time: 50.72525530029088
dict time: 49.99698970001191
set time: 50.736096899956465
dict time: 49.94320739991963
set time: 50.71096289996058
dict time: 49.972679699771106
set time: 50.71838010009378
dict time: 49.957800599746406
set time: 50.747396499849856
dict time: 49.97235369961709
set time: 50.69941039942205
dict time: 49.951399500481784
set time: 50.647985899820924
dict time: 49.94027389958501
set time: 50.66828709933907
dict time: 49.94174600020051
set time: 50.65279300045222
dict time: 49.96716000046581
set time: 50.64943030010909
dict time: 49.95117200072855
set time: 50.65525580011308
dict time: 49.962328700348735
set time: 50.66319840028882
dict time: 49.960031100548804
set time: 50.672181099653244
dict time: 49.93908840045333
set time: 50.651302699930966
dict time: 49.94130470044911
set time: 50.655242399312556
dict time: 50.04310019966215
set time: 50.67391949985176
dict time: 49.93010629992932
set time: 50.64970660023391
dict time: 49.991717299446464
set time: 50.65591560024768
dict time: 49.952454400248826
set time: 50.649492600001395
dict time: 49.92677689995617
set time: 50.635977199301124
dict time: 49.95432769972831
set time: 50.64075019955635
dict time: 49.94808299932629
set time: 50.664196100085974
dict time: 49.966013699769974
set time: 50.649582100100815
dict time: 49.9813024001196
set time: 50.64982909988612
dict time: 49.93897459935397
set time: 50.66509110014886
dict time: 49.95878900028765
set time: 50.649003400467336
dict time: 49.96674569975585
set time: 50.69693780038506
dict time: 49.91303739976138
set time: 50.675189800560474
dict time: 49.950330699793994
set time: 50.64532170072198
dict time: 49.95022019930184
set time: 50.65448010060936
dict time: 49.95197269972414
set time: 50.65391890052706
dict time: 49.94361769966781
set time: 50.67086180020124
dict time: 49.95455109979957
set time: 50.670443600043654
dict time: 49.94633509963751
set time: 50.65955980028957
dict time: 49.967472000047565
set time: 50.66301089990884
dict time: 49.95830660033971
set time: 50.67482869978994
dict time: 49.984512499533594
set time: 50.67321899998933
dict time: 50.01141999941319
set time: 50.84260869957507
dict time: 50.31206789985299
set time: 51.02959220018238
dict time: 50.28449110034853
set time: 51.03110689949244
dict time: 50.303432799875736
set time: 51.02032170072198
dict time: 50.281682999804616
set time: 51.05188430007547
dict time: 50.30898350011557
set time: 51.01742030028254
dict time: 50.3027657000348
set time: 51.02114639990032
dict time: 50.00038649979979
set time: 50.65360379964113
dict time: 49.93306410033256
set time: 50.63413709960878
dict time: 49.95266539976001
set time: 50.65499630011618
dict time: 49.94854210037738
set time: 50.703547400422394
dict time: 49.96691229939461
set time: 50.69470370002091
dict time: 49.95223430078477
set time: 50.70982529968023
dict time: 49.954243999905884
set time: 50.791720499284565
dict time: 49.97948960028589
set time: 50.69436000008136
dict time: 49.98102519940585
set time: 50.73820179980248
dict time: 49.96782180014998
set time: 50.722959300503135
dict time: 49.9863857999444
set time: 50.70789400022477
dict time: 49.9592831004411
set time: 50.707397900521755
dict time: 49.94034240022302
set time: 50.667025099508464
dict time: 49.96215169969946
set time: 50.72984409984201
dict time: 49.98776920046657
set time: 50.72097889985889
Sets were faster in 2 trials
Dicts were faster in 98 trials

Best answer

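When I tested your code, I thought the numbers might be a bit small. So I increased them by a factor of 10 and had random.sample draw keys at a ratio of 1 in every 100 numbers.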

import random
from time import time


def timeit(func):
    def wrap(*args):
        start = time()
        result = func(*args)
        return time() - start
    return wrap


def get_set_and_dict():
    universe = range(1, 10**8)
    keys = random.sample(universe, 10**6)
    lookups = random.sample(universe, 10**6)
    dict_set = dict((k, True) for k in keys)
    set_set = set(keys)
    return dict_set, set_set, lookups


@timeit
def test(container, lookups):
    for i in lookups:
        a = i in container


def main():
    dict_set, set_set, lookups = get_set_and_dict()
    acc_set = acc_dict = 0
    rounds = 100
    for _ in range(rounds):
        acc_dict += test(dict_set, lookups)
        acc_set += test(set_set, lookups)
    print("Set time: {:.4f}s\n Dict time: {:.4f}s".format(acc_set/rounds, acc_dict/rounds))

if __name__ == '__main__':
    main()

>> Set time: 0.1263s
>> Dict time: 0.1578s

Still, it makes sense that set and dict would differ: they are similar, but they are not the same thing.
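As a rough illustration of "similar but not the same", the sketch below builds a dict and a set from the same keys and checks that they agree on membership while being distinct types with their own `__contains__`. The key range and variable names are made up for the example, and the reported sizes depend on the Python version and platform.

```python
import sys

keys = list(range(1000))            # hypothetical sample keys
as_dict = dict.fromkeys(keys, True)
as_set = set(keys)

# Both containers answer membership questions identically...
agree = all((k in as_dict) == (k in as_set) for k in range(-10, 1010))
print("same answers:", agree)

# ...but each type supplies its own implementation of the `in` operator.
print(type(as_dict).__contains__)
print(type(as_set).__contains__)

# Their memory footprints can differ as well (container overhead only,
# not the keys themselves); the values vary by interpreter version.
print("dict bytes:", sys.getsizeof(as_dict))
print("set bytes:", sys.getsizeof(as_set))
```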


Perhaps the conclusion changes depending simply on how you set up the experiment.
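One way to make the setup less sensitive to noise is sketched below. It is not the method used by either poster: it assumes the same key and lookup sizes as the question, runs both containers through one shared function, and keeps the minimum of several `timeit.repeat` runs, which the `timeit` documentation suggests as the most meaningful figure. The helper names (`count_hits`, `best_time`), the fixed seed, and the repeat counts are hypothetical choices for the example.

```python
import random
import timeit

random.seed(0)  # assumed fixed seed so repeated runs use the same data
universe = range(1, 100000)
keys = random.sample(universe, 50000)
lookups = random.sample(universe, 50000)

containers = {
    "dict": dict.fromkeys(keys, True),
    "set": set(keys),
}

def count_hits(container, items):
    """Test every item for membership and return how many were found."""
    hits = 0
    for item in items:
        if item in container:
            hits += 1
    return hits

def best_time(container, number=5, repeats=3):
    """Return the fastest observed time, in seconds, for one pass over lookups."""
    timings = timeit.repeat(
        lambda: count_hits(container, lookups),
        number=number,
        repeat=repeats,
    )
    # The minimum is the run least disturbed by other system activity.
    return min(timings) / number

results = {name: best_time(c) for name, c in containers.items()}
for name, seconds in sorted(results.items()):
    print("{}: {:.6f} s per pass".format(name, seconds))
```

Because both containers go through the same function and the same lookup list, any remaining gap is more likely to come from the containers themselves than from the harness. Which one wins may still vary with the Python version, key distribution, and hardware.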

For "python - Why is testing containment with a dict faster than with a set?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/40633006/
