
python - itertools.product is slower than nested for loops

Reposted. Author: 太空狗. Updated: 2023-10-30 02:11:08

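I am trying to use the itertools.product function to make a piece of my code (in an isotope pattern simulator) easier to read and, hopefully, faster (the documentation states that no intermediate results are created). However, I profiled the two versions of the code against each other with cProfile and found that itertools.product is considerably slower than my nested for loops.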

Example values used for testing:

carbons = [(0.0, 0.004613223957020534), (1.00335, 0.02494768843632857), (2.0067, 0.0673219412049374), (3.0100499999999997, 0.12087054681917497), (4.0134, 0.16243239687902825), (5.01675, 0.17427700732161705), (6.020099999999999, 0.15550695260604208), (7.0234499999999995, 0.11869556397525197), (8.0268, 0.07911287899598853), (9.030149999999999, 0.04677626606764402)]
hydrogens = [(0.0, 0.9417611429667746), (1.00628, 0.05651245007201512)]
nitrogens = [(0.0, 0.16148864310897554), (0.99703, 0.2949830688288726), (1.99406, 0.26887643366755537), (2.99109, 0.16305943261399866), (3.98812, 0.0740163089529218), (4.98515, 0.026824040474519875), (5.98218, 0.008084687617425748)]
oxygens17 = [(0.0, 0.8269292736927519), (1.00422, 0.15717628899143962), (2.00844, 0.014907548827832968)]
oxygens18 = [(0.0, 0.3584191873916266), (2.00425, 0.36813434247849824), (4.0085, 0.18867830334103902), (6.01275, 0.06433912182670033), (8.017, 0.016421642936302827)]
sulfurs33 = [(0.0, 0.02204843659673093), (0.99939, 0.08442569434459646), (1.99878, 0.16131398792444965), (2.99817, 0.2050722764666321), (3.99756, 0.1951327596407101), (4.99695, 0.14824112268069747), (5.99634, 0.09365899226198841), (6.99573, 0.050618028523695714), (7.99512, 0.023888506307006133), (8.99451, 0.010000884811585533)]
sulfurs34 = [(0.0, 3.0106350597190195e-10), (1.9958, 6.747270089956428e-09), (3.9916, 7.54568412614702e-08), (5.9874, 5.614443102700176e-07), (7.9832, 3.1268212758750728e-06), (9.979, 1.3903197959791067e-05), (11.9748, 5.141248916434075e-05), (13.970600000000001, 0.0001626288218672788), (15.9664, 0.00044921518047309414), (17.9622, 0.0011007203440032396)]
sulfurs36 = [(0.0, 0.904828368500412), (3.99501, 0.0905009370374487)]

Code snippet demonstrating the nested for loops:

totals = []
for i in carbons:
    for j in hydrogens:
        for k in nitrogens:
            for l in oxygens17:
                for m in oxygens18:
                    for n in sulfurs33:
                        for o in sulfurs34:
                            for p in sulfurs36:
                                totals.append((i[0]+j[0]+k[0]+l[0]+m[0]+n[0]+o[0]+p[0], i[1]*j[1]*k[1]*l[1]*m[1]*n[1]*o[1]*p[1]))

Snippet demonstrating the use of itertools.product:

totals = []
for i in itertools.product(carbons,hydrogens,nitrogens,oxygens17,oxygens18,sulfurs33,sulfurs34,sulfurs36):
    massDiff = i[0][0]
    chance = i[0][1]
    for j in i[1:]:
        massDiff += j[0]
        chance = chance * j[1]
    totals.append((massDiff,chance))

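Profiling results (based on 10 runs of each method): the nested for loop approach averaged about 0.8 seconds, and the itertools.product approach averaged about 1.3 seconds. So my question is: am I using the itertools.product function incorrectly, or should I just stick with the nested for loops?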

-- Update --

I have included my two cProfile results:

# ITERTOOLS.PRODUCT APPROACH 
420003 function calls in 1.306 seconds

Ordered by: standard name

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.018 0.018 1.306 1.306 <string>:1(<module>)
1 1.246 1.246 1.289 1.289 IsotopeBas.py:64(option1)
420000 0.042 0.000 0.042 0.000 {method 'append' of 'list' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}

And:

# NESTED FOR LOOP APPROACH
420003 function calls in 0.830 seconds

Ordered by: standard name

ncalls tottime percall cumtime percall filename:lineno(function)
1 0.019 0.019 0.830 0.830 <string>:1(<module>)
1 0.769 0.769 0.811 0.811 IsotopeBas.py:78(option2)
420000 0.042 0.000 0.042 0.000 {method 'append' of 'list' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
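
The 420000 calls to append in both profiles match the number of combinations, which is the product of the lengths of the eight input lists. A minimal check, using only the list lengths read off the example values above (the variable names are illustrative):

```python
from functools import reduce
from operator import mul

# Lengths of carbons, hydrogens, nitrogens, oxygens17, oxygens18,
# sulfurs33, sulfurs34, sulfurs36 in the example values.
lengths = [10, 2, 7, 3, 5, 10, 10, 2]

# Number of tuples that either approach has to build and append.
n_combinations = reduce(mul, lengths, 1)
print(n_combinations)  # 420000
```

Both versions therefore do the same amount of appending; the difference in tottime comes from the work done per combination inside option1 and option2.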

Best answer

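Your original itertools code spent a lot of extra time in a needless lambda and in building lists of intermediate values by hand; much of that can be replaced with builtin functionality.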

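Now, the inner for loop does add quite a lot of extra overhead. Just try the following, and the performance is very much on par with your original code: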

totals = []
for a in itertools.product(carbons, hydrogens, nitrogens, oxygens17,
                           oxygens18, sulfurs33, sulfurs34, sulfurs36):
    # Unpack the 8-tuple once instead of looping over it.
    i, j, k, l, m, n, o, p = a
    totals.append((i[0]+j[0]+k[0]+l[0]+m[0]+n[0]+o[0]+p[0],
                   i[1]*j[1]*k[1]*l[1]*m[1]*n[1]*o[1]*p[1]))
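
To compare the variants without the per-call overhead that cProfile adds, a timeit-based harness along the following lines may help. This is only a sketch: the three short lists are hypothetical stand-ins for the isotope data, the function names are made up for illustration, and the timings will vary by machine and Python version.

```python
import itertools
import timeit

# Small stand-in data (hypothetical values, not the isotope lists above).
a = [(0.0, 0.5), (1.0, 0.3), (2.0, 0.2)]
b = [(0.0, 0.9), (1.5, 0.1)]
c = [(0.0, 0.6), (2.5, 0.4)]

def nested_loops():
    totals = []
    for i in a:
        for j in b:
            for k in c:
                totals.append((i[0] + j[0] + k[0], i[1] * j[1] * k[1]))
    return totals

def product_inner_loop():
    # itertools.product with a Python-level loop over each combination.
    totals = []
    for combo in itertools.product(a, b, c):
        mass_diff = combo[0][0]
        chance = combo[0][1]
        for item in combo[1:]:
            mass_diff += item[0]
            chance = chance * item[1]
        totals.append((mass_diff, chance))
    return totals

def product_unpacked():
    # itertools.product with the combination unpacked in the for statement.
    totals = []
    for i, j, k in itertools.product(a, b, c):
        totals.append((i[0] + j[0] + k[0], i[1] * j[1] * k[1]))
    return totals

# All three variants should produce the same list in the same order.
assert nested_loops() == product_inner_loop() == product_unpacked()

for fn in (nested_loops, product_inner_loop, product_unpacked):
    seconds = timeit.timeit(fn, number=10000)
    print(f"{fn.__name__}: {seconds:.4f} s")
```

With real data, the lists and the number of unpacked names would have to match the eight element lists in the question.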

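The following code runs as much as possible on the CPython builtin side, and I tested it to be equivalent to the code above. Notably, it uses zip(*iterable) to unzip each result of the product, then uses reduce with operator.mul for the product and sum for the sum, plus 2 generators to go through the lists. The for loop is still slightly faster, but because it is hardcoded, it is probably not something you can use in the long run.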

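import itertools
from functools import partial, reduce  # reduce is not a builtin in Python 3
from operator import mul

# Multiply all values of an iterable together.
prod = partial(reduce, mul)
elems = carbons, hydrogens, nitrogens, oxygens17, oxygens18, sulfurs33, sulfurs34, sulfurs36
p = itertools.product(*elems)

# zip(*i) splits each combination into its masses and its probabilities.
totals = [
    (sum(massdiffs), prod(chances))
    for massdiffs, chances in
    (zip(*i) for i in p)
]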
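
To see what zip(*i) does to a single combination, here is a minimal sketch on one hand-written tuple. The values are hypothetical and chosen so that the floating-point results are exact.

```python
from functools import reduce
from operator import mul

# One element as yielded by itertools.product(...):
# a tuple of (mass difference, probability) pairs.
combo = ((1.0, 0.5), (2.0, 0.25), (4.0, 0.5))

# zip(*combo) transposes the pairs into one tuple of masses
# and one tuple of probabilities.
massdiffs, chances = zip(*combo)
print(massdiffs)  # (1.0, 2.0, 4.0)
print(chances)    # (0.5, 0.25, 0.5)

# Sum the masses, multiply the probabilities.
total = (sum(massdiffs), reduce(mul, chances))
print(total)      # (7.0, 0.0625)
```

On Python 3.8 and later, math.prod could stand in for the reduce(mul, ...) call; whether that changes the timing would need to be measured.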

Regarding python - itertools.product is slower than nested for loops, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/24555457/
