
Python optimization in this code?


I have two fairly simple code snippets, and I'm running both of them a very large number of times; I'm trying to determine whether there is any optimisation I can do to speed up the execution time. If there is anything that stands out as something that could be done a lot faster...

In the first one, we have a list, fields. We also have a list of lists, weights. We are trying to find which weight list, multiplied by fields, will produce the largest sum. fields is around 30k entries long.

def find_best(weights, fields):
    winner = -1
    best = -float('inf')
    for c in range(num_category):
        score = 0
        for i in range(num_fields):
            score += float(fields[i]) * weights[c][i]
        if score > best:
            best = score
            winner = c
    return winner

In the second one, we are trying to update two of our weight lists; one gets increased and one gets decreased. The amount by which each element is increased/decreased is equal to the corresponding element in fields (e.g. if fields[4] = 10.5, then we want to increase weights[toincrease][4] by 10.5 and decrease weights[todecrease][4] by 10.5).

def update_weights(weights, fields, toincrease, todecrease):
    for i in range(num_fields):
        update = float(fields[i])
        weights[toincrease][i] += update
        weights[todecrease][i] -= update
    return weights

I hope this isn't an overly specific question.

Best answer

The thing you have to do when trying to optimise is profile and measure! Python provides the timeit module, which makes measuring easy!
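For example, a minimal sketch of timing a callable with timeit (the function and the numbers here are purely illustrative, not taken from the code above):

import timeit

def work():
    # a stand-in workload; replace with the function you actually want to measure
    return sum(i * i for i in range(1000))

# timeit.timeit calls `work` 1000 times and returns the total time in seconds;
# dividing by the call count gives the average time per call.
print("work: %.3f ms" % (timeit.timeit(work, number=1000) / 1000 * 1000))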

This assumes you have converted fields to a list of floats beforehand (outside any of these functions), since the string → float conversion is very slow. You can do this via fields = [float(f) for f in string_fields].

Also, pure Python is not very good for numerical processing, since it ends up doing a lot of type checking (and a few other things) for every operation. Using a C library like numpy will give big improvements.
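As a rough, untested sketch of that difference, here is a tiny comparison of a pure-Python accumulation loop against the equivalent numpy dot product on the same data (the array sizes are arbitrary):

import random, timeit
import numpy as np

xs = [random.random() for _ in range(3000)]
ys = [random.random() for _ in range(3000)]
npx, npy = np.array(xs), np.array(ys)

def python_loop():
    # every multiply and add goes through the interpreter and its type checks
    total = 0.0
    for a, b in zip(xs, ys):
        total += a * b
    return total

def numpy_dot():
    # the whole loop runs in C inside numpy
    return npx.dot(npy)

for fn in [python_loop, numpy_dot]:
    print("%s: %.4f ms" % (fn.__name__, timeit.timeit(fn, number=100) / 100 * 1000))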

find_best

I have incorporated the answers from others (and a few more variations) into a profiling suite (e.g. test_find_best.py):

import random, operator, numpy as np, itertools, timeit

fields = [random.random() for _ in range(3000)]
fields_string = [str(field) for field in fields]
weights = [[random.random() for _ in range(3000)] for c in range(100)]

npw = np.array(weights)
npf = np.array(fields)

num_fields = len(fields)
num_category = len(weights)

def f_original():
    winner = -1
    best = -float('inf')
    for c in range(num_category):
        score = 0
        for i in range(num_fields):
            score += float(fields_string[i]) * weights[c][i]
        if score > best:
            best = score
            winner = c

def f_original_no_string():
    winner = -1
    best = -float('inf')
    for c in range(num_category):
        score = 0
        for i in range(num_fields):
            score += fields[i] * weights[c][i]
        if score > best:
            best = score
            winner = c

def f_original_xrange():
    winner = -1
    best = -float('inf')
    for c in xrange(num_category):
        score = 0
        for i in xrange(num_fields):
            score += fields[i] * weights[c][i]
        if score > best:
            best = score
            winner = c


# Zenon http://stackoverflow.com/a/10134298/1256624

def f_index_comprehension():
    winner = -1
    best = -float('inf')
    for c in range(num_category):
        score = sum(fields[i] * weights[c][i] for i in xrange(num_fields))
        if score > best:
            best = score
            winner = c


# steveha http://stackoverflow.com/a/10134247/1256624

def f_comprehension():
    winner = -1
    best = -float('inf')

    for c in xrange(num_category):
        score = sum(f * w for f, w in itertools.izip(fields, weights[c]))
        if score > best:
            best = score
            winner = c

def f_schwartz_original(): # https://en.wikipedia.org/wiki/Schwartzian_transform
    tup = max(((i, sum(t[0] * t[1] for t in itertools.izip(fields, wlist))) for i, wlist in enumerate(weights)),
              key=lambda t: t[1]
             )

def f_schwartz_opt(): # https://en.wikipedia.org/wiki/Schwartzian_transform
    tup = max(((i, sum(f * w for f, w in itertools.izip(fields, wlist))) for i, wlist in enumerate(weights)),
              key=operator.itemgetter(1)
             )

def fweight(field_float_list, wlist):
    f = iter(field_float_list)
    return sum(f.next() * w for w in wlist)

def f_schwartz_iterate():
    tup = max(
        ((i, fweight(fields, wlist)) for i, wlist in enumerate(weights)),
        key=lambda t: t[1]
    )

# Nolen Royalty http://stackoverflow.com/a/10134147/1256624

def f_numpy_mult_sum():
    np.argmax(np.sum(npf * npw, axis=1))


# me

def f_imap():
    winner = -1
    best = -float('inf')

    for c in xrange(num_category):
        score = sum(itertools.imap(operator.mul, fields, weights[c]))
        if score > best:
            best = score
            winner = c

def f_numpy():
    np.argmax(npw.dot(npf))



for f in [f_original,
          f_index_comprehension,
          f_schwartz_iterate,
          f_original_no_string,
          f_schwartz_original,
          f_original_xrange,
          f_schwartz_opt,
          f_comprehension,
          f_imap]:
    print "%s: %.2f ms" % (f.__name__, timeit.timeit(f, number=10) / 10 * 1000)
for f in [f_numpy_mult_sum, f_numpy]:
    print "%s: %.2f ms" % (f.__name__, timeit.timeit(f, number=100) / 100 * 1000)

Running python test_find_best.py gives me:

f_original: 310.34 ms
f_index_comprehension: 102.58 ms
f_schwartz_iterate: 103.39 ms
f_original_no_string: 96.36 ms
f_schwartz_original: 90.52 ms
f_original_xrange: 89.31 ms
f_schwartz_opt: 69.48 ms
f_comprehension: 68.87 ms
f_imap: 53.33 ms
f_numpy_mult_sum: 3.57 ms
f_numpy: 0.62 ms

So the numpy version using .dot (sorry, I can't find the documentation for it at the moment) is the fastest. If you are doing a lot of numerical operations (and it seems you are), it might be worth converting fields and weights into numpy arrays as soon as you create them.
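If you do that, the whole of find_best can collapse to a couple of numpy calls. A minimal sketch (assuming weights is already a 2-D numpy array of shape (num_category, num_fields) and fields a 1-D float array; the name find_best_numpy is mine, not from the question):

import numpy as np

def find_best_numpy(weights, fields):
    # weights.dot(fields) gives one score per category;
    # argmax returns the index of the highest score, i.e. the winner.
    return int(np.argmax(weights.dot(fields)))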

update_weights

Numpy is likely to offer a similar speed-up for update_weights, by doing something like:

def update_weights(weights, fields, to_increase, to_decrease):
    weights[to_increase,:] += fields
    weights[to_decrease,:] -= fields
    return weights

(I haven't tested or profiled this one, by the way; you will need to do that.)
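A quick, equally untested way to sanity-check it (assuming the numpy version of update_weights above, with weights as a 2-D float array and fields a 1-D array of matching length):

import numpy as np

weights = np.zeros((3, 4))
fields = np.array([1.0, 2.5, 10.5, 0.5])

weights = update_weights(weights, fields, to_increase=0, to_decrease=2)
print(weights[0])   # should equal fields
print(weights[2])   # should equal -fields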

Regarding "Python optimization in this code?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/10134038/
