
python - How to efficiently compute a linear combination of columns in pandas?

Reposted. Author: 行者123. Updated: 2023-11-30 22:33:48

I need to perform the following calculation:

priors['user_product'] = priors.eval('product_id + user_id*100000')

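where user_product is the new column I want to generate. However, because the priors DataFrame is large (3,000,000 rows, to be exact), the calculation takes a long time.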
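To make the target concrete, here is a small sketch of what the expression produces on a toy frame. The sample values are invented for illustration; only the column names and the expression come from the question.

```python
import pandas as pd

# Toy stand-in for the large `priors` frame; these values are made up.
priors = pd.DataFrame({"user_id": [1, 1, 2], "product_id": [196, 42, 196]})

# Same expression as in the question.
priors["user_product"] = priors.eval("product_id + user_id*100000")

print(priors["user_product"].tolist())  # [100196, 100042, 200196]
```

As long as product_id is a non-negative integer below 100000, each (user_id, product_id) pair maps to a distinct value.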

Best answer

If you want speed, you can use numpy, numexpr, or plain pandas.

Pandas

priors['user_product'] = priors.product_id + 100000 * priors.user_id

numpy

priors['user_product'] = priors.product_id.values + 100000 * priors.user_id.values

numexpr

import numexpr

# numexpr resolves the names in the expression from the local variables
pid = priors.product_id.values
uid = priors.user_id.values
priors['user_product'] = numexpr.evaluate('pid + 100000 * uid')
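Before comparing speed, it may be worth confirming that the variants agree. The following is a minimal sketch under stated assumptions: the integer test data, the seed, and the variable names via_eval, via_pandas, via_numpy and via_numexpr are not from the original answer, and numexpr is treated as an optional dependency.

```python
import numpy as np
import pandas as pd

try:
    import numexpr  # optional; the check is skipped if it is not installed
except ImportError:
    numexpr = None

# Hypothetical integer test data (the answer's own benchmark uses random floats).
rng = np.random.default_rng(0)
n = 1000
priors = pd.DataFrame({
    "product_id": rng.integers(1, 50_000, size=n),
    "user_id": rng.integers(1, 200_000, size=n),
})

via_eval = priors.eval("product_id + 100000 * user_id")
via_pandas = priors.product_id + 100000 * priors.user_id
via_numpy = priors.product_id.values + 100000 * priors.user_id.values

# All variants should produce identical values.
assert np.array_equal(via_eval.values, via_pandas.values)
assert np.array_equal(via_pandas.values, via_numpy)

if numexpr is not None:
    pid = priors.product_id.values
    uid = priors.user_id.values
    via_numexpr = numexpr.evaluate("pid + 100000 * uid")
    assert np.array_equal(via_numexpr, via_numpy)
```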

Timings

import numpy as np
import pandas as pd

n = 3000000
priors = pd.DataFrame(dict(product_id=np.random.rand(n), user_id=np.random.rand(n)))

%timeit priors['user_product'] = priors.eval('product_id + 100000 * user_id')
%timeit priors['user_product'] = priors.product_id.values + 100000 * priors.user_id.values
%timeit priors['user_product'] = priors.product_id + 100000 * priors.user_id

10 loops, best of 3: 31.6 ms per loop
100 loops, best of 3: 17.6 ms per loop
100 loops, best of 3: 18.5 ms per loop

%%timeit
pid = priors.product_id.values
uid = priors.user_id.values
priors['user_product'] = numexpr.evaluate('pid + 100000 * uid')

100 loops, best of 3: 13.6 ms per loop
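The %timeit and %%timeit magics above only work inside IPython or Jupyter. As a rough sketch, a similar comparison can be run from a plain script with the standard timeit module. The function name time_variants and its parameters are hypothetical, and the absolute numbers will differ by machine and library version.

```python
import timeit

import numpy as np
import pandas as pd


def time_variants(n=3_000_000, repeat=3):
    """Return best-of-`repeat` wall time in seconds for each variant."""
    priors = pd.DataFrame(dict(product_id=np.random.rand(n), user_id=np.random.rand(n)))

    def with_eval():
        priors["user_product"] = priors.eval("product_id + 100000 * user_id")

    def with_numpy():
        priors["user_product"] = priors.product_id.values + 100000 * priors.user_id.values

    def with_pandas():
        priors["user_product"] = priors.product_id + 100000 * priors.user_id

    variants = {"eval": with_eval, "numpy": with_numpy, "pandas": with_pandas}
    # One call per measurement; keep the fastest of `repeat` measurements.
    return {
        name: min(timeit.repeat(fn, number=1, repeat=repeat))
        for name, fn in variants.items()
    }


if __name__ == "__main__":
    for name, seconds in time_variants().items():
        print(f"{name}: {seconds * 1000:.1f} ms")
```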

Regarding "python - How to efficiently compute a linear combination of columns in pandas?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/45036515/
