python - ufunc memory consumption in arithmetic expressions

Reposted. Author: 行者123. Updated: 2023-12-01 01:53:08
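What is the memory consumption of an arithmetic numpy expression such as

vec ** 3 + vec ** 2 + vec

(where vec is a numpy.ndarray)? Is an array stored for each intermediate operation? Can such a compound expression need several times the memory of the underlying ndarray?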

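Best answer

You are right: a new array is allocated for each intermediate result. Fortunately, the numexpr package was designed to address exactly this problem. From its description: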

The main reason why NumExpr achieves better performance than NumPy is that it avoids allocating memory for intermediate results. This results in better cache utilization and reduces memory access in general. Due to this, NumExpr works best with large arrays.
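To see the intermediate allocations directly, one option is to measure the peak memory of the plain NumPy expression with the standard-library tracemalloc module. The following is only a rough sketch: the helper name peak_extra_bytes is made up for this illustration, it assumes a NumPy build that reports its array buffers to tracemalloc, and the exact ratio can vary because NumPy may reuse a temporary's buffer in some cases.

```python
import tracemalloc

import numpy as np


def peak_extra_bytes(func, *args):
    """Hypothetical helper: peak bytes traced by tracemalloc while func(*args) runs."""
    tracemalloc.start()
    try:
        base, _ = tracemalloc.get_traced_memory()
        func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak - base


# vec is created before tracing starts, so only temporaries and the result are counted.
vec = np.random.rand(1_000_000)
extra = peak_extra_bytes(lambda v: v ** 3 + v ** 2 + v, vec)

print(f"input array : {vec.nbytes / 1e6:.1f} MB")
print(f"peak extra  : {extra / 1e6:.1f} MB ({extra / vec.nbytes:.1f}x the input)")
```

Both vec ** 3 and vec ** 2 have to exist at the same time before they can be added, so the peak should be at least twice the size of the input array.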

Example:

In [97]: xs = np.random.rand(1_000_000)

In [98]: %timeit xs ** 3 + xs ** 2 + xs
26.8 ms ± 371 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [99]: %timeit numexpr.evaluate('xs ** 3 + xs ** 2 + xs')
1.43 ms ± 20.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

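Thanks to @max9111 for pointing out that numexpr simplifies powers into multiplications. It appears that most of the difference in the benchmark is explained by the optimization of xs ** 3.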

In [421]: %timeit xs * xs
1.62 ms ± 12 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [422]: %timeit xs ** 2
1.63 ms ± 10.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [423]: %timeit xs ** 3
22.8 ms ± 283 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [424]: %timeit xs * xs * xs
2.52 ms ± 58.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
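The timings above suggest that plain NumPy can also benefit from avoiding the general power routine. A minimal sketch, assuming only NumPy is available: rewrite the polynomial in Horner form, ((xs + 1) * xs + 1) * xs, and use in-place operators so that only one new array is allocated. The function name poly_horner_inplace is hypothetical, and this is not what numexpr does internally; it is just one way to reduce temporaries by hand.

```python
import numpy as np


def poly_horner_inplace(xs):
    """Evaluate xs**3 + xs**2 + xs using multiplications and one output buffer."""
    out = xs + 1.0   # the only new allocation: xs + 1
    out *= xs        # in place: xs**2 + xs
    out += 1.0       # in place: xs**2 + xs + 1
    out *= xs        # in place: xs**3 + xs**2 + xs
    return out


xs = np.random.rand(1_000_000)
reference = xs ** 3 + xs ** 2 + xs
result = poly_horner_inplace(xs)

# The two results agree up to floating-point rounding.
print(np.allclose(result, reference))
```

The input array is left untouched because the in-place operators act only on the freshly allocated out buffer.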

Regarding python - ufunc memory consumption in arithmetic expressions, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50528634/