
python - xtensor: cannot reach NumPy performance

Reposted. Author: 行者123. Updated: 2023-11-30 04:45:09

I am learning xtensor and would like to get performance equal to or better than NumPy. Unfortunately I have not managed to, and I need help.

I ran a benchmark similar to the one here:

Performance of xtensor types vs. NumPy for simple reduction

Here is the C++ code, which uses pybind11 and xtensor-python:

bench.cpp

#include <iostream>

#define XTENSOR_USE_XSIMD
#include "xtensor/xtensor.hpp"
#include "xtensor/xfixed.hpp"
#include "xtensor/xarray.hpp"
#include "xtensor/xio.hpp"
#include "xtensor/xview.hpp"
#define FORCE_IMPORT_ARRAY // numpy C api loading
#include "xtensor-python/pytensor.hpp"
#include "xtensor-python/pyarray.hpp"

namespace py = pybind11;

// Lazy reduction: xt::sum builds an expression that is evaluated on call.
inline double sum_pytensor(xt::pytensor<double, 1> &m)
{
    return xt::sum(m)();
}

// Immediate reduction: the sum is computed eagerly.
inline double sum_pytensor_immediate(xt::pytensor<double, 1> &m)
{
    return xt::sum(m, xt::evaluation_strategy::immediate)();
}

PYBIND11_MODULE(xtensor_basics, m)
{
    xt::import_numpy();

    m.def("compute_xtensor", &sum_pytensor);
    m.def("compute_xtensor_immediate", &sum_pytensor_immediate);
}

I build this with CMake:

CMakeLists.txt

cmake_minimum_required(VERSION 2.8.12)
project(xtensor_basics)

add_definitions(-DXTENSOR_ENABLE_XSIMD) # <-- does this do anything?
add_definitions(-DXTENSOR_USE_XSIMD)
add_subdirectory(pybind11)
pybind11_add_module(xtensor_basics bench.cpp)

include_directories(/home/--user--/include)
include_directories(/home/--user--/.miniconda3/lib/python3.7/site-packages/numpy/core/include)

Running the command cmake . && make then produces xtensor_basics.cpython-37m-x86_64-linux-gnu.so

Then I run the benchmark with this Python file: bench.py

import timeit

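def time_each(func_names, sizes):
    # The setup string is executed by timeit, so its lines stay unindented.
    setup = f'''
import numpy; import xtensor_basics
arr = numpy.random.randn({sizes})
'''
    # Best of 3 repeats, each timing 100 calls of func(arr).
    tim = lambda func: min(timeit.Timer(f'{func}(arr)',
                                        setup=setup).repeat(3, 100))
    return [tim(func) for func in func_names]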

from functools import partial

sizes = [10 ** i for i in range(7)]
funcs = ['numpy.sum',
'xtensor_basics.compute_xtensor_immediate',
'xtensor_basics.compute_xtensor']
sum_timer = partial(time_each, funcs)
times = list(map(sum_timer, sizes))
print(times)
from matplotlib import pyplot as plt

plt.figure(figsize=(5, 10))  # plt.figure creates and registers the figure; plt.Figure only constructs an unused object
plt.plot(times)
plt.legend(["numpy", "xtensor_immediate", "xtensor"])
plt.show()
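
The raw timings span seven orders of array size, so the small sizes are hard to compare on a linear plot. A minimal sketch, assuming `times` has the shape produced by the script above (one row per size, columns in the order of `funcs`), that expresses each timing relative to the numpy column. The helper name `relative_to_baseline` is hypothetical, and the numbers in the example are made up for illustration, not measurements.

```python
def relative_to_baseline(times, baseline=0):
    """Divide every timing in a row by that row's baseline column.

    `times` is a list of rows, one per array size; `baseline` is the
    column index of the reference function (0 = numpy.sum above).
    """
    return [[t / row[baseline] for t in row] for row in times]


# Illustrative, made-up timings: [numpy, xtensor_immediate, xtensor]
example = [[1.0, 2.0, 4.0],
           [10.0, 15.0, 5.0]]
ratios = relative_to_baseline(example)
print(ratios)  # values above 1.0 mean slower than numpy for that size
```

Plotting `ratios` instead of `times` puts every size on the same scale, with numpy as a flat line at 1.0.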

Results:

[image: benchmark plot, not preserved]

Directory (after the build)

bench.cpp
bench.py
CMakeCache.txt
CMakeFiles
cmake_install.cmake
CMakeLists.txt
Makefile
pybind11 <--- cloned from the repo
xtensor_basics.cpython-37m-x86_64-linux-gnu.so

Include directory: the folder containing all the headers (I did not build these libraries, I only copied the headers)

$ ls /home/--user--/include -1
xflens
xsimd
xtensor
xtensor-blas
xtensor-python
xtl

System

Ubuntu 18.04
g++ 7.4.0
numpy 1.16.4
openblas 0.2.20
python 3.7.3
xtensor 0.20.8

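Question: which flags, definitions, etc. should I add to get the same performance?

Thanks in advance.

Edit 1: When I build with cmake -DCMAKE_BUILD_TYPE=Release . (that is, with optimizations enabled), the results improve, but xtensor is still slower: [image: benchmark plot, not preserved]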

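Best answer

Change CMakeLists.txt slightly (the added flags go into CMAKE_CXX_FLAGS_RELEASE, so they only apply when building with -DCMAKE_BUILD_TYPE=Release):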

cmake_minimum_required(VERSION 2.8.12)
project(xtensor_basics)

add_definitions(-DXTENSOR_ENABLE_XSIMD)
add_definitions(-DXTENSOR_USE_XSIMD)
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} -O3 -mavx2 -ffast-math")
# ^^^^^^^^^^^^^^^^^^^

add_subdirectory(pybind11)
pybind11_add_module(xtensor_basics bench.cpp)

include_directories(/home/--user--/include)
include_directories(/home/--user--/.miniconda3/lib/python3.7/site-packages/numpy/core/include)

And then... victory! [image: benchmark plot, not preserved]
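
One plausible reason the extra flags help (the answer itself does not explain it): floating-point addition is not associative, so without -ffast-math GCC will not reorder a sum into several independent SIMD accumulators, and -mavx2 is what makes 256-bit registers holding four doubles available. A minimal pure-Python sketch of the two summation orders; the function names `sequential_sum` and `lane_sum` are hypothetical and only illustrate the reordering, not what xtensor or NumPy do internally.

```python
import math


def sequential_sum(values):
    """Add the values strictly left to right with one accumulator."""
    total = 0.0
    for v in values:
        total += v
    return total


def lane_sum(values, lanes=4):
    """Add the values into `lanes` interleaved accumulators, then combine.

    This mimics the order of operations of a 4-wide SIMD reduction
    (four doubles fit in one AVX2 register).
    """
    acc = [0.0] * lanes
    for i, v in enumerate(values):
        acc[i % lanes] += v
    return sequential_sum(acc)


# Grouping changes the rounding, so the compiler may not reorder by default.
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False

data = [0.1 * (i % 7) for i in range(1000)]
a = sequential_sum(data)
b = lane_sum(data)
print(a, b, math.isclose(a, b, rel_tol=1e-9))
```

The two results agree to within rounding error but are not guaranteed to be bit-identical, which is the trade-off -ffast-math accepts in exchange for vectorized reductions.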

Regarding "python - xtensor: cannot reach NumPy performance", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57407106/
