
python - Why is a Keras model faster on a "bare" CPU?

Reposted · Author: 行者123 · Updated: 2023-12-01 08:54:14

I am running a Keras model... the bad part is that it runs faster without the CPU extensions (it should be the other way around). See the output below.

Is there a configuration file where the inter_op_parallelism option can be set?

---
 Using TensorFlow backend.
2018-10-18 17:21:32.620461: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-10-18 17:21:32.621535: I tensorflow/core/common_runtime/process_util.cc:69] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
Results: -33.20 (23.69) MSE

real 2m55.990s
user 4m8.784s
sys 3m50.192s
---
 Using TensorFlow backend.
2018-10-18 17:25:04.773578: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Results: -32.57 (23.16) MSE

real 1m48.847s
user 2m15.792s
sys 0m13.440s

Best Answer

This is the code I use with Keras; just put it at the top of your script.

import os

NUM_PARALLEL_EXEC_UNITS = 6  # set this to your number of physical cores

# The OpenMP/MKL variables must be in the environment before the
# TensorFlow runtime initializes, or they are silently ignored.
os.environ["OMP_NUM_THREADS"] = str(NUM_PARALLEL_EXEC_UNITS)
os.environ["KMP_BLOCKTIME"] = "30"
os.environ["KMP_SETTINGS"] = "1"
os.environ["KMP_AFFINITY"] = "granularity=fine,verbose,compact,1,0"

from keras import backend as K
import tensorflow as tf

# Limit intra-op parallelism to the physical cores and serialize
# independent ops (inter_op = 1) to avoid thread oversubscription.
config = tf.ConfigProto(intra_op_parallelism_threads=NUM_PARALLEL_EXEC_UNITS,
                        inter_op_parallelism_threads=1,
                        allow_soft_placement=True,
                        device_count={'CPU': NUM_PARALLEL_EXEC_UNITS})

session = tf.Session(config=config)
K.set_session(session)

Note: I was somewhat disappointed with the results. With these parameters alone, I got at most a 150% speedup.
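The answer above uses the TensorFlow 1.x API (tf.ConfigProto / tf.Session). In recent TensorFlow 2.x releases the same thread-pool knobs can also be set through environment variables, TF_NUM_INTRAOP_THREADS and TF_NUM_INTEROP_THREADS, or through tf.config.threading; the environment-variable names are based on newer TensorFlow releases, so verify them against your version. A minimal sketch:

```python
import os

# Assumed core count for illustration; match it to your machine.
NUM_PARALLEL_EXEC_UNITS = 6

# These must be in the environment before `import tensorflow`,
# otherwise the runtime initializes its thread pools with defaults.
os.environ["TF_NUM_INTRAOP_THREADS"] = str(NUM_PARALLEL_EXEC_UNITS)
os.environ["TF_NUM_INTEROP_THREADS"] = "1"
os.environ["OMP_NUM_THREADS"] = str(NUM_PARALLEL_EXEC_UNITS)
```

Only after these assignments should tensorflow be imported, since the thread pools are created once at startup.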

Regarding python - Why is a Keras model faster on a "bare" CPU?, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/52883145/
