
python - Why is inference with tf.keras 75x slower than with TFLite?

Reposted. Author: 行者123. Updated: 2023-12-04 15:27:53

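I am running some code that uses a simple CNN to make predictions on audio data.

With tf.keras.Model.predict I get an average execution time of 0.17 s, whereas with tf.lite.Interpreter I get 0.002 s, which is roughly 75x faster! I tried this on my desktop (Ubuntu 18.04, TF 2.1) and on a Raspberry Pi 3B+ (Raspbian Buster, same code), with roughly the same results.

Why is the difference so large?

Update: I set batch_size=1 in tf.keras.Model.predict, and it is now 65x slower than TFLite.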

test_tflite.py

import os
import pathlib
import tensorflow as tf
from tensorflow.keras.models import model_from_json
import numpy as np
import time


# disable GPU
tf.config.set_visible_devices([], 'GPU')


parent = pathlib.Path(__file__).parent.absolute()

# path to Tensorflow model and weights
MODEL_PATH = os.path.join(parent, 'models/vd_model.json')
WEIGHTS_PATH = os.path.join(parent, 'models/model.30-0.97.h5')
INPUT_SHAPE = (1, 43, 40, 1)

NUM_RUN = 100


def predict_tflite(interpreter, input_details, output_details, data):
    interpreter.set_tensor(input_details[0]['index'], data)
    interpreter.invoke()
    output_data = interpreter.get_tensor(output_details[0]['index'])
    return output_data


def run():

    # Load Tensorflow model
    with open(MODEL_PATH, 'r') as f:
        model = model_from_json(f.read())
    model.load_weights(WEIGHTS_PATH)

    # Show model
    model.summary()

    # Convert to TFLite
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    tflite_model = converter.convert()
    interpreter = tf.lite.Interpreter(model_content=tflite_model)
    interpreter.allocate_tensors()
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    predictions = []
    for i in range(NUM_RUN):

        # fake input data
        data = np.random.rand(*INPUT_SHAPE).astype(np.float32)

        # Tensorflow
        start_time = time.time()
        prediction = model.predict(data, batch_size=1)
        elapsed = time.time() - start_time

        # Tensorflow Lite
        start_time = time.time()
        prediction_tflite = predict_tflite(interpreter, input_details, output_details, data)
        elapsed_tflite = time.time() - start_time

        predictions.append(((elapsed, prediction), (elapsed_tflite, prediction_tflite)))

    # Make sure predictions are close
    for pred_tf, pred_tflite in predictions:
        if not np.all(np.isclose(pred_tf[1], pred_tflite[1])):
            print('Predictions are not close')

    # Compute average execution times
    tf_avg = np.mean([p[0] for p, _ in predictions])
    tflite_avg = np.mean([p[0] for _, p in predictions])

    print(f'TF: {tf_avg:.6f}')
    print(f'TFLite: {tflite_avg:.6f}')


if __name__ == "__main__":
    run()

Execution (Raspberry Pi):
pi@raspberrypi:~/src/audio_monitoring/audio_monitoring/tests $ python3 test_tflite.py 
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 43, 40, 16)        160
_________________________________________________________________
batch_normalization (BatchNo (None, 43, 40, 16)        64
_________________________________________________________________
activation (Activation)      (None, 43, 40, 16)        0
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 22, 20, 16)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 22, 20, 32)        4640
_________________________________________________________________
batch_normalization_1 (Batch (None, 22, 20, 32)        128
_________________________________________________________________
activation_1 (Activation)    (None, 22, 20, 32)        0
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 1, 1, 32)          0
_________________________________________________________________
dropout (Dropout)            (None, 1, 1, 32)          0
_________________________________________________________________
flatten (Flatten)            (None, 32)                0
_________________________________________________________________
dense (Dense)                (None, 4)                 132
=================================================================
Total params: 5,124
Trainable params: 5,028
Non-trainable params: 96
_________________________________________________________________

TF average prediction time: 0.168310s
TFLite average prediction time: 0.002269s
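
The averages above include the very first call on each path, which may carry one-time setup cost, and they rely on time.time(), which is not intended for fine-grained timing. Below is a minimal sketch of a timing helper that discards warm-up calls and uses time.perf_counter. The names `benchmark` and `dummy_inference` are hypothetical and not part of the original script; the dummy workload is only there so the sketch runs without TensorFlow.

```python
import statistics
import time


def benchmark(fn, num_warmup=5, num_runs=100):
    """Return (mean, median) seconds per call of fn(), excluding warm-up calls."""
    # Warm-up calls are executed but not timed.
    for _ in range(num_warmup):
        fn()

    timings = []
    for _ in range(num_runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)

    return statistics.mean(timings), statistics.median(timings)


def dummy_inference():
    # Hypothetical stand-in for a model call.
    return sum(i * i for i in range(1000))


mean_s, median_s = benchmark(dummy_inference, num_warmup=2, num_runs=20)
print(f'mean: {mean_s:.6f}s, median: {median_s:.6f}s')
```

In the script above, the same helper could wrap each path, for example `benchmark(lambda: model.predict(data, batch_size=1))` and `benchmark(lambda: predict_tflite(interpreter, input_details, output_details, data))`. Timing a direct call such as `model(data, training=False)` alongside them may also help show how much of the gap comes from per-call overhead in `predict` rather than from the computation itself.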

Best answer

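There can be many reasons behind this performance difference, but to summarize:

  • When a model is converted to TFLite, several graph optimizations are applied (constant folding, op fusion, etc.).
  • At conversion time, a static execution plan is determined ahead of time.
  • Even on CPU, TFLite often provides optimized kernel implementations for specific CPU architectures (for example, NEON on ARM).

  • That said, not every TensorFlow model can be converted to TFLite, because TFLite supports only a subset of the operations that TensorFlow supports.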
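
To make the first bullet concrete, here is a minimal NumPy sketch of one kind of op fusion: folding an inference-time batch normalization into the weights and bias of the layer before it, so that two ops become one. This only illustrates the idea and is not the TFLite converter's actual code. The function name `fold_batch_norm` and the toy shapes are assumptions, and a dense layer stands in for the Conv2D layers of the model in the question.

```python
import numpy as np


def fold_batch_norm(weights, bias, gamma, beta, mean, var, eps=1e-3):
    """Fold inference-time batch norm into the preceding linear layer.

    The last axis of `weights` is assumed to be the output-channel axis.
    """
    scale = gamma / np.sqrt(var + eps)
    folded_weights = weights * scale            # broadcast over output channels
    folded_bias = (bias - mean) * scale + beta
    return folded_weights, folded_bias


rng = np.random.default_rng(0)
eps = 1e-3

# Toy layer: 5 inputs, 3 outputs, followed by batch norm over the 3 outputs.
x = rng.standard_normal((8, 5))
w = rng.standard_normal((5, 3))
b = rng.standard_normal(3)
gamma = rng.uniform(0.5, 1.5, 3)
beta = rng.standard_normal(3)
mean = rng.standard_normal(3)
var = rng.uniform(0.5, 2.0, 3)

# Unfused: linear layer, then batch norm as a separate op.
y = x @ w + b
unfused = gamma * (y - mean) / np.sqrt(var + eps) + beta

# Fused: a single linear layer with folded parameters.
fw, fb = fold_batch_norm(w, b, gamma, beta, mean, var, eps)
fused = x @ fw + fb

print(np.allclose(unfused, fused))
```

Both paths produce the same output up to floating-point rounding, but the fused version does the batch norm arithmetic once, ahead of time, instead of on every inference call.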

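I think you will find this tech talk interesting. Please take a look.

https://www.youtube.com/watch?v=gHN0jDbJz8E

Regarding "python - Why is inference with tf.keras 75x slower than with TFLite?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/61889508/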
