
python - Neural network Python error "Failed to get convolution algorithm"

Reposted. Author: 行者123. Updated: 2023-12-01 06:46:23

I am trying to run some neural network code in Python. It ran fine on Google Colab. I then moved the code to a Jupyter Notebook on a remote machine with a GPU.

It runs fine until I try to fit the model with:

history = model.fit_generator(generator=training_generator, validation_data=validation_generator, use_multiprocessing=True, workers=1, epochs=100, shuffle=True, verbose=1)

The full error message is below. I don't know where to start in understanding what it means, so I am asking for help. Thanks in advance:

UnknownError                              Traceback (most recent call last)
<ipython-input-15-d3d33225fec8> in <module>
1 # Train model on dataset
----> 2 history = model.fit_generator(generator=training_generator, validation_data=validation_generator, use_multiprocessing=True, workers=1, epochs=100, shuffle=True, verbose=1)

~/miniconda3/lib/python3.7/site-packages/keras/legacy/interfaces.py in wrapper(*args, **kwargs)
89 warnings.warn('Update your `' + object_name + '` call to the ' +
90 'Keras 2 API: ' + signature, stacklevel=2)
---> 91 return func(*args, **kwargs)
92 wrapper._original_function = func
93 return wrapper

~/miniconda3/lib/python3.7/site-packages/keras/engine/training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
1416 use_multiprocessing=use_multiprocessing,
1417 shuffle=shuffle,
-> 1418 initial_epoch=initial_epoch)
1419
1420 @interfaces.legacy_generator_methods_support

~/miniconda3/lib/python3.7/site-packages/keras/engine/training_generator.py in fit_generator(model, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
215 outs = model.train_on_batch(x, y,
216 sample_weight=sample_weight,
--> 217 class_weight=class_weight)
218
219 outs = to_list(outs)

~/miniconda3/lib/python3.7/site-packages/keras/engine/training.py in train_on_batch(self, x, y, sample_weight, class_weight)
1215 ins = x + y + sample_weights
1216 self._make_train_function()
-> 1217 outputs = self.train_function(ins)
1218 return unpack_singleton(outputs)
1219

~/miniconda3/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py in __call__(self, inputs)
2713 return self._legacy_call(inputs)
2714
-> 2715 return self._call(inputs)
2716 else:
2717 if py_any(is_tensor(x) for x in inputs):

~/miniconda3/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py in _call(self, inputs)
2673 fetched = self._callable_fn(*array_vals, run_metadata=self.run_metadata)
2674 else:
-> 2675 fetched = self._callable_fn(*array_vals)
2676 return fetched[:len(self.outputs)]
2677

~/miniconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py in __call__(self, *args, **kwargs)
1437 ret = tf_session.TF_SessionRunCallable(
1438 self._session._session, self._handle, args, status,
-> 1439 run_metadata_ptr)
1440 if run_metadata:
1441 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

~/miniconda3/lib/python3.7/site-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
526 None, None,
527 compat.as_text(c_api.TF_Message(self.status.status)),
--> 528 c_api.TF_GetCode(self.status.status))
529 # Delete the underlying status object from memory otherwise it stays alive
530 # as there is a reference to status from this from the traceback due to

UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node conv2d_1/convolution}}]]
[[{{node metrics/acc/Mean}}]]

Best answer

As @thushv89 said, this is a compatibility issue between the TF binary and the installed cuDNN version.

You can check your TensorFlow version with:

python -c 'import tensorflow as tf; print(tf.__version__);'
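The same check can be run from a notebook cell, which is the environment where the error appeared. This is a minimal sketch, not part of the original answer: `describe_tensorflow` is a hypothetical helper name, and it assumes TensorFlow may or may not be importable in the current kernel. Note that `tf.test.is_built_with_cuda()` only reports whether the binary was compiled with CUDA support; it does not confirm that cuDNN can initialize.

```python
def describe_tensorflow():
    """Return the TensorFlow version and whether the build has CUDA support.

    Hypothetical helper for illustration. Both values are None when
    TensorFlow cannot be imported in this environment.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return {"version": None, "built_with_cuda": None}
    return {
        "version": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
    }


print(describe_tensorflow())
```

If `built_with_cuda` comes back `False`, the installed package is a CPU-only build and the CUDA/cuDNN versions are not the issue.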

Then look up the required CUDA/cuDNN versions here: https://www.tensorflow.org/install/source#tested_build_configurations

Note: the CUDA/cuDNN versions listed there apply only to the official TF releases. With conda there should be a better way to handle this.

Then check your CUDA version:

nvcc --version
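To compare the result against the table programmatically, the release number can be pulled out of the `nvcc --version` output. The sketch below assumes the output contains a fragment of the form `release 10.0`; the function names are hypothetical. It reports the toolkit whose `nvcc` is first on `PATH`, which may not be the one TensorFlow actually loads.

```python
import re
import subprocess


def parse_nvcc_release(text):
    """Extract (major, minor) from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_cuda_version():
    """Run `nvcc --version`; return None if nvcc cannot be executed."""
    try:
        result = subprocess.run(
            ["nvcc", "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
    except OSError:
        # nvcc is not installed or not on PATH.
        return None
    return parse_nvcc_release(result.stdout)


print(installed_cuda_version())
```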

Then check your cuDNN version with one of the following:

cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
cat /usr/include/cudnn.h | grep CUDNN_MAJOR -A 2
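The same lookup can be sketched in Python by reading the `CUDNN_MAJOR`, `CUDNN_MINOR`, and `CUDNN_PATCHLEVEL` macros from the header. The helper names are hypothetical, and the two paths are simply the ones from the commands above; other installs may keep the header elsewhere. On cuDNN 8 and later these macros live in `cudnn_version.h` rather than `cudnn.h`, so that file may need to be added to the list.

```python
import re
from pathlib import Path

# Header locations taken from the commands above; adjust for your install.
CANDIDATE_HEADERS = [
    "/usr/local/cuda/include/cudnn.h",
    "/usr/include/cudnn.h",
]


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from header text, or None if any is missing."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#define\s+" + name + r"\s+(\d+)"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


def installed_cudnn_version(paths=CANDIDATE_HEADERS):
    """Return the version from the first header that defines it, or None."""
    for path in paths:
        header = Path(path)
        if header.is_file():
            version = parse_cudnn_version(header.read_text(errors="ignore"))
            if version is not None:
                return version
    return None


print(installed_cudnn_version())
```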

Regarding python - Neural network Python error "Failed to get convolution algorithm", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/59203601/
