
linux - Unable to run TensorFlow on GPU

Reposted. Author: 太空狗. Updated: 2023-10-29 12:42:13

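I cannot run the TF-CUDA tutorials_example_trainer described in the installation guide (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/get_started/os_setup.md#installing-from-sources)

I have had problems with the CUDA libraries before, but those involved graphics-related demos.

All the details are below. Thanks in advance for any help.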

Environment info

Operating system: Debian Stretch

Installed versions of CUDA and cuDNN: 8.0, 5.0

If installed from source, provide

  1. Commit hash: 554ddd9ad2d4abad5a9a31f2d245f0b1012f0d10
  2. Output of bazel version: Build label: 0.3.0; Build target: bazel-out/local-fastbuild/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar; Build time: Fri Jun 10 11:38:23 2016 (1465558703)

Steps to reproduce

  1. Build from source with the 367.35 driver
  2. Run bazel-bin/tensorflow/cc/tutorials_example_trainer --use_gpu

Logs or other useful output

bazel-bin/tensorflow/cc/tutorials_example_trainer --use_gpu
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
modprobe: ERROR: ../libkmod/libkmod-module.c:832 kmod_module_insert_module() could not find module by name='nvidia_367_uvm'
modprobe: ERROR: could not insert 'nvidia_367_uvm': Unknown symbol in module, or unknown parameter (see dmesg)
E tensorflow/stream_executor/cuda/cuda_driver.cc:491] failed call to cuInit: CUDA_ERROR_UNKNOWN
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:153] retrieving CUDA diagnostic information for host: debian
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:160] hostname: debian
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:185] libcuda reported version is: 367.35.0
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:356] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module 367.35 Mon Jul 11 23:14:21 PDT 2016
GCC version: gcc version 5.4.0 20160609 (Debian 5.4.0-6)
"""
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] kernel reported version is: 367.35.0
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:293] kernel version seems to match DSO: 367.35.0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:81] No GPU devices available on machine.
F tensorflow/cc/tutorials/example_trainer.cc:125] Check failed: ::tensorflow::Status::OK() == (session->Run({{"x", x}}, {"y:0", "y_normalized:0"}, {}, &outputs)) (OK vs. Invalid argument: Cannot assign a device to node 'y': Could not satisfy explicit device specification '/gpu:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
[[Node: y = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/gpu:0"](Const, x)]])
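
Most of the log above is informational. As a reading aid, a minimal shell sketch that keeps only the warning, error, and fatal lines plus the modprobe errors might look like this. It assumes the one-letter severity prefixes seen in the log (I, W, E, F); `filter_problems` is a hypothetical helper name, not something TensorFlow provides.

```shell
# Sketch: keep only the lines of a saved or piped run log that signal a problem.
# Assumptions: one-letter severity prefixes as in the log above
# (I = info, W = warning, E = error, F = fatal) and modprobe's
# "modprobe: ERROR" lines.
# filter_problems is a hypothetical helper name, not a TensorFlow tool.
filter_problems() {
  # With no file arguments, grep reads standard input.
  grep -E '^([WEF] tensorflow/|modprobe: ERROR)' "$@"
}
```

Piping the run through it, for example `bazel-bin/tensorflow/cc/tutorials_example_trainer --use_gpu 2>&1 | filter_problems`, would leave the two modprobe errors, the cuInit failure, and the first line of the fatal check from the log above.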

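Best answer

The error message indicates that your GPU driver is not set up correctly. You can run the following command to check whether the driver is installed properly.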

$ nvidia-smi

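If it is not, reinstall CUDA following the instructions on the official CUDA website. Since your operating system is not officially supported, you may need to switch to a different one.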
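
Before reinstalling, it may help to narrow down which part of the driver is missing. The log shows modprobe failing to insert nvidia_367_uvm just before cuInit fails, so the sketch below checks the driver version file that the TensorFlow diagnostics quote and looks for a loaded UVM module. The function names are hypothetical. The default paths `/proc/modules` and `/proc/driver/nvidia/version` are the standard Linux locations, and the module-name pattern (`nvidia_uvm`, `nvidia_367_uvm`, and similar) is an assumption about how distributions name the module.

```shell
# Sketch: narrow down which part of the NVIDIA driver setup is missing.
# check_kernel_side and check_user_side are hypothetical helper names.

# check_kernel_side [modules_file] [version_file]
# The optional arguments exist only so the check can be tried on saved copies.
check_kernel_side() {
  local modules_file="${1:-/proc/modules}"
  local version_file="${2:-/proc/driver/nvidia/version}"
  local status=0

  # The version file is exposed by the nvidia kernel module itself.
  if [ -r "$version_file" ]; then
    echo "ok: driver loaded: $(head -n 1 "$version_file")"
  else
    echo "missing: $version_file (nvidia kernel module not loaded?)"
    status=1
  fi

  # Assumed naming: nvidia_uvm, or a versioned name such as nvidia_367_uvm.
  if grep -Eq '^nvidia(_[a-z0-9]+)?_uvm ' "$modules_file" 2>/dev/null; then
    echo "ok: UVM module loaded"
  else
    echo "missing: no nvidia*_uvm module listed in $modules_file"
    status=1
  fi

  return "$status"
}

# User-space side: list the GPUs the driver can see, if nvidia-smi is installed.
check_user_side() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L
  else
    echo "missing: nvidia-smi not on PATH"
    return 1
  fi
}
```

After pasting the functions into a shell, running `check_kernel_side; check_user_side` prints one line per check. If only the UVM line reports missing, the modprobe errors in the log (and `dmesg`, as modprobe itself suggests) are probably the place to look.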

Regarding "linux - Unable to run TensorFlow on GPU", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/38663168/
