
docker - NVIDIA Docker - initialization error: nvml error: driver not loaded

Reposted. Author: 行者123. Updated: 2023-12-05 06:10:56

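I'm new to Docker, so the following question may be a bit naive, but I'm stuck and need help.

I'm trying to reproduce some research results. The authors just released code along with a specification of how to build a Docker image to reproduce their results. The relevant bit is copied below: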

[Image: the authors' Docker build instructions]

I believe I installed Docker correctly:

$ docker --version
Docker version 19.03.13, build 4484c46d9d
$ sudo docker run hello-world

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
(amd64)
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
https://hub.docker.com/

For more examples and ideas, visit:
https://docs.docker.com/get-started/

However, when I try to check that my nvidia-docker installation was successful, I get the following error:

$ sudo docker run --gpus all --rm nvidia/cuda:10.1-base nvidia-smi
docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: nvml error: driver not loaded\\\\n\\\"\"": unknown.

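It looks like the key error is:

nvidia-container-cli: initialization error: nvml error: driver not loaded

I don't have a GPU locally, and I'm finding conflicting information on whether CUDA needs to be installed before NVIDIA Docker. For example, this NVIDIA moderator says "a proper nvidia docker plugin installation starts with a proper CUDA install on the base machine."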

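My questions are the following:

  1. Can I install NVIDIA Docker without having CUDA installed?

  2. If so, what is the source of this error and how do I fix it?

  3. If not, how do I create this Docker image to reproduce the results?

Best answer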

  1. Can I install NVIDIA Docker without having CUDA installed?

Yes, you can. The readme states that nvidia-docker only requires the NVIDIA GPU driver and the Docker engine to be installed:

Note that you do not need to install the CUDA Toolkit on the host system, but the NVIDIA driver needs to be installed
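To make the distinction in that quote concrete, here is a minimal sketch that reports which host-side pieces are present. It assumes a Linux host where loaded kernel modules are listed in /proc/modules, that `nvidia-smi` was installed together with the driver, and that `nvcc` would only be present if the CUDA Toolkit were installed. The function names are made up for this illustration.

```shell
#!/usr/bin/env bash
# Sketch: tell apart the NVIDIA driver (required on the host)
# from the CUDA Toolkit (not required on the host).

# Succeeds if a /proc/modules-style file lists the nvidia kernel module.
# The path is a parameter so the check can be tried on sample data.
nvidia_module_loaded() {
    local modules_file="${1:-/proc/modules}"
    [ -r "$modules_file" ] && grep -q '^nvidia ' "$modules_file"
}

# Prints "found" or "missing" for a command name.
tool_status() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found"
    else
        echo "missing"
    fi
}

report_host() {
    if nvidia_module_loaded; then
        echo "nvidia kernel module: loaded"
    else
        echo "nvidia kernel module: not loaded"
    fi
    echo "nvidia-smi (installed with the driver): $(tool_status nvidia-smi)"
    echo "nvcc (CUDA Toolkit, optional on the host): $(tool_status nvcc)"
}

report_host
```

If the first line reports "not loaded", that is consistent with the "driver not loaded" message in the question, whether or not `nvcc` is present.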

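  2. If so, what is the source of this error and how do I fix it?

That's either because you don't have a GPU locally, or it isn't an NVIDIA GPU, or you messed up somewhere when installing the drivers. If you have a CUDA-capable GPU, I recommend using the NVIDIA guide to install the drivers. If you don't have a GPU locally, you can still build an image with CUDA and then move it somewhere that has a GPU.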
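The "build here, run elsewhere" route could look roughly like the sketch below. It uses `docker build`, `docker save`, and `docker load`; the image and archive names are placeholders that do not come from the question, and the final check assumes the image is based on an `nvidia/cuda` image so that `nvidia-smi` is available inside the container. The `DRY_RUN` switch is an addition for this sketch so the commands can be previewed without Docker.

```shell
#!/usr/bin/env bash
# Sketch: build on a machine without a GPU, then ship the image elsewhere.
# IMAGE and ARCHIVE are placeholder names, not taken from the question.
IMAGE="${IMAGE:-repro-image:latest}"
ARCHIVE="${ARCHIVE:-repro-image.tar}"

# With DRY_RUN=1, print each command instead of running it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# On the machine without a GPU: build the image from the Dockerfile in the
# current directory and write it to a tar archive.
build_and_export() {
    run docker build -t "$IMAGE" .
    run docker save -o "$ARCHIVE" "$IMAGE"
}

# On the machine with an NVIDIA GPU and driver: load the archive,
# then check that the container can see the GPU.
import_and_check() {
    run docker load -i "$ARCHIVE"
    run docker run --gpus all --rm "$IMAGE" nvidia-smi
}
```

After sourcing the script, calling `build_and_export` with `DRY_RUN=1` set prints the two commands without executing them; the archive itself can be copied between machines by any means.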

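  3. If not, how do I create this Docker image to reproduce the results?

The problem is that even if you manage to get rid of CUDA in the Docker image, there is software that requires it. In that case, fixing the Dockerfile seems unnecessary to me: you can just ignore Docker and start fixing the code so that it runs on the CPU.

Regarding docker - NVIDIA Docker - initialization error: nvml error: driver not loaded, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/64197626/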
