
Azure VM: Loaded runtime CuDNN library: 8.2.4 but source was compiled with: 8.6.0

Reposted. Author: 行者123. Updated: 2023-12-02 23:14:33

I am trying to fit a Keras model in a notebook on a Microsoft Azure Machine Learning Studio GPU compute instance. I get an error similar to the one described here:

2023-04-27 09:56:21.098249: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:417] Loaded runtime CuDNN library: 8.2.4 but source was compiled with: 8.6.0.  CuDNN library needs to have matching major version and equal or higher minor version. If using a binary install, upgrade your CuDNN library.  If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
2023-04-27 09:56:21.099011: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at pooling_ops_common.cc:412 : UNIMPLEMENTED: DNN library is not found.
2023-04-27 09:56:21.099050: I tensorflow/core/common_runtime/executor.cc:1197] [/job:localhost/replica:0/task:0/device:GPU:0] (DEBUG INFO) Executor start aborting (this does not indicate an error and you can ignore this message): UNIMPLEMENTED: DNN library is not found.
[[{{node model_2/max_pooling1d_6/MaxPool}}]]
2023-04-27 09:56:21.100704: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:417] Loaded runtime CuDNN library: 8.2.4 but source was compiled with: 8.6.0. CuDNN library needs to have matching major version and equal or higher minor version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
2023-04-27 09:56:21.101366: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at pooling_ops_common.cc:412 : UNIMPLEMENTED: DNN library is not found.

What is the solution for the Azure machine?

Best answer

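Fixing this was a real pain. I don't know why Microsoft hasn't fixed or upgraded the preinstalled cuDNN version (8.2.4 in the log above). The bundled conda environments that ship with tensorflow don't work.

Essentially, we need to manually install either an older version of tensorflow or a newer version of cuDNN. Since none of the tensorflow versions were compatible with the preinstalled cuDNN, we are forced to upgrade cuDNN.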
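The rule quoted in the error message (matching major version, equal or higher minor version) can be sketched in a few lines of Python. This is only an illustration of that rule, not TensorFlow's own check; the helper names `parse_version`, `cudnn_runtime_ok` and `versions_from_log` are made up for this example.

```python
import re

# Matches the two version numbers in the TensorFlow error line quoted in the question.
LOG_PATTERN = re.compile(
    r"Loaded runtime CuDNN library: (\d+\.\d+\.\d+) "
    r"but source was compiled with: (\d+\.\d+\.\d+)"
)


def parse_version(text):
    """Turn a dotted version such as '8.6.0' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def cudnn_runtime_ok(runtime, compiled):
    """Rule from the error message: same major, runtime minor >= compiled minor."""
    r, c = parse_version(runtime), parse_version(compiled)
    return r[0] == c[0] and r[1] >= c[1]


def versions_from_log(line):
    """Return (runtime, compiled) version strings from a log line, or None."""
    match = LOG_PATTERN.search(line)
    return match.groups() if match else None


line = "Loaded runtime CuDNN library: 8.2.4 but source was compiled with: 8.6.0."
runtime, compiled = versions_from_log(line)
print(cudnn_runtime_ok(runtime, compiled))  # False: minor 2 is lower than minor 6
```

With the versions from the log, the check fails on the minor version, which is why either tensorflow has to go down or cuDNN has to go up.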

The solution that worked is as follows:

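  1. At the time of writing, you need cuDNN version 8.6 (for TF 2.12.x). Get the cuDNN download link from here using your client machine, but stop the download so that you can copy the link that includes the authentication key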

[Image: from the NVIDIA website]

  • Paste the link into the export URL line below
  • Copy the script and paste it into the terminal of the running compute instance
  • Wait 5 minutes ☕️

    export URL="PASTE-LINK-HERE"
    # ==== DOWNLOAD CUDNN ====
    curl "$URL" -o ./cudnn-linux-x86_64-8.6.0.163_cuda11-archive.tar.xz
    sudo tar -xvf ./cudnn-linux-x86_64-8.6.0.163_cuda11-archive.tar.xz
    # ==== INSTALL CUDNN ====
    sudo cp ./cudnn-*-archive/include/cudnn*.h /usr/local/cuda/include
    sudo cp -P ./cudnn-*-archive/lib/libcudnn* /usr/local/cuda/lib64
    sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*
    # ==== CONFIGURE DYNAMIC RUNTIME BINDINGS ====
    sudo ldconfig
    # ==== INSTALL CONDA ENV ====
    conda create -n "tfgpu" python=3.10 -y
    conda activate tfgpu
    conda install -c conda-forge cudatoolkit=11.8.0 ipykernel -y
    python3 -m pip install nvidia-cudnn-cu11==8.6.0.163 tensorflow==2.12.*
    mkdir -p $CONDA_PREFIX/etc/conda/activate.d
    echo 'CUDNN_PATH=$(dirname $(python -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)"))' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
    echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/:$CUDNN_PATH/lib' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
    source $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
    python3 -m ipykernel install --user --name tfgpu --display-name "Python (tf-cudnn8.6)"
    # ==== VERIFY ====
    python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
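
If the GPU still does not show up, it can help to see which cuDNN files are actually on the library path. The following is a rough sketch under a few assumptions: that the full version is encoded in the file name (`libcudnn.so.8.6.0`, as in the archive install above), and that the interesting directories are the entries of `LD_LIBRARY_PATH` plus `/usr/local/cuda/lib64`. The pip package may ship only `libcudnn.so.8`, in which case nothing is reported for that directory. All function names are hypothetical.

```python
import os
import re
from pathlib import Path

# Only the main library file with a full x.y.z suffix, e.g. libcudnn.so.8.6.0
CUDNN_FILE = re.compile(r"^libcudnn\.so\.(\d+)\.(\d+)\.(\d+)$")


def cudnn_version_from_name(filename):
    """Return (major, minor, patch) parsed from the file name, or None."""
    match = CUDNN_FILE.match(filename)
    return tuple(int(group) for group in match.groups()) if match else None


def candidate_dirs(env=None):
    """Directories to scan: LD_LIBRARY_PATH entries plus the CUDA lib64 folder."""
    env = os.environ if env is None else env
    dirs = [p for p in env.get("LD_LIBRARY_PATH", "").split(":") if p]
    dirs.append("/usr/local/cuda/lib64")
    return dirs


def find_cudnn(dirs):
    """List (path, version) for every versioned libcudnn file in the given dirs."""
    found = []
    for directory in dirs:
        path = Path(directory)
        if not path.is_dir():
            continue  # skip entries that do not exist on this machine
        for entry in sorted(path.iterdir()):
            version = cudnn_version_from_name(entry.name)
            if version is not None:
                found.append((str(entry), version))
    return found


if __name__ == "__main__":
    for lib_path, version in find_cudnn(candidate_dirs()):
        print(lib_path, ".".join(map(str, version)))
```

Run it inside the activated `tfgpu` environment so that `LD_LIBRARY_PATH` reflects the values written to `env_vars.sh`.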

I tested this on the tensorflow mnist example.

Hope this helps!

Regarding "Azure VM: Loaded runtime CuDNN library: 8.2.4 but source was compiled with: 8.6.0", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/76139931/
