tensorflow - nvidia-smi shows GPU utilization on a GPU that is not in use

Reposted. Author: 行者123. Updated: 2023-12-05 05:22:18

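I run tensorflow on GPU id 1 using export CUDA_VISIBLE_DEVICES=1. Everything in nvidia-smi looks fine: my python process runs on GPU 1, and the memory and power readings show that GPU 1 is the one in use.

Strangely, though, GPU 0, which is unused (judging by the process list, memory, power draw, and common sense), shows 96% volatile GPU utilization.

Does anyone know why?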

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.48                 Driver Version: 367.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K20c          Off  | 0000:03:00.0     Off |                    0 |
| 30%   41C    P0    53W / 225W |      0MiB /  4742MiB |     96%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K20c          Off  | 0000:43:00.0     Off |                    0 |
| 36%   49C    P0    95W / 225W |   4516MiB /  4742MiB |     63%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    1      5193    C   python                                        4514MiB |
+-----------------------------------------------------------------------------+
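
For readers reproducing the setup in the question from Python rather than from the shell, here is a minimal sketch. It assumes the variables are set before any CUDA library is loaded; the helper name env_for_gpu is hypothetical. CUDA_DEVICE_ORDER=PCI_BUS_ID is an addition not used in the question: CUDA's default ordering is "fastest first", which does not always match the PCI bus order that nvidia-smi lists.

```python
import os


def env_for_gpu(physical_index, base_env=None):
    """Return a copy of the environment that exposes a single physical GPU.

    Hypothetical helper for illustration only.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Enumerate devices in PCI bus order so indices match nvidia-smi's listing.
    env["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    # Only this GPU is visible; inside the process it appears as device 0.
    env["CUDA_VISIBLE_DEVICES"] = str(physical_index)
    return env


env = env_for_gpu(1, base_env={"PATH": "/usr/bin"})
print(env["CUDA_VISIBLE_DEVICES"])  # prints 1
```

The returned dictionary can be passed as the env argument of subprocess.run when launching the training script, which has the same effect as the export line in the question.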

Best answer

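Run ps aux | grep 5193 to see which program owns PID 5193, the only compute process in the list (it is on GPU 1).

Your GPUs have ECC enabled, which is why you can see high GPU and memory utilization readings on an idle card. As explained on the NVIDIA forum: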

During driver initialization, when ECC is enabled, one can see high GPU and memory utilization readings. This is caused by the ECC memory scrubbing mechanism that is performed during driver initialization.
When Persistence Mode is disabled, the driver deinitializes when there are no clients running (CUDA apps, nvidia-smi, or an X server) and needs to initialize again before any GPU application (like nvidia-smi) can query its state, thus causing ECC scrubbing.
As a rule of thumb, always run with Persistence Mode enabled. Just run nvidia-smi -pm 1 as root. This will speed up application launching by keeping the driver always loaded.

Reference: https://devtalk.nvidia.com/default/topic/539632/k20-with-high-utilization-but-no-compute-processes-/
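
The explanation above suggests a quick check: a GPU that reports high utilization while holding no memory, with ECC enabled and persistence mode disabled, is probably showing the scrubbing artifact rather than real work. The sketch below is one way to automate that check. The function names and the 50% threshold are assumptions made for illustration, the heuristic is not an NVIDIA rule, and the sample text is hand-written to mirror the readings in the question (ECC "Enabled" is taken from the answer, not from captured output).

```python
import subprocess

# Field names accepted by `nvidia-smi --query-gpu` (see `nvidia-smi --help-query-gpu`).
FIELDS = ["index", "utilization.gpu", "memory.used",
          "persistence_mode", "ecc.mode.current"]


def query_gpus():
    """Run nvidia-smi and return its CSV text (requires the NVIDIA driver)."""
    cmd = ["nvidia-smi",
           "--query-gpu=" + ",".join(FIELDS),
           "--format=csv,noheader,nounits"]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def parse_gpus(csv_text):
    """Turn nvidia-smi CSV text into a list of dicts keyed by field name."""
    rows = []
    for line in csv_text.splitlines():
        if line.strip():
            values = [v.strip() for v in line.split(",")]
            rows.append(dict(zip(FIELDS, values)))
    return rows


def likely_scrubbing_artifact(gpu, util_threshold=50):
    """Heuristic: busy-looking GPU with no memory in use, ECC on, persistence off."""
    try:
        util = int(gpu["utilization.gpu"])
        mem = int(gpu["memory.used"])
    except ValueError:
        # Unsupported fields are reported as text such as "[N/A]".
        return False
    return (util >= util_threshold and mem == 0
            and gpu["persistence_mode"] == "Disabled"
            and gpu["ecc.mode.current"] == "Enabled")


# Hand-written sample mirroring the question, not captured nvidia-smi output.
sample = "0, 96, 0, Disabled, Enabled\n1, 63, 4516, Disabled, Enabled\n"
for gpu in parse_gpus(sample):
    print(gpu["index"], likely_scrubbing_artifact(gpu))
```

On a machine with the driver installed, parse_gpus(query_gpus()) replaces the sample text. If a GPU is flagged, enabling persistence mode with nvidia-smi -pm 1 (as root) is the fix recommended above.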

A similar question about "tensorflow - nvidia-smi shows GPU utilization on a GPU that is not in use" can be found on Stack Overflow: https://stackoverflow.com/questions/40685084/
