
tensorflow - 获取错误 "Resource exhausted: OOM when allocating tensor with shape[1800,1024,28,28] and type float on/job:localhost/..."

Reposted · Author: 行者123 · Updated: 2023-12-05 09:30:29

I get a resource-exhausted error when I start training my object detection model with TensorFlow 2.5 on GPU. I am using 18 training images and 3 test images. The pretrained model is Faster R-CNN ResNet101 V1 640x640 from the TensorFlow 2.2 model zoo. I am training on an Nvidia RTX 2070 with 8 GB of dedicated memory.

What puzzles me is why training consumes so much GPU memory with such a small training set. This is the GPU memory summary that accompanies the error:

Limit:            6269894656
InUse:            6103403264
MaxInUse:         6154866944
NumAllocs:        4276
MaxAllocSize:     5786902272
Reserved:         0
PeakReserved:     0
LargestFreeBlock: 0

I have also reduced the batch size to 6 for the training data and to 1 for the test data.
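One detail worth noting: the failing tensor shape [1800,1024,28,28] matches batch_size × first_stage_max_proposals (6 × 300, the Faster R-CNN default), so lowering the proposal count in pipeline.config shrinks that allocation directly. A sketch of the relevant fields (field names follow the TF Object Detection API pipeline.config schema; the values shown are illustrative, not from the question):

    model {
      faster_rcnn {
        # default is 300; with batch_size 6 this yields 6 x 300 = 1800 proposals
        first_stage_max_proposals: 100
      }
    }
    train_config {
      # each unit of batch size multiplies the proposal tensor
      batch_size: 2
    }

Lowering either value trades some accuracy and training speed for a smaller peak memory footprint.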

Best Answer

I use the code below in all of my notebooks that run on a GPU, to prevent exactly this kind of error:

    import tensorflow as tf

    gpus = tf.config.list_physical_devices('GPU')
    if gpus:
        try:
            # Currently, memory growth needs to be the same across GPUs
            for gpu in gpus:
                tf.config.experimental.set_memory_growth(gpu, True)
            logical_gpus = tf.config.list_logical_devices('GPU')
            print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
        except RuntimeError as e:
            # Memory growth must be set before GPUs have been initialized
            print(e)

By default, TensorFlow maps nearly all of the GPU memory of all GPUs (subject to CUDA_VISIBLE_DEVICES) visible to the process.

More information about using GPUs with TensorFlow can be found in the official TensorFlow GPU guide.
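As an alternative to memory growth, TensorFlow can also be given a hard cap on how much GPU memory it may allocate, by creating a logical device with a memory limit. This is a minimal sketch assuming TF 2.x; the `limit_gpu_memory` helper name and the 4096 MB value are illustrative, not from the original answer:

```python
import tensorflow as tf

def limit_gpu_memory(limit_mb=4096):
    """Cap TensorFlow's allocation on the first GPU at limit_mb megabytes.

    Returns the resulting logical devices, or None if there is no GPU
    or the GPU has already been initialized.
    """
    gpus = tf.config.list_physical_devices('GPU')
    if not gpus:
        return None  # no GPU present; nothing to configure
    try:
        # Create one logical device on the first GPU with a fixed memory limit
        tf.config.set_logical_device_configuration(
            gpus[0],
            [tf.config.LogicalDeviceConfiguration(memory_limit=limit_mb)])
        return tf.config.list_logical_devices('GPU')
    except RuntimeError as e:
        # Like memory growth, this must be set before the GPU is initialized
        print(e)
        return None
```

Unlike memory growth, a hard cap guarantees the process never grabs the whole card, which is useful when the GPU is shared with a display or other processes.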

Maybe this will resolve the error.

Hope this helps.

Regarding tensorflow - getting the error "Resource exhausted: OOM when allocating tensor with shape[1800,1024,28,28] and type float on/job:localhost/...", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/69545763/
