python - How to fix "ResourceExhaustedError: OOM when allocating tensor"

Reposted. Author: 行者123. Updated: 2023-12-03 09:45:01

I want to build a model with multiple inputs, so I tried to define it like this.

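# imports are not shown in the original post; tf.keras is assumed here
from tensorflow.keras import layers
from tensorflow.keras.layers import Input, Dense, concatenate
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# define two sets of inputs
inputA = Input(shape=(32,64,1))
inputB = Input(shape=(32,1024))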

# CNN
x = layers.Conv2D(32, kernel_size = (3, 3), activation = 'relu')(inputA)
x = layers.Conv2D(32, (3,3), activation='relu')(x)
x = layers.MaxPooling2D(pool_size=(2,2))(x)
x = layers.Dropout(0.2)(x)
x = layers.Flatten()(x)
x = layers.Dense(500, activation = 'relu')(x)
x = layers.Dropout(0.5)(x)
x = layers.Dense(500, activation='relu')(x)
x = Model(inputs=inputA, outputs=x)

# DNN
y = layers.Flatten()(inputB)
y = Dense(64, activation="relu")(y)
y = Dense(250, activation="relu")(y)
y = Dense(500, activation="relu")(y)
y = Model(inputs=inputB, outputs=y)

# Combine the output of the two models
combined = concatenate([x.output, y.output])


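# combined outputs
z = Dense(300, activation="relu")(combined)
# chain each layer from z; the original post fed `combined` to all three
# layers, which discards the two hidden layers
z = Dense(100, activation="relu")(z)
# note: softmax over a single unit always outputs 1.0
z = Dense(1, activation="softmax")(z)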

model = Model(inputs=[x.input, y.input], outputs=z)

model.summary()

opt = Adam(lr=1e-3, decay=1e-3 / 200)
model.compile(loss = 'sparse_categorical_crossentropy', optimizer = opt,
metrics = ['accuracy'])
And the summary:
(the model summary output is not preserved in this copy)
However, when I try to train this model,
history = model.fit([trainimage, train_product_embd],train_label,
validation_data=([validimage,valid_product_embd],valid_label), epochs=10,
steps_per_epoch=100, validation_steps=10)
the following error occurs:
 ResourceExhaustedError                    Traceback (most recent call
last) <ipython-input-18-2b79f16d63c0> in <module>()
----> 1 history = model.fit([trainimage, train_product_embd],train_label,
validation_data=([validimage,valid_product_embd],valid_label),
epochs=10, steps_per_epoch=100, validation_steps=10)

4 frames
/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py
in __call__(self, *args, **kwargs) 1470 ret =
tf_session.TF_SessionRunCallable(self._session._session, 1471
self._handle, args,
-> 1472 run_metadata_ptr) 1473 if run_metadata: 1474
proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

ResourceExhaustedError: 2 root error(s) found. (0) Resource
exhausted: OOM when allocating tensor with shape[800000,32,30,62] and
type float on /job:localhost/replica:0/task:0/device:GPU:0 by
allocator GPU_0_bfc [[{{node conv2d_1/convolution}}]] Hint: If you
want to see a list of allocated tensors when OOM happens, add
report_tensor_allocations_upon_oom to RunOptions for current
allocation info.

[[metrics/acc/Mean_1/_185]] Hint: If you want to see a list of
allocated tensors when OOM happens, add
report_tensor_allocations_upon_oom to RunOptions for current
allocation info.

(1) Resource exhausted: OOM when allocating tensor with
shape[800000,32,30,62] and type float on
/job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node conv2d_1/convolution}}]] Hint: If you want to see a list of
allocated tensors when OOM happens, add
report_tensor_allocations_upon_oom to RunOptions for current
allocation info.

0 successful operations. 0 derived errors ignored.
Thanks for reading, and I hope you can help me :)

Best answer

OOM stands for "out of memory". Your GPU has run out of memory, so it cannot allocate space for this tensor. There are several things you can do:

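  • Reduce the number of units or filters in your Dense and Conv2D layers
  • Use a smaller batch_size (or increase steps_per_epoch and validation_steps)
  • Use grayscale images (you can use tf.image.rgb_to_grayscale)
  • Reduce the number of layers
  • Use MaxPooling2D layers after the convolutional layers
  • Reduce the size of your images (you can use tf.image.resize)
  • Use a smaller float precision for your input, namely np.float32
  • If you are using a pre-trained model, freeze its first layers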
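
Two of the items above, smaller batches and float32 inputs, can be illustrated with NumPy alone. This is only a minimal sketch, not the asker's code: the arrays are small made-up stand-ins for `trainimage`, `train_product_embd` and `train_label` (the real sizes and dtypes are not shown in the question), and `iter_batches` is a hypothetical helper.

```python
import numpy as np

# Hypothetical stand-ins for the question's arrays (tiny, for illustration only).
rng = np.random.default_rng(0)
trainimage = rng.random((100, 32, 64, 1))          # float64 by default
train_product_embd = rng.random((100, 32, 1024))   # float64 by default
train_label = rng.integers(0, 2, size=100)

# Casting float64 inputs to float32 halves their memory footprint.
trainimage32 = trainimage.astype(np.float32)
train_product_embd32 = train_product_embd.astype(np.float32)


def iter_batches(images, embeddings, labels, batch_size=32):
    """Yield ((image_batch, embedding_batch), label_batch) one mini-batch at a time."""
    for start in range(0, len(labels), batch_size):
        stop = start + batch_size
        yield (images[start:stop], embeddings[start:stop]), labels[start:stop]


batches = list(iter_batches(trainimage32, train_product_embd32, train_label))
(first_images, first_embd), first_labels = batches[0]

print(trainimage.nbytes, trainimage32.nbytes)
print(first_images.shape, first_embd.shape, first_labels.shape)
print(len(batches), "batches; last batch has", len(batches[-1][1]), "samples")
```

With Keras and NumPy inputs, the equivalent is probably just to pass `batch_size=32` to `model.fit` and drop `steps_per_epoch` and `validation_steps`, so that each step sees 32 samples rather than the whole array. The right value depends on how much GPU memory is available.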

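Some more useful information about this error:

    OOM when allocating tensor with shape[800000,32,30,62]

This is a weird shape. If you are working with images, you should normally have 3 or 1 channels. On top of that, it looks like you are passing your entire dataset at once; you should pass it in batches instead.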
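
The size of the failing allocation can be estimated from the reported shape. The sketch below assumes 4 bytes per element, since the log reports type float. It also reads the shape as (batch, channels, height, width), which would match the output of the first Conv2D layer: 32 filters, with the 32x64 input shrinking to 30x62 after an unpadded 3x3 convolution. That reading is an interpretation, not something the traceback states. `tensor_bytes` is a hypothetical helper, and the batch size of 32 is an arbitrary example value.

```python
from math import prod


def tensor_bytes(shape, bytes_per_element=4):
    """Return the memory needed to store a dense tensor of the given shape."""
    return prod(shape) * bytes_per_element


# Shape taken from the error message: the first dimension is 800000.
full = tensor_bytes((800000, 32, 30, 62))

# Same tensor if only 32 samples were processed per step (assumed batch size).
batched = tensor_bytes((32, 32, 30, 62))

print(f"whole dataset at once: {full / 1e9:.1f} GB")   # 190.5 GB
print(f"batch of 32:           {batched / 1e6:.1f} MB")  # 7.6 MB
```

Under these assumptions a single activation tensor would need roughly 190 GB, far more than any single GPU offers, whereas the same layer with a batch of 32 needs under 8 MB.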

Regarding python - How to fix "ResourceExhaustedError: OOM when allocating tensor", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59394947/