gpt4 book ai didi

python - 填充洗牌缓冲区时,Google Colaboratory session 突然结束

转载 作者:行者123 更新时间:2023-12-04 03:50:54 26 4
gpt4 key购买 nike

我正在使用 Google Colaboratory 使用 TensorFlow 1.15 训练图像识别算法。我已将所有需要的文件上传到 Google 云端硬盘,并已获取运行代码,直到随机播放缓冲区运行完毕。但是,我在对话框中得到一个“^C”,并且无法弄清楚发生了什么。

注意:我之前曾尝试在我的 PC 上训练该算法,并且没有删除上一次训练生成的检查点文件。这可能是问题所在吗?

代码:

!pip install --upgrade pip
!pip install --upgrade protobuf

!pip install tensorflow-gpu==1.15
import tensorflow as tf
print(tf.__version__)

device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
raise SystemError('GPU device not found')
print('Found GPU at {}'.format(device_name))

!ln -sf /opt/bin/nvidia-smi /usr/bin/nvidia-smi
!pip install gputil
!pip install psutil
!pip install humanize
import psutil
import humanize
import os
import GPUtil as GPU
GPUs = GPU.getGPUs()
gpu = GPUs[0]
def printm():
process = psutil.Process(os.getpid())
print("Gen RAM Free: " + humanize.naturalsize(psutil.virtual_memory().available ), " | Proc size: " + humanize.naturalsize( process.memory_info().rss))
print("GPU RAM Free: {0:.0f}MB | Used: {1:.0f}MB | Util {2:3.0f}% | Total {3:.0f}MB".format(gpu.memoryFree, gpu.memoryUsed, gpu.memoryUtil*100, gpu.memoryTotal))
printm()

from google.colab import drive
#Mount the drive
drive.mount('/content/gdrive')

#Change to working tensorflow directory on the drive
%cd '/content/gdrive/My Drive/weeds/tensorflow_models/models/research/object_detection/'

!apt-get install protobuf-compiler python-pil python-lxml python-tk
!pip install Cython
%cd /content/gdrive/My Drive/weeds/tensorflow_models/models/research/
!protoc object_detection/protos/*.proto --python_out=.
import os
os.environ['PYTHONPATH'] += ':/content/gdrive/My Drive/weeds/tensorflow_models/models/research/:/content/gdrive/My Drive/weeds/tensorflow_models/models/research/slim'
!python setup.py build
!python setup.py install

import time, psutil
Start = time.time() - psutil.boot_time()
Left = 12*3600 - Start
print('Time remaining for this session is: ', Left/3600)

!pip install tf_slim
%cd /content/gdrive/My Drive/weeds/tensorflow_models/models/research/object_detection/
os.environ['PYTHONPATH'] += ':/content/gdrive/My Drive/weeds/tensorflow_models/models/research/:/content/gdrive/My Drive/weeds/tensorflow_models/models/research/slim'

!python train.py --train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v1_coco.config --logtostderr

过程到此结束,但它需要开始使用“全局步骤”训练模型。

2020-10-18 22:42:45.587477: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 168 of 2048
2020-10-18 22:42:55.668973: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 334 of 2048
2020-10-18 22:43:06.067869: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 379 of 2048
2020-10-18 22:43:15.705090: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 503 of 2048
2020-10-18 22:43:26.781151: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 576 of 2048
2020-10-18 22:43:38.120069: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 640 of 2048
2020-10-18 22:43:45.813089: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 708 of 2048
2020-10-18 22:43:58.071040: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 752 of 2048
2020-10-18 22:44:07.506961: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 828 of 2048
2020-10-18 22:44:16.355753: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 908 of 2048
2020-10-18 22:44:25.922348: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 960 of 2048
INFO:tensorflow:global_step/sec: 0
I1018 22:44:34.783342 140291121678080 supervisor.py:1099] global_step/sec: 0
2020-10-18 22:44:36.327813: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 1036 of 2048
2020-10-18 22:44:45.651473: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 1151 of 2048
2020-10-18 22:44:55.554234: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 1186 of 2048
2020-10-18 22:45:05.648568: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 1242 of 2048
2020-10-18 22:45:15.644396: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 1313 of 2048
2020-10-18 22:45:25.551708: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 1386 of 2048
2020-10-18 22:45:35.549003: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 1458 of 2048
2020-10-18 22:45:45.648835: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 1531 of 2048
2020-10-18 22:45:55.643920: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 1602 of 2048
2020-10-18 22:46:05.559702: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 1674 of 2048
2020-10-18 22:46:15.547609: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 1746 of 2048
2020-10-18 22:46:25.645939: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 1819 of 2048
INFO:tensorflow:global_step/sec: 0
I1018 22:46:35.052108 140291121678080 supervisor.py:1099] global_step/sec: 0
2020-10-18 22:46:35.645583: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 1891 of 2048
2020-10-18 22:46:45.553851: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:145] Filling up shuffle buffer (this may take a while): 1962 of 2048
^C

我该怎么做才能解决这个问题?训练过程在我的 PC (NVIDA GEFORCE RTX) 上运行良好,但我只需要通过 Google Colab 获得更多计算能力。

最佳答案

我无法运行你的代码,因为你在其中使用了一些文件。但我可以告诉你,这可能是因为你使用的是 TF 1 和 GPU,而在 Colab 中,GPU 降级并不容易。

例如,我没有在您的代码中看到您已经像这样将 CUDA 降级(到您想要的版本):

!wget https://developer.nvidia.com/compute/cuda/9.0/Prod/local_installers/cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64-deb
!dpkg -i cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64-deb
!apt-key add /var/cuda-repo-9-0-local/7fa2af80.pub
!apt-get update
!apt-get install cuda=9.0.176-1

您可以通过!nvcc --version查看CUDA的版本。

并且 Colab 降级 TensorFlow 版本的速度并不快。您可能需要多次重启运行时。

我建议您将代码更改为 TensorFlow 2

关于python - 填充洗牌缓冲区时,Google Colaboratory session 突然结束,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/64419191/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com