gpt4 book ai didi

python - YoloV5 在第一个 epoch 被杀死

转载 作者:行者123 更新时间:2023-12-02 02:16:02 25 4
gpt4 key购买 nike

我在 Windows 10 上使用具有此配置的虚拟机:

Memory 7.8 GiB
Processor Intel® Core™ i5-6600K CPU @ 3.50GHz × 3
Graphics llvmpipe (LLVM 11.0.0, 256 bits)
Disk Capcity 80.5 GB
OS Ubuntu 20.10 64 Bit
Virtualization Oracle

我按照 the official documentation 中的描述为 Ubuntu 安装了 docker .
我按照 yolo github section for docker 中的描述拉取了 docker 镜像.
由于我没有 NVIDIA GPU,因此无法安装驱动程序或 CUDA。我从roboflow中拉出水族箱并将其安装在折叠式水族箱上。我运行此命令以启动图像并安装我的水族馆文件夹

sudo docker run --ipc=host -it -v "$(pwd)"/Desktop/yolo/aquarium:/usr/src/app/aquarium ultralytics/yolov5:latest

迎接这个横幅

=============== PyTorch ==

NVIDIA Release 21.03 (build 21060478) PyTorch Version 1.9.0a0+df837d0

Container image Copyright (c) 2021, NVIDIA CORPORATION. All rightsreserved.

Copyright (c) 2014-2021 Facebook Inc. Copyright (c) 2011-2014 IdiapResearch Institute (Ronan Collobert) Copyright (c) 2012-2014 DeepmindTechnologies (Koray Kavukcuoglu) Copyright (c) 2011-2012 NECLaboratories America (Koray Kavukcuoglu) Copyright (c) 2011-2013 NYU
(Clement Farabet) Copyright (c) 2006-2010 NEC Laboratories America(Ronan Collobert, Leon Bottou, Iain Melvin, Jason Weston) Copyright(c) 2006 Idiap Research Institute (Samy Bengio) Copyright (c)2001-2004 Idiap Research Institute (Ronan Collobert, Samy Bengio,Johnny Mariethoz) Copyright (c) 2015 Google Inc. Copyright (c)2015 Yangqing Jia Copyright (c) 2013-2016 The Caffe contributorsAll rights reserved.

NVIDIA Deep Learning Profiler (dlprof) Copyright (c) 2021, NVIDIACORPORATION. All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION. Allrights reserved.

This container image and its contents are governed by the NVIDIA DeepLearning Container License. By pulling and using the container, youaccept the terms and conditions of this license:https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

WARNING: The NVIDIA Driver was not detected. GPU functionality willnot be available. Use 'nvidia-docker run' to start this container;see https://github.com/NVIDIA/nvidia-docker/wiki/nvidia-docker .

NOTE: MOFED driver for multi-node communication was not detected.Multi-node communication performance may be reduced.

所以那里没有错误。
我安装了 pip,并使用 pip wandb 添加了 wandb。我使用了 wandb login 并设置了我的 API key 。

我运行了以下命令:

# python train.py --img 640 --batch 16 --epochs 10 --data ./aquarium/data.yaml --weights yolov5s.pt --project ip5 --name aquarium5 --nosave --cache

并收到此输出:

github: skipping check (Docker image)
YOLOv5 🚀 v5.0-14-g238583b torch 1.9.0a0+df837d0 CPU

Namespace(adam=False, artifact_alias='latest', batch_size=16, bbox_interval=-1, bucket='', cache_images=True, cfg='', data='./aquarium/data.yaml', device='', entity=None, epochs=10, evolve=False, exist_ok=False, global_rank=-1, hyp='data/hyp.scratch.yaml', image_weights=False, img_size=[640, 640], label_smoothing=0.0, linear_lr=False, local_rank=-1, multi_scale=False, name='aquarium5', noautoanchor=False, nosave=True, notest=False, project='ip5', quad=False, rect=False, resume=False, save_dir='ip5/aquarium5', save_period=-1, single_cls=False, sync_bn=False, total_batch_size=16, upload_dataset=False, weights='yolov5s.pt', workers=8, world_size=1)
tensorboard: Start with 'tensorboard --logdir ip5', view at http://localhost:6006/
hyperparameters: lr0=0.01, lrf=0.2, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0
wandb: Currently logged in as: pebs (use `wandb login --relogin` to force relogin)
wandb: Tracking run with wandb version 0.10.26
wandb: Syncing run aquarium5
wandb: ⭐️ View project at https://wandb.ai/pebs/ip5
wandb: 🚀 View run at https://wandb.ai/pebs/ip5/runs/1c2j80ii
wandb: Run data is saved locally in /usr/src/app/wandb/run-20210419_102642-1c2j80ii
wandb: Run `wandb offline` to turn off syncing.

Overriding model.yaml nc=80 with nc=7

from n params module arguments
0 -1 1 3520 models.common.Focus [3, 32, 3]
1 -1 1 18560 models.common.Conv [32, 64, 3, 2]
2 -1 1 18816 models.common.C3 [64, 64, 1]
3 -1 1 73984 models.common.Conv [64, 128, 3, 2]
4 -1 1 156928 models.common.C3 [128, 128, 3]
5 -1 1 295424 models.common.Conv [128, 256, 3, 2]
6 -1 1 625152 models.common.C3 [256, 256, 3]
7 -1 1 1180672 models.common.Conv [256, 512, 3, 2]
8 -1 1 656896 models.common.SPP [512, 512, [5, 9, 13]]
9 -1 1 1182720 models.common.C3 [512, 512, 1, False]
10 -1 1 131584 models.common.Conv [512, 256, 1, 1]
11 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
12 [-1, 6] 1 0 models.common.Concat [1]
13 -1 1 361984 models.common.C3 [512, 256, 1, False]
14 -1 1 33024 models.common.Conv [256, 128, 1, 1]
15 -1 1 0 torch.nn.modules.upsampling.Upsample [None, 2, 'nearest']
16 [-1, 4] 1 0 models.common.Concat [1]
17 -1 1 90880 models.common.C3 [256, 128, 1, False]
18 -1 1 147712 models.common.Conv [128, 128, 3, 2]
19 [-1, 14] 1 0 models.common.Concat [1]
20 -1 1 296448 models.common.C3 [256, 256, 1, False]
21 -1 1 590336 models.common.Conv [256, 256, 3, 2]
22 [-1, 10] 1 0 models.common.Concat [1]
23 -1 1 1182720 models.common.C3 [512, 512, 1, False]
24 [17, 20, 23] 1 32364 models.yolo.Detect [7, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]]
[W NNPACK.cpp:80] Could not initialize NNPACK! Reason: Unsupported hardware.
Model Summary: 283 layers, 7079724 parameters, 7079724 gradients, 16.4 GFLOPS

Transferred 356/362 items from yolov5s.pt
Scaled weight_decay = 0.0005
Optimizer groups: 62 .bias, 62 conv.weight, 59 other
train: Scanning '/usr/src/app/aquarium/train/labels.cache' images and labels... 448 found, 0 missing, 1 empty, 0 corrupted: 100%|█| 448/448 [00:00<?, ?
train: Caching images (0.4GB): 100%|████████████████████████████████████████████████████████████████████████████████| 448/448 [00:01<00:00, 313.77it/s]
val: Scanning '/usr/src/app/aquarium/valid/labels.cache' images and labels... 127 found, 0 missing, 0 empty, 0 corrupted: 100%|█| 127/127 [00:00<?, ?it
val: Caching images (0.1GB): 100%|██████████████████████████████████████████████████████████████████████████████████| 127/127 [00:00<00:00, 141.31it/s]
Plotting labels...

autoanchor: Analyzing anchors... anchors/target = 5.17, Best Possible Recall (BPR) = 0.9997
Image sizes 640 train, 640 test
Using 3 dataloader workers
Logging results to ip5/aquarium5
Starting training for 10 epochs...

Epoch gpu_mem box obj cls total labels img_size
0%| | 0/28 [00:00<?, ?it/s]Killed
root@cf40a6498016:~# /opt/conda/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 6 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '

根据这个输出,我认为完成了 0 个纪元。
我的 data.yaml 包含这段代码:

train: /usr/src/app/aquarium/train/images
val: /usr/src/app/aquarium/valid/images

nc: 7
names: ['fish', 'jellyfish', 'penguin', 'puffin', 'shark', 'starfish', 'stingray']

wandb.ai不显示任何指标,但我有文件 config.yaml、requirements.txt、wandb-metadata.json 和 wandb-summary.json。

为什么我没有得到任何输出?
难道真的没有培训吗?
如果有培训,我该如何使用我的模型?

最佳答案

问题是虚拟机内存不足。解决方案是创建 16 GB 的交换内存,这样机器就可以将虚拟硬盘驱动器用作 RAM。

关于python - YoloV5 在第一个 epoch 被杀死,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/67160576/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com