
python - Clarification on the torchvision implementation of Faster R-CNN

Reposted. Author: 行者123. Updated: 2023-12-04 08:25:08

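I am digging into the source code of torchvision's Faster R-CNN implementation and have run into something I don't quite understand. Suppose I want to create a Faster R-CNN model that is not pretrained on COCO but has a backbone pretrained on ImageNet, and then take just the backbone. I do the following: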

plain_backbone = fasterrcnn_resnet50_fpn(pretrained=False, pretrained_backbone=True).backbone.body
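This matches how the backbone is set up in the source (linked in the original post). However, when I pass an image through the model, the result does not match what I get when I set up resnet50 directly. Namely: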
import numpy as np
import torch
import torchvision
from PIL import Image
from torchvision import transforms
from torchvision.models.detection import fasterrcnn_resnet50_fpn

# Regular resnet50, pretrained on ImageNet, without the classifier and the average pooling layer
resnet50_1 = torch.nn.Sequential(*(list(torchvision.models.resnet50(pretrained=True).children())[:-2]))
resnet50_1.eval()
# Resnet50 extracted from the Faster R-CNN, also pretrained on ImageNet
resnet50_2 = fasterrcnn_resnet50_fpn(pretrained=False, pretrained_backbone=True).backbone.body
resnet50_2.eval()
# Load an arbitrary image, convert it to a torch.Tensor rescaled to [0, 1] (not that it matters)
image = transforms.ToTensor()(Image.open("random_images/random.jpg")).unsqueeze(0)
# Obtain the model outputs
with torch.no_grad():
    # Output from the regular resnet50
    output_1 = resnet50_1(image)
    # Output from the resnet50 extracted from the Faster R-CNN
    output_2 = resnet50_2(image)["3"]
# The outputs aren't the same, although I would assume they should be
np.testing.assert_almost_equal(output_1.numpy(), output_2.numpy())
Looking forward to your thoughts!

Best answer

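This is because fasterrcnn_resnet50_fpn uses a custom normalization layer (FrozenBatchNorm2d) instead of the default BatchNorm2d. The two are very similar, but I suspect the small numerical differences between them cause the mismatch.
If you specify the same normalization layer for the standard resnet, the check passes: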

import numpy as np
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import fasterrcnn_resnet50_fpn
from torchvision.ops import misc as misc_nn_ops

# Note: newer torchvision releases deprecate `pretrained=` in favor of `weights=`.
# Regular resnet50, pretrained on ImageNet, without the classifier and the average pooling layer,
# built with the same normalization layer that the detection backbone uses
resnet50_1 = torch.nn.Sequential(*(list(torchvision.models.resnet50(pretrained=True, norm_layer=misc_nn_ops.FrozenBatchNorm2d).children())[:-2]))
resnet50_1.eval()
# Resnet50 extracted from the Faster R-CNN, also pretrained on ImageNet
resnet50_2 = fasterrcnn_resnet50_fpn(pretrained=False, pretrained_backbone=True).backbone.body
resnet50_2.eval()
# A dummy input stands in for a real image
image = torch.ones((1, 3, 224, 224))
# Obtain the model outputs
with torch.no_grad():
    # Output from the regular resnet50
    output_1 = resnet50_1(image)
    # Output from the resnet50 extracted from the Faster R-CNN
    output_2 = resnet50_2(image)["3"]
# Passes
np.testing.assert_almost_equal(output_1.numpy(), output_2.numpy())
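
To illustrate how two nearly identical normalization layers can still fail a strict comparison, here is a minimal sketch of the arithmetic batch norm performs in eval mode, where it reduces to a fixed per-channel affine transform. The function name frozen_batch_norm, the statistics in params, and the two eps values are assumptions made up for this illustration; they are not taken from torchvision's source, and the sketch does not claim to show which setting differs in any particular torchvision version.

```python
import math


def frozen_batch_norm(x, weight, bias, running_mean, running_var, eps):
    """Eval-mode batch norm for one scalar activation (hypothetical helper).

    With frozen statistics the layer is just y = x * scale + shift.
    """
    scale = weight / math.sqrt(running_var + eps)
    shift = bias - running_mean * scale
    return x * scale + shift


# Hypothetical per-channel parameters, chosen only for illustration.
params = dict(weight=1.5, bias=0.1, running_mean=0.2, running_var=0.01)

# Same input and same statistics, but two different (assumed) eps values.
y_a = frozen_batch_norm(1.0, eps=1e-5, **params)
y_b = frozen_batch_norm(1.0, eps=0.0, **params)

# np.testing.assert_almost_equal defaults to decimal=7, i.e. it requires
# abs(desired - actual) < 1.5e-7, so even a tiny eps mismatch can fail it.
difference = abs(y_a - y_b)
print(difference, difference > 1.5e-7)
```

A difference of this kind in a single layer is small, but a ResNet-50 stacks many normalization layers, so any such mismatch can propagate through the network. If only approximate agreement is needed, np.testing.assert_allclose with explicit rtol and atol arguments is one way to state the tolerance directly.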

Regarding "python - Clarification on the torchvision implementation of Faster R-CNN", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/65305682/
