python - RuntimeError: output with shape [1, 224, 224] doesn't match the broadcast shape [3, 224, 224]


Reposted by 行者123. Last updated: 2023-12-01 17:32:34

This is the error I get when I try to train my network.

The class we use to store the Caltech 101 dataset images was provided by our teacher.

from torchvision.datasets import VisionDataset

from PIL import Image

import os
import os.path
import sys


def pil_loader(path):
    # open path as file to avoid ResourceWarning (https://github.com/python-pillow/Pillow/issues/835)
    with open(path, 'rb') as f:
        img = Image.open(f)
        return img.convert('RGB')


class Caltech(VisionDataset):
    def __init__(self, root, split='train', transform=None, target_transform=None):
        super(Caltech, self).__init__(root, transform=transform, target_transform=target_transform)

        self.split = split # This defines the split you are going to use
                           # (split files are called 'train.txt' and 'test.txt')

        '''
        - Here you should implement the logic for reading the splits files and accessing elements
        - If the RAM size allows it, it is faster to store all data in memory
        - PyTorch Dataset classes use indexes to read elements
        - You should provide a way for the __getitem__ method to access the image-label pair
          through the index
        - Labels should start from 0, so for Caltech you will have labels 0...100 (excluding the background class)
        '''
        # Open file in read only mode and read all lines
        file = open(self.split, "r")
        lines = file.readlines()

        # Filter out the lines which start with 'BACKGROUND_Google' as asked in the homework
        self.elements = [i for i in lines if not i.startswith('BACKGROUND_Google')]

        # Delete BACKGROUND_Google class from dataset labels
        self.classes = sorted(os.listdir(os.path.join(self.root, "")))
        self.classes.remove("BACKGROUND_Google")


    def __getitem__(self, index):
        ''' 
        __getitem__ should access an element through its index
        Args:
            index (int): Index
        Returns:
            tuple: (sample, target) where target is class_index of the target class.
        '''

        img = Image.open(os.path.join(self.root, self.elements[index].rstrip()))

        target = self.classes.index(self.elements[index].rstrip().split('/')[0])

        image, label = img, target # Provide a way to access image and label via index
                           # Image should be a PIL Image
                           # label can be int

        # Applies preprocessing when accessing the image
        if self.transform is not None:
            image = self.transform(image)

        return image, label

    def __len__(self):
        '''
        The __len__ method returns the length of the dataset
        It is mandatory, as this is used by several other components
        '''
        # Provides a way to get the length (number of elements) of the dataset
        length =  len(self.elements)
        return length

The preprocessing is done by the following code:

# Define transforms for training phase
train_transform = transforms.Compose([transforms.Resize(256),      # Resizes the shorter side of the PIL image to 256
                                      transforms.CenterCrop(224),  # Crops a central square patch of the image
                                                                   # 224 because torchvision's AlexNet needs a 224x224 input!
                                                                   # Remember this when applying different transformations, otherwise you get an error
                                      transforms.ToTensor(), # Turn PIL Image to torch.Tensor
                                      transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)) # Normalizes tensor with mean and standard deviation
])
# Define transforms for the evaluation phase
eval_transform = transforms.Compose([transforms.Resize(256),
                                     transforms.CenterCrop(224),
                                     transforms.ToTensor(),
                                     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

Finally, this is how the datasets and data loaders are prepared:

# Clone github repository with data
if not os.path.isdir('./Homework2-Caltech101'):
  !git clone https://github.com/MachineLearning2020/Homework2-Caltech101.git

# Commands to execute when there is an error saying no file or directory related to ./Homework2-Caltech101/
# !rm -r ./Homework2-Caltech101/
# !git clone https://github.com/MachineLearning2020/Homework2-Caltech101.git

DATA_DIR = 'Homework2-Caltech101/101_ObjectCategories'
SPLIT_TRAIN = 'Homework2-Caltech101/train.txt'
SPLIT_TEST = 'Homework2-Caltech101/test.txt'


# 1 - Data preparation
myTrainDS = Caltech(DATA_DIR, split = SPLIT_TRAIN, transform=train_transform)
myTestDS = Caltech(DATA_DIR, split = SPLIT_TEST, transform=eval_transform)

print('My Train DS: {}'.format(len(myTrainDS)))
print('My Test DS: {}'.format(len(myTestDS)))

# 1 - Data preparation
myTrain_dataloader = DataLoader(myTrainDS, batch_size=BATCH_SIZE, shuffle=True, num_workers=4, drop_last=True)
myTest_dataloader = DataLoader(myTestDS, batch_size=BATCH_SIZE, shuffle=False, num_workers=4)

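The two .txt files contain the lists of images we want in the train and test splits, so we have to read them from there, but that part should already be done correctly. The problem is that when I reach the training phase (see the code later), I get the error in the title. I have already tried adding the following line to the transform: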

[...]
transforms.Lambda(lambda x: x.repeat(3, 1, 1)),

after the CenterCrop, but it says that Image has no attribute repeat, so I'm a bit stuck.

The line of the training code that raises the error is the following:

# Iterate over the dataset
  for images, labels in myTrain_dataloader:

In case it is needed, the full error is:

RuntimeError                              Traceback (most recent call last)

<ipython-input-197-0e4710a9855d> in <module>()
     47 
     48   # Iterate over the dataset
---> 49   for images, labels in myTrain_dataloader:
     50 
     51     # Bring data over the device of choice

2 frames

/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py in __next__(self)
    817             else:
    818                 del self._task_info[idx]
--> 819                 return self._process_data(data)
    820 
    821     next = __next__  # Python 2 compatibility

/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py in _process_data(self, data)
    844         self._try_put_index()
    845         if isinstance(data, ExceptionWrapper):
--> 846             data.reraise()
    847         return data
    848 

/usr/local/lib/python3.6/dist-packages/torch/_utils.py in reraise(self)
    383             # (https://bugs.python.org/issue2651), so we work around it.
    384             msg = KeyErrorMessage(msg)
--> 385         raise self.exc_type(msg)

RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "<ipython-input-180-0b00b175e18c>", line 72, in __getitem__
    image = self.transform(image)
  File "/usr/local/lib/python3.6/dist-packages/torchvision/transforms/transforms.py", line 70, in __call__
    img = t(img)
  File "/usr/local/lib/python3.6/dist-packages/torchvision/transforms/transforms.py", line 175, in __call__
    return F.normalize(tensor, self.mean, self.std, self.inplace)
  File "/usr/local/lib/python3.6/dist-packages/torchvision/transforms/functional.py", line 217, in normalize
    tensor.sub_(mean[:, None, None]).div_(std[:, None, None])
RuntimeError: output with shape [1, 224, 224] doesn't match the broadcast shape [3, 224, 224]

I'm using AlexNet, and the code I was given is the following:

net = alexnet() # Loading AlexNet model

# AlexNet has 1000 output neurons, corresponding to the 1000 ImageNet's classes
# We need 101 outputs for Caltech-101
net.classifier[6] = nn.Linear(4096, NUM_CLASSES) # nn.Linear in pytorch is a fully connected layer
                                                 # The convolutional layer is nn.Conv2d

# We just changed the last layer of AlexNet with a new fully connected layer with 101 outputs
# It is mandatory to study torchvision.models.alexnet source code

Best answer

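The first dimension of the tensor is the color channel, so your error means that you are feeding in a grayscale picture (1 channel), while the transforms run by the data loader (here Normalize with three means and three standard deviations) expect an RGB image (3 channels). You defined a pil_loader function that returns an RGB image, but you never use it.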
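The failing line in the traceback, tensor.sub_(mean[:, None, None]), can be reproduced in isolation. The following is a minimal sketch, assuming only that torch is installed; the names gray and mean are made up for illustration and do not come from the question's code.

```python
import torch

# A grayscale image after ToTensor() has a single channel: [1, 224, 224].
gray = torch.zeros(1, 224, 224)

# Normalize((0.5, 0.5, 0.5), ...) reshapes the three means to [3, 1, 1].
mean = torch.tensor([0.5, 0.5, 0.5])[:, None, None]

# Out-of-place subtraction broadcasts both operands to [3, 224, 224] ...
out_of_place_shape = (gray - mean).shape
print(out_of_place_shape)

# ... but the in-place version has to write the result back into `gray`,
# which only has room for one channel, so PyTorch raises a RuntimeError.
try:
    gray.sub_(mean)
    failed = False
except RuntimeError as err:
    failed = True
    print(err)
```

The in-place call is what turns a harmless broadcast into an error: the result needs three channels, and the single-channel input tensor cannot hold them.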

So you have two options:

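1. Use grayscale images instead of RGB, which is cheaper from a computational point of view. Solution: in both the train and test transforms, change transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)) to transforms.Normalize((0.5,), (0.5,)).
2. Make sure your images are in RGB format. I don't know how your images are stored, but I guess you downloaded the dataset in grayscale. One thing you could try is to use the pil_loader function you defined: in the __getitem__ function, change img = Image.open(os.path.join(self.root, self.elements[index].rstrip())) to img = pil_loader(os.path.join(self.root, self.elements[index].rstrip())).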

Let me know how it goes!

Regarding python - RuntimeError: output with shape [1, 224, 224] doesn't match the broadcast shape [3, 224, 224], a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59218671/
