matlab - 为什么MatConvNet说数据和导数没有匹配的格式？-6ren

matlab - 为什么MatConvNet说数据和导数没有匹配的格式？

转载作者：行者123 更新时间：2023-11-30 08:41:14

问题陈述

我使用 MatConvNet 使用示例库附带的函数 cnn_train 构建一个非常简单的一维示例和小型网络。按照他们的示例，我构建了一个小型 CNN 示例，如下所示:

    clc;clear;clc;clear;
%% prepare Data
M = 32; %batch size
X_train = zeros(1,1,1,M); % (1 1 1 2) = (1 1 1 M)
for m=1:M,
    X_train(:,:,:,m) = m; %training example value
end
Y_test = 10*X_train;
split = ones(1,M);
split(floor(M*0.75):end) = 2;
% load image dadabase (imgdb)
imdb.images.data = X_train;
imdb.images.label = Y_test;
imdb.images.set = split;
%% prepare parameters
L1=3;
w1 = randn(1,1,1,L1); %1st layer weights
w2 = randn(1,1,1,L1); %2nd layer weights
b1 = randn(1,1,1,L1); %1st layer biases
b2 = randn(1,1,1,L1); %2nd layer biases
G1 = ones(1,1,1,L1); % (1 1 1 3) = (1 1 1 L1) BN scale, one per  dimension
B1 = zeros(1,1,1,L1); % (1 1 1 3) = (1 1 1 L1) BN shift, one per  dimension
EPS = 1e-4;
%% make CNN layers: conv, BN, relu, conv, pdist, l2-loss
net.layers = {} ;
net.layers{end+1} = struct('type', 'conv', ...
                           'name', 'conv1', ...
                           'weights', {{w1, b1}}, ...
                           'pad', 0) ;
net.layers{end+1} = struct('type', 'bnorm', ...
                           'weights', {{G1, B1}}, ...
                           'EPSILON', EPS, ...
                           'learningRate', [1 1 0.05], ...
                           'weightDecay', [0 0]) ;                       
net.layers{end+1} = struct('type', 'relu', ...
                           'name', 'relu1' ) ;
net.layers{end+1} = struct('type', 'conv', ...
                           'name', 'conv2', ...
                           'weights', {{w2, b2}}, ...
                           'pad', 0) ;
net.layers{end+1} = struct('type', 'pdist', ...
                           'name', 'averageing1', ...
                           'class', 0, ...
                           'p', 1) ;
%% add L2-loss                   
fwfun = @l2LossForward;
bwfun = @l2LossBackward;
net = addCustomLossLayer(net, fwfun, bwfun) ;
net.layers{end}.class = Y_test; % its the test set
net = vl_simplenn_tidy(net) ;
res = vl_simplenn(net, X_train);
%% prepare train options
trainOpts.expDir = 'results/' ; %save results/trained cnn
trainOpts.gpus = [] ;
trainOpts.batchSize = 2 ;
trainOpts.learningRate = 0.02 ;
trainOpts.plotDiagnostics = false ;
%trainOpts.plotDiagnostics = true ; % Uncomment to plot diagnostics
trainOpts.numEpochs = 20 ; % number of training epochs
trainOpts.errorFunction = 'none' ;
%% CNN TRAIN
vl_simplenn_display(net) ;
net = cnn_train(net, imdb, @getBatch, trainOpts) ;

我根据 the example they provided 创建了这个，每当我运行该示例时，我都会收到错误:

Error using vl_nnconv
DATA and DEROUTPUT do not have compatible formats.

Error in vl_simplenn (line 397)
          [res(i).dzdx, dzdw{1}, dzdw{2}] = vl_nnconv(res(i).x, l.weights{1},
          l.weights{2}, res(i+1).dzdx)

Error in cnn_train>process_epoch (line 323)
    res = vl_simplenn(net, im, dzdy, res, ...

Error in cnn_train (line 139)
    [net,stats.train,prof] = process_epoch(opts, getBatch, epoch, train, learningRate,
    imdb, net) ;

Error in main_1D_1layer_hard_coded_example (line 64)
net = cnn_train(net, imdb, @getBatch, trainOpts) ;

有人知道发生了什么事吗？这个例子实际上应该很简单，所以它让我困惑什么可能是错误的。

<小时/>

我尝试解决此问题的补充部分。

有关我尝试解决此问题的更多详细信息，请提前阅读。

我转到文件中导致错误的那一行，并将输入打印到该函数，以确保我给出的参数有意义，并且在这方面似乎一切都很好:

  case 'conv'
      size(res(i).x)
      size(res(i+1).dzdx)
      size(l.weights{1})
      size(l.weights{2})
      [res(i).dzdx, dzdw{1}, dzdw{2}] = vl_nnconv(res(i).x, l.weights{1}, l.weights{2}, res(i+1).dzdx)
    [res(i).dzdx, dzdw{1}, dzdw{2}] = ...
      vl_nnconv(res(i).x, l.weights{1}, l.weights{2}, res(i+1).dzdx, ...
      'pad', l.pad, ...
      'stride', l.stride, ...
      l.opts{:}, ...
      cudnn{:}) ;

打印:

ans =

     1     1     3    16


ans =

     1     1     3    16


ans =

     1     1     1     3


ans =

     1     1     1     3

我所期望的。

我什至手动硬编码了网络应该计算的衍生品链，并且该文件似乎工作正常:

clc;clear;clc;clear;
%% prepare Data
M = 3;
x = zeros(1,1,1,M); % (1 1 1 2) = (1 1 1 M)
for m=1:M,
    x(:,:,:,m) = m;
end
Y = 5;
r=Y;
%% parameters
L1 = 3;
w1 = randn(1,1,1,L1); % (1 1 1 L1) = (1 1 1 3)
b1 = ones(1,L1);
w2 = randn(1,1,1,L1); % (1 1 1 L1) = (1 1 1 3)
b2 = ones(1,L1);
G1 = ones(1,1,1,L1); % (1 1 1 3) = (1 1 1 L1) BN scale, one per  dimension
B1 = zeros(1,1,1,L1); % (1 1 1 3) = (1 1 1 L1) BN shift, one per  dimension
EPS = 1e-4;
%% Forward Pass
z1 = vl_nnconv(x,w1,b1); % (1 1 3 2) = (1 1 L1 M)
%bn1 = z1;
bn1 = vl_nnbnorm(z1,G1,B1,'EPSILON',EPS); % (1 1 3 2) = (1 1 L1 M)
a1 = vl_nnrelu(bn1); % (1 1 3 2) = (1 1 L1 M) 
z2 = vl_nnconv(a1,w2,b2);
y1 = vl_nnpdist(z2, 0, 1);
loss_forward = l2LossForward(y1,Y);
%%
net.layers = {} ;
net.layers{end+1} = struct('type', 'conv', ...
                           'name', 'conv1', ...
                           'weights', {{w1, b1}}, ...
                           'pad', 0) ;
net.layers{end+1} = struct('type', 'bnorm', ...
                           'weights', {{G1, B1}}, ...
                           'EPSILON', EPS, ...
                           'learningRate', [1 1 0.05], ...
                           'weightDecay', [0 0]) ;                       
net.layers{end+1} = struct('type', 'relu', ...
                           'name', 'relu1' ) ;
net.layers{end+1} = struct('type', 'conv', ...
                           'name', 'conv2', ...
                           'weights', {{w2, b2}}, ...
                           'pad', 0) ;
net.layers{end+1} = struct('type', 'pdist', ...
                           'name', 'averageing1', ...
                           'class', 0, ...
                           'p', 1) ;
fwfun = @l2LossForward;
bwfun = @l2LossBackward;
net = addCustomLossLayer(net, fwfun, bwfun) ;
net.layers{end}.class = Y;
net = vl_simplenn_tidy(net) ;
res = vl_simplenn(net, x);
%%
loss_forward = squeeze( loss_forward ) % (1 1)
loss_res = squeeze( res(end).x ) % (1 1)
%% Backward Pass
p = 1;
dldx = l2LossBackward(y1,r,p);
dy1dx = vl_nnpdist(z2, 0, 1, dldx);
[dz2dx, dz2dw2] = vl_nnconv(a1, w2, b2, dy1dx);
da1dx = vl_nnrelu(bn1, dz2dx);
[dbn1dx,dbn1dG1,dbn1dB1] = vl_nnbnorm(z1,G1,B1,da1dx);
[dz1dx, dz1dw1] = vl_nnconv(x, w1, b1, dbn1dx);
%%
dzdy = 1;
res = vl_simplenn(net, x, dzdy, res);
%%
% func = @(x) proj(p, forward(x, x0)) ;
% err = checkDerivativeNumerically(f, x, dx)
% %%
dz1dx = squeeze(dz1dx)
dz1dx_vl_simplenn = squeeze(res(1).dzdx)

导数似乎是数学式的，所以我假设该文件中的所有内容都有效。它不会抛出错误，所以它甚至没有运行这一事实让我非常困惑。有人知道这是怎么回事吗？

<小时/>

我加载 CNN 的方式基于 the example file他们提供了该教程。我将粘贴该文件的重要方面的摘要(可以与 cnn_train 函数一起正常运行，而我的则不能)。

setup() ;
% setup('useGpu', true); % Uncomment to initialise with a GPU support
%% Part 3.1: Prepare the data
% Load a database of blurred images to train from
imdb = load('data/text_imdb.mat') ;

%% Part 3.2: Create a network architecture

net = initializeSmallCNN() ;
%net = initializeLargeCNN() ;
% Display network
vl_simplenn_display(net) ;

%% Part 3.3: learn the model
% Add a loss (using a custom layer)
net = addCustomLossLayer(net, @l2LossForward, @l2LossBackward) ;

% Train
trainOpts.expDir = 'data/text-small' ;
trainOpts.gpus = [] ;
% Uncomment for GPU training:
%trainOpts.expDir = 'data/text-small-gpu' ;
%trainOpts.gpus = [1] ;
trainOpts.batchSize = 16 ;
trainOpts.learningRate = 0.02 ;
trainOpts.plotDiagnostics = false ;
%trainOpts.plotDiagnostics = true ; % Uncomment to plot diagnostics
trainOpts.numEpochs = 20 ;
trainOpts.errorFunction = 'none' ;

net = cnn_train(net, imdb, @getBatch, trainOpts) ;

最佳答案

w2 的尺寸应为 1x1x3x3。

通常偏差会被指定为 1x3，因为它们只有一个维度(或者权重为 1x1x3xN，相应偏差为 1xN，其中 N 是滤波器的数量)，B1 和 G1 也是如此(这里是1xM，其中M是前一层的滤波器数量)。但无论哪种方式都可能有效。

在您的示例中，第一次卷积后 x 的尺寸为 1x1x3x16。这意味着一个批处理中有 16 个元素，其中每个元素的宽度和高度为 1，深度为 3。深度为 3，因为第一个卷积是使用 3 个滤波器完成的(w1 的尺寸为 1x1x1x3)。

示例中的 w2 的尺寸为 1x1x1x3，表示 3 个宽度、高度和深度为 1 的过滤器。因此过滤器的深度与输入的深度不匹配。

关于matlab - 为什么MatConvNet说数据和导数没有匹配的格式？，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/37769528/

文章推荐： java - AndroidLocationPlayServiceManager 上的构建错误

文章推荐： javascript - Bootstrap 单选按钮上的 onclick 不起作用

文章推荐： java - PowerMockito 可以使用 whenNew 检查参数吗？

文章推荐： Javascript:将按键事件传递给 iframe

awk - 如果行与“foo”匹配，线上方与“bar”匹配，线下方与“baz”匹配，则删除行？
使用sed和/或awk，仅在行包含字符串“ foo”并且行之前和之后的行分别包含字符串“ bar”和“ baz”时，我才希望删除行。因此，对于此输入： blah blah foo blah bar
c# - 如何按 X% 匹配 2 个字符串(即 >90% 匹配)
例如: S1: "some filename contains few words.txt" S2:“一些文件名包含几个单词 - draft.txt” S3:“一些文件名包含几个单词 - 另一个 dr
R 合并数据帧，允许不精确的 ID 匹配(例如，附加字符 1234 匹配 ab1234)
我正在尝试处理一些非常困惑的数据。我需要通过样本 ID 合并两个包含不同类型数据的大数据框。问题是一张表的样本 ID 有许多不同的格式，但大多数都包含用于匹配其 ID 中某处所需的 ID 字符串，例如
css - 匹配 col-md 时显示 div，匹配 col-sm 时不显示
我想在匹配特定屏幕尺寸时显示特定图像。在这种情况下，对于 Bootstrap ，我使用 col-xx-## 作为我的选择。但似乎它并没有真正按照我认为应该的方式工作。基本思路，我想显示一种全屏图像，
apache - mod_rewrite 问题 : RewriteCond %{REQUEST_FILENAME} ! -f 匹配，即使 REQUEST_FILENAME 不应(完全)匹配
出于某种原因，这条规则 RewriteCond %{REQUEST_FILENAME} !-f RewriteCond %{REQUEST_FILENAME} !-d RewriteRule ^(.*
F# 匹配 ->
我想做类似的东西(Nemerle 语法) def something = match(STT) | 1 with st= "Summ" | 2 with st= "AVG" =>
JavaScript 匹配
假设这是我的代码 var str="abc=1234587;abc=19855284;abc=1234587;abc=19855284;abc=1234587;abc=19855284;abc=123
JavaScript 匹配
我怎样才能得到这个字符串的数字:'(31.5393701, -82.46235569999999)' 我已经在尝试了，但这离解决方案还很远:) text.match(/$(\d+),(\d+)$/
JavaScript 匹配
如何去除输出中的逗号 (,)？有没有更好的方法从字符串或句子中搜索 url。 alert(" http://www.cnn.com df".match(/https?:\/\/([-\w\.]+
Python - 匹配
a = ('one', 'two') b = ('ten', 'ten') z = [('four', 'five', 'six'), ('one', 'two', 'twenty')] 我正在尝试
vba - 循环遍历行和列时的索引/匹配
我已经编写了以下代码，我希望用它来查找从第 21 列到另一张表中最后一行的值，并根据这张表中 A 列和另一张表中 B 列中的值将它们返回到这张表床单。当我使用下面的代码时，我得到一个工作表错误。你能
Excel 匹配 IF 语句未正确评估
我在以下结构中有两列 A B 1 49 4922039670 我已经能够评估 =LEN(A1)如2 , =LEFT(B1,2)如49 , 和 =LEFT(B1,LEN(A1)
基于行首的 Vim 匹配
我有一个文件，其中一行可以以 + 开头, -或 * .在其中一些行之间可以有以字母或数字(一般文本)开头的行(也包含这些字符，但不在第 1 列中!)。知道这一点，设置匹配和突出显示机制的最简单方法是
正则表达式:匹配，但如果在评论中则不匹配
我有一个数据字段文件，其中可能包含注释，如下所示: id, data, data, data 101 a, b, c 102 d, e, f 103 g, h, i // has to do with
匹配 url 的正则表达式模式
我有以下模式:/^\/(?P.+)$/匹配:/url . 我的问题是它也匹配 /url/page ，如何忽略/在这个正则表达式中？该模式应该: 模式匹配:/url 模式不匹配:/url/page 提
r - R中多维度的聚类/匹配
我有一个非常庞大且复杂的数据集，其中包含许多对公司的观察。公司的一些观察是多余的，我需要制作一个键来将多余的观察映射到一个单独的观察。然而，判断他们是否真的代表同一家公司的唯一方法是通过各种变量的相似
xpath 匹配 - 查找值不在值集中的标签是否存在
我有以下 XML A B C 我想查找 if not(exists(//Record/subRecord
javascript - 匹配/不匹配的正则表达式上没有出现警报框？
我制作了一个正则表达式来验证潜在的比特币地址，现在当我单击报价按钮时，我希望根据正则表达式检查表单中输入的值，但它不起作用。 https://jsfiddle.net/arkqdc8a/5/ var
sql - 检查支架是否平衡/匹配
我有一些 MS Word 文档，我已将其全部内容转移到 SQL 表中。内容包含多个方括号和大括号，例如 [{a} as at [b],] {c,} {d,} etc 我需要进行检查以确保括号平衡/匹
JavaScript Unicode 匹配
我正在使用 Node.js 从 XML 文件读取数据。但是当我尝试将文件中的数据与文字进行比较时，它不匹配，即使它看起来相同: const parser: xml2js.Parser = new

行者123

个人简介

我是一名优秀的程序员,十分优秀！

作者热门文章

滴滴打车优惠券免费领取

全站热门文章

首页

博学

6Ren·AI

商城

matlab - 为什么MatConvNet说数据和导数没有匹配的格式？