- html - 出于某种原因,IE8 对我的 Sass 文件中继承的 html5 CSS 不友好?
- JMeter 在响应断言中使用 span 标签的问题
- html - 在 :hover and :active? 上具有不同效果的 CSS 动画
- html - 相对于居中的 html 内容固定的 CSS 重复背景?
您好,我正在尝试修改 mnist 示例以使其与我的数据集相匹配。我只尝试使用 mlp 示例,但它给出了一个奇怪的错误。
数据集是一个有 2100 行和 17 列的矩阵,输出应该是 16 个可能的类别之一。该错误似乎发生在培训的第二阶段。模型构建正确(日志信息已确认)。
错误日志如下:
ValueError: y_i value out of bounds
Apply node that caused the error:
CrossentropySoftmaxArgmax1HotWithBias(Dot22.0, b, targets)
Toposort index: 33
Inputs types: [TensorType(float64, matrix), TensorType(float64, vector), >TensorType(int32, vector)]
Inputs shapes: [(100, 17), (17,), (100,)]
Inputs strides: [(136, 8), (8,), (4,)]
Inputs values: ['not shown', 'not shown', 'not shown']
Outputs clients: [[Sum{acc_dtype=float64}(CrossentropySoftmaxArgmax1HotWithBias.0)], [CrossentropySoftmax1HotWithBiasDx(Assert{msg='
sm
anddy
do not have the same shape.'}.0, CrossentropySoftmaxArgmax1HotWithBias.1, targets)], []]HINT: Re-running with most Theano optimization disabled could give you a >back-trace of when this node was created. This can be done with by >setting the Theano flag 'optimizer=fast_compile'. If that does not work, >Theano optimizations can be disabled with 'optimizer=None'. HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
代码如下:
def build_mlp(input_var=None):
l_in = lasagne.layers.InputLayer(shape=(None, 16),
input_var=input_var)
# Apply 20% dropout to the input data:
l_in_drop = lasagne.layers.DropoutLayer(l_in, p=0.2)
# Add a fully-connected layer of 800 units, using the linear rectifier, and
# initializing weights with Glorot's scheme (which is the default anyway):
l_hid1 = lasagne.layers.DenseLayer(
l_in_drop, num_units=10,
nonlinearity=lasagne.nonlinearities.rectify,
W=lasagne.init.GlorotUniform())
# We'll now add dropout of 50%:
l_hid1_drop = lasagne.layers.DropoutLayer(l_hid1, p=0.5)
# Another 800-unit layer:
l_hid2 = lasagne.layers.DenseLayer(
l_hid1_drop, num_units=10,
nonlinearity=lasagne.nonlinearities.rectify)
# 50% dropout again:
l_hid2_drop = lasagne.layers.DropoutLayer(l_hid2, p=0.5)
# Finally, we'll add the fully-connected output layer, of 10 softmax units:
l_out = lasagne.layers.DenseLayer(
l_hid2_drop, num_units=17,
nonlinearity=lasagne.nonlinearities.softmax)
# Each layer is linked to its incoming layer(s), so we only need to pass
# the output layer to give access to a network in Lasagne:
return l_out
def main(model='mlp', num_epochs=300):
# Load the dataset
print("Loading data...")
X_train, y_train, X_val, y_val, X_test, y_test = load_dataset()
# Prepare Theano variables for inputs and targets
input_var = T.matrix('inputs')
target_var = T.ivector('targets')
# Create neural network model (depending on first command line parameter)
print("Building model and compiling functions...")
if model == 'cnn':
network = build_cnn(input_var)
elif model == 'mlp':
network = build_mlp(input_var)
elif model == 'lstm':
network = build_lstm(input_var)
else:
print("Unrecognized model type %r." % model)
# Create a loss expression for training, i.e., a scalar objective we want
# to minimize (for our multi-class problem, it is the cross-entropy loss):
prediction = lasagne.layers.get_output(network)
loss = lasagne.objectives.categorical_crossentropy(prediction, target_var)
loss = loss.mean()
# We could add some weight decay as well here, see lasagne.regularization.
# Create update expressions for training, i.e., how to modify the
# parameters at each training step. Here, we'll use Stochastic Gradient
# Descent (SGD) with Nesterov momentum, but Lasagne offers plenty more.
params = lasagne.layers.get_all_params(network, trainable=True)
updates = lasagne.updates.nesterov_momentum(
loss, params, learning_rate=0.01, momentum=0.9)
# Create a loss expression for validation/testing. The crucial difference
# here is that we do a deterministic forward pass through the network,
# disabling dropout layers.
test_prediction = lasagne.layers.get_output(network, deterministic=True)
test_loss = lasagne.objectives.categorical_crossentropy(test_prediction,
target_var)
test_loss = test_loss.mean()
# As a bonus, also create an expression for the classification accuracy:
test_acc = T.mean(T.eq(T.argmax(test_prediction, axis=1), target_var),
dtype=theano.config.floatX)
# Compile a function performing a training step on a mini-batch (by giving
# the updates dictionary) and returning the corresponding training loss:
train_fn = theano.function([input_var, target_var], loss, updates=updates)
# Compile a second function computing the validation loss and accuracy:
val_fn = theano.function([input_var, target_var], [test_loss, test_acc])
# Finally, launch the training loop.
print("Starting training...")
# We iterate over epochs:
for epoch in range(num_epochs):
# In each epoch, we do a full pass over the training data:
train_err = 0
train_batches = 0
start_time = time.time()
for batch in iterate_minibatches(X_train, y_train, 100, shuffle=True):
inputs, targets = batch
train_err += train_fn(inputs, targets)
train_batches += 1
# And a full pass over the validation data:
val_err = 0
val_acc = 0
val_batches = 0
for batch in iterate_minibatches(X_val, y_val, 100, shuffle=False):
inputs, targets = batch
err, acc = val_fn(inputs, targets)
val_err += err
val_acc += acc
val_batches += 1
# Then we print the results for this epoch:
print("Epoch {} of {} took {:.3f}s".format(
epoch + 1, num_epochs, time.time() - start_time))
print(" training loss:\t\t{:.6f}".format(train_err / train_batches))
print(" validation loss:\t\t{:.6f}".format(val_err / val_batches))
print(" validation accuracy:\t\t{:.2f} %".format(
val_acc / val_batches * 100))
# After training, we compute and print the test error:
test_err = 0
test_acc = 0
test_batches = 0
for batch in iterate_minibatches(X_test, y_test, 100, shuffle=False):
inputs, targets = batch
err, acc = val_fn(inputs, targets)
test_err += err
test_acc += acc
test_batches += 1
print("Final results:")
print(" test loss:\t\t\t{:.6f}".format(test_err / test_batches))
print(" test accuracy:\t\t{:.2f} %".format(
test_acc / test_batches * 100))
最佳答案
我发现了问题:我的数据集没有针对每个目标的输出,因为它太小了!有 17 个目标输出,但我的数据集只有 16 个不同的输出,并且缺少第 17 个输出的示例。
为了解决这个问题,只要把softmax改成rectify,
来自这里:
l_out = lasagne.layers.DenseLayer(
l_hid2_drop, num_units=17,
nonlinearity=lasagne.nonlinearities.softmax)
为此:
l_out = lasagne.layers.DenseLayer(
l_hid2_drop, num_units=17,
nonlinearity=lasagne.nonlinearities.rectify)
关于python - Lasagne mlp 目标越界,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/33363582/
我最近才开始学习 Clojure,所以很抱歉,如果这有点初级: 有人可以向我解释一下两者之间的区别吗: => (def a (lazy-cat [0]
我有一些看起来像这样的数据: X = [[1,2,3,4],[01010],[-1.6]] y = [[4,2]] 我正在尝试使用 tflearn 在这些数据上训练神经网络。我使用的是 TFlearn
我的代码有问题。 我正在尝试从 .txt 文件中提取 channel 名称。我不明白为什么方法 line.split() 给我返回一个长度为 0 的数组: 有人可以帮助我吗? 这是文件.txt: --
def sigmoid(z): # complete the code z = np.asarray(z) if z.ndim == 0: return(1/(1+np.exp(-z))) e
我在访问 3d 数组内的值时遇到问题。有时它给出正确的值,但有时它给出随机的数值。数组内不存在。 import java.util.*; public class Main { public
我有一段代码,执行时会出现此错误。而且我比较新,我似乎无法解决问题。 错误:2011-09-06 12:31:06.094 ForceGauge[266:707] CoreAnimation:忽略异常
我正在尝试限制 http://www.liftdesignphoto.com/test/ 中的滚动因为它让电梯超出了界限。 有没有办法重新计算位置,使其不越界? (也许使用 %)。 谢谢 最佳答案 假
我正在尝试遍历 6 个“国际象棋”棋子的列表。每轮他们移动一个随机数量,如果他们落在另一个上,他们就会“杀死”它。 问题是,当我的 vector 中的最后一 block 杀死另一 block 时,我收
NumberPicker serviceWheel = (NumberPicker) findViewById(R.id.serviceSelector); serviceWheel.setMaxVa
我正在尝试使用 GridLayout 重现此计算器布局 但这就是我用我尝试过的代码得到的结果。 事实上,在设备上情况会变得更糟,它会削减更多必须跨越两行的最后一个相等按钮。
运行测试脚本时出现“标签越界”错误。将注释值与类数进行比较时,confusion_matrix 函数会抛出错误。在我的例子中,注释值是一个图像(560x560)和 number_of_classes
什么是 OOL(越界)代码?我在 ION 编译器中找到了它,但无法理解发生了什么。 bool CodeGeneratorShared::generateOutOfLineCode() { for
这是我正在研究的有趣的事情。 varray.c: static GLint vertices[] = {25, 25, 100, 325,
我的程序将文件读取到字节数组中,然后尝试从该文件中提取 bmp 图像。问题是我遇到了越界错误。 { public static void main( String[] args ) {
我有一个 UITableView,它由从 XML 提要解析的数据数组填充。我正在努力寻找此错误的原因,并想知道是否有人可以帮助我。该错误不会经常发生。它仅在数组数量很大时发生,例如 10-15 个对象
public class GameEntry { private String name; private int score; public GameEntry(String
我遇到了 Storyboard的问题(至少有点惊讶)。 我有一个 ViewController,它包含一个容器 View 以及各种 ImageView 。自然地,选择的 ImageView 决定了容器
我正在尝试为一些 textfield 设置动画。即在屏幕外开始动画并移动到屏幕中央。但就我而言,动画从中心开始并超出 bounds。当我在 viewWillAppear/viewDidAppear 中
closeTs在struct tic给我一个错误 - tsP=0x66 .我尝试从 oracle 条目中填充它,如果没有,我尝试分配一个值。但我在 fillFields 中访问错误.有人可以给我提示
public class Registration { public static void main(String[] args) { final String MY
我是一名优秀的程序员,十分优秀!