python - Caffe: Extremely high loss while learning a simple linear function

Repost. Author: 太空宇宙. Updated: 2023-11-03 17:37:13

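I am trying to train a neural network to learn the function y = x1 + x2 + x3. The goal is to experiment with Caffe in order to learn and understand it better. The required data is generated synthetically in Python and written out as LMDB database files.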

Code for data generation:

import numpy as np
import lmdb
import caffe

Ntrain = 100
Ntest = 20
K = 3
H = 1
W = 1

Xtrain = np.random.randint(0,1000, size = (Ntrain,K,H,W))
Xtest = np.random.randint(0,1000, size = (Ntest,K,H,W))

ytrain = Xtrain[:,0,0,0] + Xtrain[:,1,0,0] + Xtrain[:,2,0,0]
ytest = Xtest[:,0,0,0] + Xtest[:,1,0,0] + Xtest[:,2,0,0]

env = lmdb.open('expt/expt_train')

for i in range(Ntrain):
    datum = caffe.proto.caffe_pb2.Datum()
    datum.channels = Xtrain.shape[1]
    datum.height = Xtrain.shape[2]
    datum.width = Xtrain.shape[3]
    datum.data = Xtrain[i].tobytes()
    datum.label = int(ytrain[i])
    str_id = '{:08}'.format(i)

    with env.begin(write=True) as txn:
        txn.put(str_id.encode('ascii'), datum.SerializeToString())


env = lmdb.open('expt/expt_test')

for i in range(Ntest):
    datum = caffe.proto.caffe_pb2.Datum()
    datum.channels = Xtest.shape[1]
    datum.height = Xtest.shape[2]
    datum.width = Xtest.shape[3]
    datum.data = Xtest[i].tobytes()
    datum.label = int(ytest[i])
    str_id = '{:08}'.format(i)

    with env.begin(write=True) as txn:
        txn.put(str_id.encode('ascii'), datum.SerializeToString())

Solver prototxt file:

net: "expt/expt.prototxt"

display: 1
max_iter: 200
test_iter: 20
test_interval: 100

base_lr: 0.000001
momentum: 0.9
# weight_decay: 0.0005

lr_policy: "inv"
# gamma: 0.5
# stepsize: 10
# power: 0.75

snapshot_prefix: "expt/expt"
snapshot_diff: true

solver_mode: CPU
solver_type: SGD

debug_info: true
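
The solver selects lr_policy "inv" but leaves gamma and power commented out. Caffe documents this policy as base_lr * (1 + gamma * iter) ^ (-power). The sketch below evaluates that formula in plain Python; the helper name inv_lr is hypothetical. It assumes unset gamma and power are read as 0 and uses an illustrative gamma of 0.0001, neither of which comes from the question. Under the first assumption the rate would stay at base_lr for the whole run.

```python
def inv_lr(base_lr, gamma, power, iteration):
    """Caffe's documented "inv" policy: base_lr * (1 + gamma * iter) ^ (-power)."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)

base_lr = 0.000001  # value from the solver file

# Assumption: gamma and power left unset are treated as 0, so nothing decays.
unset = [inv_lr(base_lr, 0.0, 0.0, it) for it in (0, 100, 200)]

# Hypothetical settings for comparison: the commented-out power of 0.75
# together with an illustrative gamma of 0.0001.
decayed = [inv_lr(base_lr, 0.0001, 0.75, it) for it in (0, 100, 200)]

print(unset)
print(decayed)
```

If a decaying rate is intended, both gamma and power would need to be set explicitly.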

Caffe model:

name: "expt"


layer {
  name: "Expt_Data_Train"
  type: "Data"
  top: "data"
  top: "label"

  include {
    phase: TRAIN
  }

  data_param {
    source: "expt/expt_train"
    backend: LMDB
    batch_size: 1
  }
}


layer {
  name: "Expt_Data_Validate"
  type: "Data"
  top: "data"
  top: "label"

  include {
    phase: TEST
  }

  data_param {
    source: "expt/expt_test"
    backend: LMDB
    batch_size: 1
  }
}


layer {
  name: "IP"
  type: "InnerProduct"
  bottom: "data"
  top: "ip"

  inner_product_param {
    num_output: 1

    weight_filler {
      type: 'constant'
    }

    bias_filler {
      type: 'constant'
    }
  }
}


layer {
  name: "Loss"
  type: "EuclideanLoss"
  bottom: "ip"
  bottom: "label"
  top: "loss"
}

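The loss I get on the test data is 233,655. This is shocking, because the loss is three orders of magnitude larger than the numbers in the training and test datasets. Moreover, the function to be learned is a simple linear function. I cannot figure out what is wrong with the code. Any suggestions or input would be much appreciated.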
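
For scale, it may help to recall that Caffe's EuclideanLoss is half the squared difference (averaged over the batch), so the loss is in squared units rather than in the units of the labels. The sketch below inverts that formula for the quoted figure. The helper names are hypothetical, and because the reported test loss is an average over the test batches, the result should be read as a rough root-mean-square error, not an exact per-sample value.

```python
import math

def euclidean_loss(pred, label):
    """Per-sample Euclidean loss: half the squared difference."""
    return 0.5 * (pred - label) ** 2

def implied_abs_error(loss):
    """Invert the formula: the |pred - label| that would produce this loss."""
    return math.sqrt(2.0 * loss)

reported_loss = 233655  # test loss quoted in the question
print(round(implied_abs_error(reported_loss), 1))  # 683.6
```

An error of roughly 684 is of the same order as the labels themselves (sums of three values below 1000), which suggests the network is learning almost nothing useful rather than diverging.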

Best answer

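The loss is large in this case because Caffe only accepts the data (i.e. datum.data) in uint8 format and the labels (datum.label) in int32 format. For the labels, however, numpy.int64 also seems to work. The original arrays held values up to 999 in NumPy's default integer type, so their bytes were not read back as the intended numbers. I think datum.data is accepted only as uint8 because Caffe was developed primarily for computer vision tasks, where the inputs are images with RGB values in the range [0, 255], and uint8 captures these using the least memory. I made the following changes to the data generation code: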

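# Inputs must fit in one byte, so draw from [0, 256) and store as uint8
Xtrain = np.uint8(np.random.randint(0,256, size = (Ntrain,K,H,W)))
Xtest = np.uint8(np.random.randint(0,256, size = (Ntest,K,H,W)))

# Cast to int before summing: uint8 addition wraps around at 256, and int() cannot convert a whole array
ytrain = Xtrain[:,0,0,0].astype(int) + Xtrain[:,1,0,0].astype(int) + Xtrain[:,2,0,0].astype(int)
ytest = Xtest[:,0,0,0].astype(int) + Xtest[:,1,0,0].astype(int) + Xtest[:,2,0,0].astype(int)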
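
To illustrate why the question's original arrays were misread, here is a small NumPy sketch of what tobytes() produces for one sample stored as 64-bit integers. The sample values and the pinned little-endian int64 dtype are assumptions chosen for illustration (NumPy's default integer width depends on the platform), and the way Caffe consumes the surplus bytes is likewise an assumption: the sketch only shows what a reader taking one byte per declared value would see.

```python
import numpy as np

# One sample in the original layout: 3 channels, 1x1, 64-bit integers.
# '<i8' pins little-endian int64 so the byte order is the same everywhere.
sample = np.array([517, 42, 999], dtype='<i8').reshape(3, 1, 1)
raw = sample.tobytes()

# The Datum declares channels * height * width = 3 values ...
declared = sample.shape[0] * sample.shape[1] * sample.shape[2]
# ... but the buffer holds 8 bytes per value.
print(declared, len(raw))

# A byte-per-value reader sees only the first 3 bytes, all from the first number.
seen = np.frombuffer(raw, dtype=np.uint8)[:declared]
print(seen.tolist())
```

With uint8 arrays, tobytes() yields exactly one byte per value, so the declared shape and the buffer length agree.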

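After playing around with the net parameters (learning rate, number of iterations, etc.), I get an error on the order of 10^(-6), which I think is pretty good!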
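
As a rough cross-check outside Caffe, the same model (a single linear unit with a bias, trained on the half-squared-error loss) can be sketched in NumPy. This is not Caffe's solver: it uses plain full-batch gradient descent without momentum, and the seed, sample count, and iteration count are assumptions chosen for illustration. Only the learning rate, the zero initialization, and the uint8 input range are taken from the thread.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, an arbitrary choice

# Inputs in the uint8 range, as in the corrected data; 100 samples, 3 features.
X = rng.integers(0, 256, size=(100, 3)).astype(np.float64)
y = X.sum(axis=1)  # target: y = x1 + x2 + x3

w = np.zeros(3)  # zero start, mirroring the 'constant' fillers
b = 0.0
lr = 1e-6        # same value as base_lr in the solver file

def half_mse(w, b):
    """Mean of 0.5 * (pred - y)^2, the Euclidean-loss form."""
    return 0.5 * np.mean((X @ w + b - y) ** 2)

initial_loss = half_mse(w, b)
for _ in range(2000):
    err = X @ w + b - y              # residuals over the whole batch
    w -= lr * (X.T @ err) / len(y)   # gradient step for the weights
    b -= lr * err.mean()             # gradient step for the bias
final_loss = half_mse(w, b)

print(initial_loss, final_loss)
print(w, b)  # weights should approach [1, 1, 1]
```

In this sketch the weights head toward 1 and the loss drops by many orders of magnitude, which is consistent with the small error reported above once the inputs fit in uint8.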

Regarding python - Caffe: Extremely high loss while learning a simple linear function, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/31055033/
