python - How to get mini-batches in PyTorch in a clean and efficient way?


I am trying to do something simple: train a linear model with stochastic gradient descent (SGD) using torch:

import numpy as np

import torch
from torch.autograd import Variable

import pdb

def get_batch2(X, Y, M, dtype):
    X, Y = X.data.numpy(), Y.data.numpy()
    N = len(Y)
    valid_indices = np.array(range(N))
    batch_indices = np.random.choice(valid_indices, size=M, replace=False)
    batch_xs = torch.FloatTensor(X[batch_indices, :]).type(dtype)
    batch_ys = torch.FloatTensor(Y[batch_indices]).type(dtype)
    return Variable(batch_xs, requires_grad=False), Variable(batch_ys, requires_grad=False)

def poly_kernel_matrix(x, D):
    N = len(x)
    Kern = np.zeros((N, D+1))
    for n in range(N):
        for d in range(D+1):
            Kern[n, d] = x[n]**d
    return Kern

## data params
N = 5 # data set size
Degree = 4 # number of dimensions/features
D_sgd = Degree + 1
##
x_true = np.linspace(0, 1, N) # the real data points
y = np.sin(2*np.pi*x_true)
y.shape = (N, 1)
## TORCH
dtype = torch.FloatTensor
# dtype = torch.cuda.FloatTensor # Uncomment this to run on GPU
X_mdl = poly_kernel_matrix(x_true, Degree)
X_mdl = Variable(torch.FloatTensor(X_mdl).type(dtype), requires_grad=False)
y = Variable(torch.FloatTensor(y).type(dtype), requires_grad=False)
## SGD mdl
w_init = torch.zeros(D_sgd, 1).type(dtype)
W = Variable(w_init, requires_grad=True)
M = 5 # mini-batch size
eta = 0.1 # step size
for i in range(500):
    batch_xs, batch_ys = get_batch2(X_mdl, y, M, dtype)
    # Forward pass: compute predicted y using operations on Variables
    y_pred = batch_xs.mm(W)
    # Compute the loss. loss is a Variable of shape (1,), loss.data is a Tensor of shape (1,),
    # and loss.data[0] is a scalar value holding the loss.
    loss = (1/N)*(y_pred - batch_ys).pow(2).sum()
    # Use autograd to compute the backward pass; W will now have gradients
    loss.backward()
    # Update weights using gradient descent; W.data is a Tensor,
    # W.grad is a Variable and W.grad.data is a Tensor.
    W.data -= eta * W.grad.data
    # Manually zero the gradients after updating the weights
    W.grad.data.zero_()

#
c_sgd = W.data.numpy()
X_mdl = X_mdl.data.numpy()
y = y.data.numpy()
#
Xc_pinv = np.dot(X_mdl, c_sgd)
print('J(c_sgd) = ', (1/N)*(np.linalg.norm(y - Xc_pinv)**2))
print('loss = ', loss.data[0])

The code runs fine, but my get_batch2 method seems really dumb/naive. That is probably because I am new to pytorch, but I have not found a good place that discusses how to retrieve batches of data. I went through their tutorials (http://pytorch.org/tutorials/beginner/pytorch_with_examples.html) and the data loading tutorial (http://pytorch.org/tutorials/beginner/data_loading_tutorial.html) with no luck. The tutorials all seem to assume that one already has the batch and the batch size at the beginning, and then proceed to train with that data without ever changing it (see specifically http://pytorch.org/tutorials/beginner/pytorch_with_examples.html#pytorch-variables-and-autograd).

So my question is: do I really need to turn my data back into numpy so that I can fetch a random sample of it and then turn it back into pytorch Variables to be able to train in memory? Is there no way to get mini-batches with torch?

I looked at a few of the functions torch provides, but with no luck:

#pdb.set_trace()
#valid_indices = torch.arange(0,N).numpy()
#valid_indices = np.array( range(N) )
#batch_indices = np.random.choice(valid_indices,size=M,replace=False)
#indices = torch.LongTensor(batch_indices)
#batch_xs, batch_ys = torch.index_select(X_mdl, 0, indices), torch.index_select(y, 0, indices)
#batch_xs,batch_ys = torch.index_select(X_mdl, 0, indices), torch.index_select(y, 0, indices)
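(For reference, on plain tensors the index_select route itself does work in principle; here is a tiny self-contained sketch with made-up toy data, not the data from the script above:)

import torch

X_t = torch.randn(5, 3)                      # toy data: 5 rows, 3 features (plain FloatTensor)
indices = torch.LongTensor([0, 2, 4])        # row indices to select
batch = torch.index_select(X_t, 0, indices)  # picks rows 0, 2 and 4 -> shape (3, 3)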

Even though the code I provided works fine, I worry that it is not an efficient implementation, and that it would slow things down further if I used a GPU (my guess is that it puts things in CPU memory and then fetches them back to push them onto the GPU, which is silly).
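(One way to avoid the numpy round trip entirely would be to draw the indices on the same device as the data, e.g. with torch.randperm. A minimal sketch, assuming X and Y are the Variables from the script above; not benchmarked:)

import torch

def get_batch_device(X, Y, M):
    # draw M distinct row indices as a torch LongTensor (no numpy involved)
    indices = torch.randperm(X.size(0))[:M]
    if X.is_cuda: # keep the index tensor on the GPU when the data lives there
        indices = indices.cuda()
    return X[indices], Y[indices]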


I implemented a new get_batch2 based on the answer that suggested using torch.index_select():

def get_batch2(X, Y, M):
    '''
    get batch for pytorch model
    '''
    # TODO: fix and make it nicer; there is a pytorch forum question about this
    #X, Y = X.data.numpy(), Y.data.numpy()
    X, Y = X, Y
    N = X.size()[0]
    batch_indices = torch.LongTensor(np.random.randint(0, N, size=M)) # indices must lie in [0, N)
    pdb.set_trace()
    batch_xs = torch.index_select(X, 0, batch_indices)
    batch_ys = torch.index_select(Y, 0, batch_indices)
    return Variable(batch_xs, requires_grad=False), Variable(batch_ys, requires_grad=False)

However, this seems to have issues: it does not work unless X and Y are Variables... which is really odd. I posted about it on the pytorch forum: https://discuss.pytorch.org/t/how-to-get-mini-batches-in-pytorch-in-a-clean-and-efficient-way/10322
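(For what it's worth, the mismatch may be that in pre-0.4 PyTorch torch.index_select expects the data and the index to be the same kind of object: a plain LongTensor index for plain tensors, and a Variable-wrapped LongTensor index for Variables. A hedged sketch of the Variable case, not tested against the exact version used here:)

# sketch: when X and Y are Variables, wrap the index tensor in a Variable as well
batch_indices = Variable(torch.LongTensor(np.random.randint(0, N, size=M)))
batch_xs = torch.index_select(X, 0, batch_indices)
batch_ys = torch.index_select(Y, 0, batch_indices)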

Now I am trying to make this work on the GPU. My most recent version:

def get_batch2(X, Y, M, dtype):
    '''
    get batch for pytorch model
    '''
    # TODO: fix and make it nicer; there is a pytorch forum question about this
    #X, Y = X.data.numpy(), Y.data.numpy()
    X, Y = X, Y
    N = X.size()[0]
    if dtype == torch.cuda.FloatTensor:
        batch_indices = torch.cuda.LongTensor(np.random.randint(0, N, size=M)) # note: randint samples with replacement
    else:
        batch_indices = torch.LongTensor(np.random.randint(0, N, size=M)).type(dtype) # note: randint samples with replacement
    pdb.set_trace()
    batch_xs = torch.index_select(X, 0, batch_indices)
    batch_ys = torch.index_select(Y, 0, batch_indices)
    return Variable(batch_xs, requires_grad=False), Variable(batch_ys, requires_grad=False)

The error:

RuntimeError: tried to construct a tensor from a int sequence, but found an item of type numpy.int64 at index (0)

which I don't get. Do I really have to do:

ints = [random.randint(0, N) for i in range(M)]

to get the integers?

It would also be ideal if the data could be Variables. It seems that torch.index_select does not work for data of type Variable.

Even this list of integers still does not work:

TypeError: torch.addmm received an invalid combination of arguments - got (int, torch.cuda.FloatTensor, int, torch.cuda.FloatTensor, torch.FloatTensor, out=torch.cuda.FloatTensor), but expected one of:
* (torch.cuda.FloatTensor source, torch.cuda.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
* (torch.cuda.FloatTensor source, torch.cuda.sparse.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
* (float beta, torch.cuda.FloatTensor source, torch.cuda.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
* (torch.cuda.FloatTensor source, float alpha, torch.cuda.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
* (float beta, torch.cuda.FloatTensor source, torch.cuda.sparse.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
* (torch.cuda.FloatTensor source, float alpha, torch.cuda.sparse.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
* (float beta, torch.cuda.FloatTensor source, float alpha, torch.cuda.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
didn't match because some of the arguments have invalid types: (int, torch.cuda.FloatTensor, int, torch.cuda.FloatTensor, torch.FloatTensor, out=torch.cuda.FloatTensor)
* (float beta, torch.cuda.FloatTensor source, float alpha, torch.cuda.sparse.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
didn't match because some of the arguments have invalid types: (int, torch.cuda.FloatTensor, int, torch.cuda.FloatTensor, torch.FloatTensor, out=torch.cuda.FloatTensor)
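(Reading the list of signatures, the argument that does not match is the single plain torch.FloatTensor sandwiched between torch.cuda.FloatTensors, which suggests the mini-batch stayed on the CPU while W is on the GPU. A hedged guess at a fix is to cast the selected batch with the same dtype as the model before the mm call:)

# sketch: keep the batch on the same device as W (dtype is torch.cuda.FloatTensor on the GPU)
batch_xs = torch.index_select(X, 0, batch_indices).type(dtype)
batch_ys = torch.index_select(Y, 0, batch_indices).type(dtype)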

Best Answer

If I'm understanding your code correctly, your get_batch2 function appears to be taking random mini-batches from your dataset without tracking which indices you have already used within an epoch. The issue with this implementation is that it will likely not make use of all of your data.

The way I usually do batching is to create a random permutation of all the possible indices with torch.randperm(N) and loop through them in batches. For example:

n_epochs = 100 # or whatever
batch_size = 128 # or whatever

for epoch in range(n_epochs):

    # X is a torch Variable
    permutation = torch.randperm(X.size()[0])

    for i in range(0, X.size()[0], batch_size):
        optimizer.zero_grad()

        indices = permutation[i:i+batch_size]
        batch_x, batch_y = X[indices], Y[indices]

        # in case you wanted a semi-full example
        outputs = model.forward(batch_x)
        loss = lossfunction(outputs, batch_y)

        loss.backward()
        optimizer.step()

If you like to copy and paste, make sure you define your optimizer, model, and loss function somewhere before the start of the epoch loop.
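For instance, a minimal set of such definitions for a linear model like the one in the question might look like the following (the layer size and learning rate here are placeholders, not values taken from the question):

import torch

# hypothetical setup so the loop above runs; adjust sizes and hyperparameters to your data
model = torch.nn.Linear(D_sgd, 1)                          # D_sgd input features -> 1 output
lossfunction = torch.nn.MSELoss()                          # squared-error loss, as in the question
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)    # plain SGD with a placeholder step size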

With regards to your error, try using torch.from_numpy(np.random.randint(0,N,size=M)).long() instead of torch.LongTensor(np.random.randint(0,N,size=M)). I'm not sure if this will solve the error you are getting, but it will solve a future error.
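Plugged into the question's get_batch2, that suggestion would look roughly like the sketch below (same X, Y, M, dtype arguments as in the question; if X and Y are Variables on an older PyTorch, the index may additionally need to be wrapped in a Variable):

import numpy as np
import torch

def get_batch2(X, Y, M, dtype):
    N = X.size()[0]
    # build the index tensor via from_numpy, then cast it to long
    batch_indices = torch.from_numpy(np.random.randint(0, N, size=M)).long()
    if dtype == torch.cuda.FloatTensor: # move the indices to the GPU when training there
        batch_indices = batch_indices.cuda()
    return torch.index_select(X, 0, batch_indices), torch.index_select(Y, 0, batch_indices)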

Regarding "python - How to get mini-batches in PyTorch in a clean and efficient way?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/45113245/
