
torch - How to parallelize an RNN function in Pytorch with DataParallel


Here is an RNN model used for character-based language generation:

import torch
import torch.nn as nn
from torch.autograd import Variable  # this post predates the Variable/Tensor merge

class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, n_layers):
        super(RNN, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
        self.n_layers = n_layers

        # character embedding -> multi-layer GRU -> linear projection back to characters
        self.encoder = nn.Embedding(input_size, hidden_size)
        self.GRU = nn.GRU(hidden_size, hidden_size, n_layers, batch_first=True)
        self.decoder = nn.Linear(hidden_size, output_size)

    def forward(self, input, batch_size):
        self.init_hidden(batch_size)
        input = self.encoder(input)                         # (batch, seq) -> (batch, seq, hidden)
        output, self.hidden = self.GRU(input, self.hidden)
        output = self.decoder(output.view(batch_size, self.hidden_size))
        return output

    def init_hidden(self, batch_size):
        # hidden state is (n_layers, batch, hidden): the batch sits on the SECOND dimension
        self.hidden = Variable(torch.randn(self.n_layers, batch_size, self.hidden_size).cuda())
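For reference, a minimal single-GPU usage sketch of this model; the sizes here (n_chars = 100, batch_size = 64, sequence length 1) are hypothetical and not taken from the original post:

# Hypothetical sizes, chosen only for illustration.
n_chars, hidden_size, n_layers, batch_size = 100, 100, 2, 64

model = RNN(n_chars, hidden_size, n_chars, n_layers).cuda()

# One character index per sample, shape (batch_size, 1).
chunk = Variable(torch.LongTensor(batch_size, 1).random_(0, n_chars).cuda())

logits = model(chunk, batch_size)   # shape (batch_size, n_chars)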

I instantiate the model with DataParallel to split the batch of inputs across 4 GPUs:

net = torch.nn.DataParallel(RNN(n_chars, hidden_size, n_chars, n_layers)).cuda()

Here is the full code.

Unfortunately, DataParallel requires the inputs to have batch_size as the first dimension, but the GRU function expects the hidden tensor to have batch_size as the second dimension:

output, self.hidden = self.GRU(input, self.hidden)
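To make the collision concrete, here is a small sketch of the two shape conventions (plain tensors in modern PyTorch style rather than the post's Variable; the sizes are illustrative, matching the 16-sample per-GPU slices in the output below):

import torch
import torch.nn as nn

gru = nn.GRU(100, 100, 2, batch_first=True)

# With batch_first=True the *input* carries the batch on dimension 0,
# which is the dimension DataParallel splits by default ...
inp = torch.randn(16, 1, 100)   # (batch, seq, features)

# ... but the *hidden* state always carries the batch on dimension 1.
h0 = torch.randn(2, 16, 100)    # (n_layers, batch, hidden)

out, hn = gru(inp, h0)          # both batch sizes must agree: 16 == 16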

As written, the code throws the error below (note that the printouts show the encoder executing correctly on all 4 GPUs):

...
forward function: encoding input of shape: (16L, 1L)
forward function: encoding input of shape: (16L, 1L)
forward function: encoding input of shape: (16L, 1L)
forward function: encoding input of shape: (16L, 1L)
forward function: GRU processing input of shape: (16L, 1L, 100L)
forward function: GRU processing input of shape: (16L, 1L, 100L)
forward function: GRU processing input of shape: (16L, 1L, 100L)
forward function: GRU processing input of shape: (16L, 1L, 100L)

Traceback (most recent call last):
File "gru2.py", line 166, in <module>
output = net(c, batch_size)
File "/root/miniconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 206, in __call__
result = self.forward(*input, **kwargs)
File "/root/miniconda2/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 61, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/root/miniconda2/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 71, in parallel_apply
return parallel_apply(replicas, inputs, kwargs)
File "/root/miniconda2/lib/python2.7/site-packages/torch/nn/parallel/parallel_apply.py", line 45, in parallel_apply
raise output
RuntimeError: Expected hidden size (2, 16L, 100), got (2L, 64L, 100L)

Here the model has 2 layers, batch_size = 64, and hidden_size = 100 (each replica receives a 16-sample slice of the 64-sample batch, while init_hidden still builds the hidden state for the full batch of 64).

How can I parallelize the GRU operation in the forward function?

Best Answer

You can simply set the parameter dim=1, e.g.

net = torch.nn.DataParallel(RNN(n_chars, hidden_size, n_chars, n_layers), dim=1).cuda()
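dim is the dimension that DataParallel scatters the inputs along and gathers the outputs along (the default is 0). A hedged toy sketch of what dim=1 changes, using a hypothetical PrintShape module that is not part of the original answer (two or more GPUs assumed):

import torch
import torch.nn as nn

class PrintShape(nn.Module):
    def forward(self, x):
        print(x.shape)      # each replica prints the slice it received
        return x

x = torch.randn(8, 64, 100).cuda()                 # batch of 64 sits on dimension 1
net = nn.DataParallel(PrintShape(), dim=1).cuda()  # scatter/gather along dimension 1
y = net(x)   # each GPU sees roughly (8, 64 / n_gpus, 100); y is gathered back to (8, 64, 100)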

Regarding "torch - How to parallelize an RNN function in Pytorch with DataParallel", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/44595338/
