
deep-learning - How to fix "RuntimeError: Function AddBackward0 returned an invalid gradient at index 1 - expected type torch.FloatTensor but got torch.LongTensor"

Author: 行者123 · Updated: 2023-12-04 17:34:44

I ran into this error while running machine translation code.

RuntimeError                              Traceback (most recent call last)
in
      5 decoder = Decoder(len(out_vocab), embed_size, num_hiddens, num_layers,
      6                   attention_size, drop_prob)
----> 7 train(encoder, decoder, dataset, lr, batch_size, num_epochs)

in train(encoder, decoder, dataset, lr, batch_size, num_epochs)
     13         dec_optimizer.zero_grad()
     14         l = batch_loss(encoder, decoder, X, Y, loss)
---> 15         l.backward()
     16         enc_optimizer.step()
     17         dec_optimizer.step()

/usr/lib64/python3.6/site-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
    105                 products. Defaults to False.
    106         """
--> 107         torch.autograd.backward(self, gradient, retain_graph, create_graph)
    108
    109     def register_hook(self, hook):

/usr/lib64/python3.6/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
     91     Variable._execution_engine.run_backward(
     92         tensors, grad_tensors, retain_graph, create_graph,
---> 93         allow_unreachable=True)  # allow_unreachable flag
     94
     95

RuntimeError: Function AddBackward0 returned an invalid gradient at index 1 - expected type torch.FloatTensor but got torch.LongTensor

I think the error is in the batch_loss function, but I don't know why, and I can't fix it.

def batch_loss(encoder, decoder, X, Y, loss):
    batch_size = X.shape[0]
    enc_state = None
    enc_outputs, enc_state = encoder(X, enc_state)
    # Initialize the decoder's hidden state
    dec_state = decoder.begin_state(enc_state)
    # The decoder's input at the initial time step is BOS
    dec_input = torch.tensor([out_vocab.stoi[BOS]] * batch_size)
    # Use the mask variable to ignore the loss at positions whose label is the padding token PAD
    mask, num_not_pad_tokens = torch.ones(batch_size), 0
    l = torch.tensor([0])
    for y in Y.t():
        dec_output, dec_state = decoder(dec_input, dec_state, enc_outputs)
        l = l + (mask * loss(dec_output, y)).sum()
        dec_input = y  # teacher forcing
        num_not_pad_tokens += mask.sum().item()
        # Once EOS appears, all following tokens in the sequence are PAD; zero the mask at those positions
        mask = mask * (y != out_vocab.stoi[EOS]).float()
    return l / num_not_pad_tokens

def train(encoder, decoder, dataset, lr, batch_size, num_epochs):
    d2lt.params_init(encoder, init=nn.init.xavier_uniform_)
    d2lt.params_init(decoder, init=nn.init.xavier_uniform_)

    enc_optimizer = optim.Adam(encoder.parameters(), lr=lr)
    dec_optimizer = optim.Adam(decoder.parameters(), lr=lr)
    loss = nn.CrossEntropyLoss(reduction='none')
    data_iter = tdata.DataLoader(dataset, batch_size, shuffle=True)
    for epoch in range(num_epochs):
        l_sum = 0.0
        for X, Y in data_iter:
            enc_optimizer.zero_grad()
            dec_optimizer.zero_grad()
            l = batch_loss(encoder, decoder, X, Y, loss)
            l.backward()
            enc_optimizer.step()
            dec_optimizer.step()
            l_sum += l.item()
        if (epoch + 1) % 10 == 0:
            print("epoch %d, loss %.3f" % (epoch + 1, l_sum / len(data_iter)))
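As background for the masking trick in batch_loss: nn.CrossEntropyLoss(reduction='none') returns one loss value per batch element rather than a single averaged scalar, which is what lets the mask zero out PAD positions elementwise before summing. A minimal sketch, with made-up shapes and token ids (vocab size 5, batch of 3, PAD assumed to be id 0):

import torch
from torch import nn

loss = nn.CrossEntropyLoss(reduction='none')

dec_output = torch.randn(3, 5)          # (batch_size, vocab_size) logits for one time step
y = torch.tensor([2, 4, 0])             # gold token ids; pretend id 0 is PAD
mask = torch.tensor([1.0, 1.0, 0.0])    # third sequence already emitted EOS

per_example = loss(dec_output, y)       # shape (3,): one loss value per sequence
step_loss = (mask * per_example).sum()  # the PAD position contributes nothing
print(per_example.shape, step_loss.item())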

Looking forward to a helpful reply.

Best Answer

Thanks to Proyag. Simply replacing l = torch.tensor([0]) with l = torch.tensor([0], dtype=torch.float) solved my problem: torch.tensor([0]) defaults to torch.int64, so the accumulated loss becomes a LongTensor, and backward() then produces a Long gradient where autograd expects a Float one.
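To see why the dtype matters, here is a minimal sketch of the same accumulation pattern (the tensor w and the loop bounds are made up for illustration; on an old PyTorch build like the one in the traceback, a Long accumulator breaks backward(), while recent releases promote dtypes more leniently):

import torch

w = torch.randn(4, requires_grad=True)

# torch.tensor([0]) defaults to torch.int64 (a LongTensor). Using it as the
# loss accumulator mixes a Long tensor into the autograd graph; a float
# accumulator keeps every node in the graph a FloatTensor.
l = torch.tensor([0], dtype=torch.float)
for step in range(3):
    step_loss = (w * (step + 1)).sum()  # FloatTensor loss term
    l = l + step_loss                   # AddBackward0 node, all-float

l.backward()                            # gradients flow back as floats
print(w.grad)                           # tensor([6., 6., 6., 6.]) = (1 + 2 + 3) per element

An alternative with the same effect is l = torch.zeros(1), whose default dtype is float32.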

Regarding deep-learning - How to fix "RuntimeError: Function AddBackward0 returned an invalid gradient at index 1 - expected type torch.FloatTensor but got torch.LongTensor", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57142401/
