python-3.x - BERT pre-trained model gives random output each time


I was trying to add an additional layer after the Hugging Face BERT transformer, so I used BertForSequenceClassification inside my nn.Module network. However, the model gives me random output compared to loading the model directly.

Model 1:

from transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels = 5) # as we have 5 classes

import torch
from transformers import BertTokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

# `texts` is a list of input strings defined elsewhere
input_ids = torch.tensor(tokenizer.encode(texts[0], add_special_tokens=True, max_length = 512)).unsqueeze(0) # Batch size 1

print(model(input_ids))

Output:

(tensor([[ 0.3610, -0.0193, -0.1881, -0.1375, -0.3208]],
grad_fn=<AddmmBackward>),)

Model 2:

import torch
from torch import nn

class BertClassifier(nn.Module):
    def __init__(self):
        super(BertClassifier, self).__init__()
        self.bert = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels = 5)
        # as we have 5 classes

        # we want our output as a probability, so in evaluation mode we'll pass the logits to a softmax layer
        self.softmax = torch.nn.Softmax(dim = 1) # last dimension

    def forward(self, x):
        print(x.shape)
        x = self.bert(x)  # returns a tuple; the logits are x[0]

        if self.training == False: # in evaluation mode
            pass
            #x = self.softmax(x)

        print(x[0].shape)  # logits shape
        return x

# create our model

bertclassifier = BertClassifier()

print(bertclassifier(input_ids))

Output:

torch.Size([1, 512])
torch.Size([1, 5])
(tensor([[-0.3729, -0.2192,  0.1183,  0.0778, -0.2820]],
        grad_fn=<AddmmBackward>),)
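
Note that the commented-out softmax in Model 2 would fail if re-enabled, because self.bert(x) returns a tuple rather than a tensor. A minimal sketch of applying softmax to the unpacked logits instead:

import torch

logits = bertclassifier(input_ids)[0]                 # unpack the logits from the returned tuple
probs = torch.nn.functional.softmax(logits, dim = 1)  # five class probabilities, summing to 1
print(probs)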

They should be the same model, right? I found a similar issue here, but with no reasonable explanation: https://github.com/huggingface/transformers/issues/2770

  1. Does BERT have some randomly initialized parameters? If so, how can I get reproducible output?

  2. Why do the two models give different outputs? Am I doing something wrong?

Best Answer

The reason is the random initialization of BERT's classifier layer. If you print the model, you will see

    (pooler): BertPooler(
      (dense): Linear(in_features=768, out_features=768, bias=True)
      (activation): Tanh()
    )
  )
  (dropout): Dropout(p=0.1, inplace=False)
  (classifier): Linear(in_features=768, out_features=5, bias=True)
)

There is a classifier layer at the end, added on top of bert-base. You are expected to train that layer for your downstream task, as sketched below.
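
As a minimal sketch of what that fine-tuning might look like (the single-example batch and the label value here are hypothetical, purely for illustration; `texts` is as in the question):

import torch
from transformers import BertForSequenceClassification, BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels = 5)
model.train()

optimizer = torch.optim.AdamW(model.parameters(), lr = 2e-5)

input_ids = torch.tensor(tokenizer.encode(texts[0], add_special_tokens=True, max_length = 512)).unsqueeze(0)
label = torch.tensor([2])  # hypothetical gold class in [0, 5)

# when labels are passed, the loss is the first element of the returned tuple
outputs = model(input_ids, labels = label)
loss = outputs[0]
loss.backward()
optimizer.step()
optimizer.zero_grad()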

If you want more insight:

model, li = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels = 5, output_loading_info=True) # as we have 5 classes
print(li)
{'missing_keys': ['classifier.weight', 'classifier.bias'], 'unexpected_keys': ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias'], 'error_msgs': []}

You can see that classifier.weight and classifier.bias are among the missing_keys, so they are randomly initialized on every call to BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels = 5), which is why the two models (and any two runs) give different outputs.
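
As for question 1: one way to make that initialization reproducible is to seed PyTorch's RNG before loading. A minimal sketch (note that model.eval() is also needed at inference time, since dropout is otherwise active and adds its own randomness to each forward pass):

import torch
from transformers import BertForSequenceClassification

torch.manual_seed(0)  # fix the RNG used for the classifier's random init
model_a = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels = 5)

torch.manual_seed(0)
model_b = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels = 5)

# both classifier heads now start from identical weights
print(torch.equal(model_a.classifier.weight, model_b.classifier.weight))  # True

model_a.eval()  # disable dropout so repeated forward passes match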

Regarding "python-3.x - BERT pre-trained model gives random output each time", the original question is on Stack Overflow: https://stackoverflow.com/questions/61690689/
