
machine-learning - Predicting a missing word in a sentence - Natural language processing model

Repost. Author: 行者123. Updated: 2023-11-30 08:29:51

I have the following sentence:

I want to ____ the car because it is cheap.

I want to use an NLP model to predict the missing word. Which NLP model should I use? Thanks.

Best answer

TL;DR

Try this: https://github.com/huggingface/pytorch-pretrained-BERT

First, install it:

pip install -U pytorch-pretrained-bert

Then you can use BERT's "masked language model", e.g.

import torch
from pytorch_pretrained_bert import BertTokenizer, BertModel, BertForMaskedLM

# OPTIONAL: if you want to have more information on what's happening, activate the logger as follows
import logging
logging.basicConfig(level=logging.INFO)

# Load pre-trained model tokenizer (vocabulary)
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

text = '[CLS] I want to [MASK] the car because it is cheap . [SEP]'
tokenized_text = tokenizer.tokenize(text)
indexed_tokens = tokenizer.convert_tokens_to_ids(tokenized_text)

# Create the segments tensors.
segments_ids = [0] * len(tokenized_text)

# Convert inputs to PyTorch tensors
tokens_tensor = torch.tensor([indexed_tokens])
segments_tensors = torch.tensor([segments_ids])

# Load pre-trained model (weights)
model = BertForMaskedLM.from_pretrained('bert-base-uncased')
model.eval()

# Predict all tokens
with torch.no_grad():
    predictions = model(tokens_tensor, segments_tensors)

masked_index = tokenized_text.index("[MASK]")

predicted_index = torch.argmax(predictions[0, masked_index]).item()
predicted_token = tokenizer.convert_ids_to_tokens([predicted_index])[0]

print(predicted_token)

[out]:

buy
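
The code above keeps only the single highest-scoring token. To look at several candidates, one can rank the scores at the masked position instead of taking the argmax. The following is a minimal sketch in plain Python that uses a made-up five-word vocabulary and made-up scores (both hypothetical, not real BERT output); with the real model, the scores would be predictions[0, masked_index] and torch.topk could do the ranking.

```python
import math

def top_k_tokens(scores, vocab, k=3):
    """Return the k highest-scoring (token, probability) pairs."""
    # Softmax turns raw scores into probabilities that sum to 1.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort vocabulary indices by probability, highest first.
    ranked = sorted(range(len(vocab)), key=lambda i: probs[i], reverse=True)
    return [(vocab[i], probs[i]) for i in ranked[:k]]

# Hypothetical vocabulary and scores, NOT real BERT output.
toy_vocab = ["buy", "sell", "drive", "wash", "eat"]
toy_scores = [4.0, 2.5, 2.0, 0.5, -1.0]

for token, prob in top_k_tokens(toy_scores, toy_vocab):
    print(f"{token}\t{prob:.3f}")
```

Seeing the runners-up is often useful for a fill-in-the-blank task, since more than one word can fit the sentence.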

To truly understand why you need [CLS], [MASK] and the segment tensors, please read the paper carefully: https://arxiv.org/abs/1810.04805
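
As a small illustration of the input format used in the example above, the sketch below turns a sentence with a blank into the bracketed string that is passed to the tokenizer. The helper names and the "____" blank convention are assumptions made for illustration and are not part of pytorch-pretrained-BERT. The all-zero segment ids mirror the single-sentence case in the code above.

```python
def to_bert_mask_input(sentence, blank="____"):
    """Wrap a one-blank sentence in the [CLS] ... [SEP] markers."""
    if sentence.count(blank) != 1:
        raise ValueError("expected exactly one blank in the sentence")
    # [MASK] marks the position whose token the model should predict.
    masked = sentence.replace(blank, "[MASK]")
    # [CLS] opens the input and [SEP] closes the (single) sentence.
    return f"[CLS] {masked} [SEP]"

def single_sentence_segment_ids(tokens):
    """With one sentence, every token belongs to segment 0."""
    return [0] * len(tokens)

text = to_bert_mask_input("I want to ____ the car because it is cheap.")
print(text)
# Whitespace split is only a stand-in for tokenizer.tokenize(text).
print(single_sentence_segment_ids(text.split()))
```

With two sentences (for example, in a sentence-pair task), the tokens of the second sentence would get segment id 1 instead of 0.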

If you're lazy, you can read this nice blog post by Lilian Weng instead: https://lilianweng.github.io/lil-log/2019/01/31/generalized-language-models.html

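Besides BERT, there are many other models that can perform the fill-in-the-blank task. Do look at the other models in the pytorch-pretrained-BERT repository, but more importantly, dive deeper into the task of "language modeling", i.e. the task of predicting the next word given a history.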
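
To make the "predict the next word given a history" idea concrete, here is a minimal sketch of a count-based bigram language model in plain Python. It is a deliberately simple stand-in, not BERT and not any model from the repository. The three-sentence corpus and the function names are made up for illustration, and the "history" is reduced to just the previous word.

```python
from collections import Counter, defaultdict

def train_bigram_counts(sentences):
    """Count how often each word follows each previous word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, prev_word):
    """Return the most frequent follower of prev_word, or None if unseen."""
    followers = counts.get(prev_word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Hypothetical toy corpus, for illustration only.
toy_corpus = [
    "i want to buy the car",
    "i want to sell the car",
    "i want to buy a house",
]
bigram_counts = train_bigram_counts(toy_corpus)
print(predict_next(bigram_counts, "to"))
```

Unlike this left-to-right model, BERT's masked language model looks at the words on both sides of the blank, which is why it suits the fill-in-the-blank question here.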

For "machine-learning - Predicting a missing word in a sentence - Natural language processing model", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/54978443/
