python - I can't use someone else's model to generate my inputs in a generator. How do I fix this?

I'm trying to train a neural network on the SQuAD v1.1 dataset using a pretrained BERT model. I was advised to first get the BERT model's outputs and then feed those as inputs to my own neural network. Because the data is so large, I believe I need a generator that my network can then fit on:

# @title Preparation
!pip install -q keras-bert
!wget -q https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-12_H-768_A-12.zip
!unzip -o uncased_L-12_H-768_A-12.zip
import os
pretrained_path = 'uncased_L-12_H-768_A-12'
config_path = os.path.join(pretrained_path, 'bert_config.json')
checkpoint_path = os.path.join(pretrained_path, 'bert_model.ckpt')
vocab_path = os.path.join(pretrained_path, 'vocab.txt')
# TF_KERAS must be added to environment variables in order to use TPU
os.environ['TF_KERAS'] = '1'
import codecs
from keras_bert import load_trained_model_from_checkpoint

# Build the vocabulary: each line of vocab.txt is a token, mapped to its line index
token_dict = {}
with codecs.open(vocab_path, 'r', 'utf8') as reader:
    for line in reader:
        token = line.strip()
        token_dict[token] = len(token_dict)

model = load_trained_model_from_checkpoint(config_path, checkpoint_path)

import numpy as np
from keras_bert import Tokenizer

tokenizer = Tokenizer(token_dict)

def tokenize(text):
    tokens = tokenizer.tokenize(text)
    indices, segments = tokenizer.encode(first=text, max_len=512)
    return indices, segments

def feature_extraction(texts):
    # Run each text through BERT and collect the (512, 768) output features
    return_values = []
    for text_ in texts:
        try:
            text_.split(" ")
        except AttributeError:
            raise TypeError("Expected array of strings.")
        try:
            indices, segments = tokenize(text_)
            predicts = model.predict([np.array([indices] * 8), np.array([segments] * 8)])[0]
            return_values.append(predicts)
        except ValueError as v:
            print(v)
    return_values = np.array(return_values)
    return return_values

print(feature_extraction(text_array).shape)  # text_array is defined elsewhere

def batch_generator(dataframe, batch_size):
    while True:
        batch = dataframe.sample(n=batch_size)
        try:
            batch_features = feature_extraction(batch["question"].values)
        except ValueError as v:
            print("Oops, I'm getting a ValueError for batch_features.")
            print(v)
        try:
            batch_targets = batch["answer_start"]
        except ValueError as v:
            print("Oops, I'm getting a ValueError for batch_targets.")
            print(v)

        yield batch_features, batch_targets

This works when I use this test code:

import pandas as pd

testDataframe = pd.DataFrame({"question": ["Does she sell seashells by the seashore?"],
                              "answer": ["She sells seashells by the seashore"],
                              "answer_start": [0]})
for x, y in batch_generator(testDataframe, 1):
    print(x)
    print(y)
    break

Output:

[[[-0.11251544 -0.09277309  0.04996187 ... -0.43535435  0.23852573  0.3206718 ]
  [ 0.35688528  0.43881682 -0.1390086  ... -0.32458037  0.64422214 -0.11743623]
  [ 0.6213926  -0.9945548   0.07564903 ... -0.87357795  0.2069801  -0.25303575]
  ...
  [-0.06796454 -0.24819699 -0.25508618 ...  0.20477912  0.36703664  0.04691853]
  [ 0.15030818 -0.05989693  0.17198643 ...  0.19960165  0.0324061  -0.31075317]
  [ 0.05091426 -0.14167279  0.18194658 ...  0.12112649  0.05029908 -0.15253511]]]
0    0
Name: answer_start, dtype: int64

I create and compile my neural network and its inputs like this:

import json
import re
import pandas as pd
# regex = re.compile(r'\W+')

def readFile(filename):
    with open(filename) as file:
        fields = []
        JSON = json.loads(file.read())
        articles = []
        for article in JSON["data"]:
            articleTitle = article["title"]
            article_body = []
            for paragraph in article["paragraphs"]:
                paragraphContext = paragraph["context"]
                article_body.append(paragraphContext)
                for qas in paragraph["qas"]:
                    question = qas["question"]
                    answer = qas["answers"][0]
                    fields.append({"question": question,
                                   "answer_text": answer["text"],
                                   "answer_start": answer["answer_start"],
                                   "paragraph_context": paragraphContext,
                                   "article_title": articleTitle})
            article_body = "\n".join(article_body)
            article = {"title": articleTitle, "body": article_body}
            articles.append(article)
        fields = pd.DataFrame(fields)
        # fields["question"] = fields["question"].str.replace(regex, " ")
        assert not (fields["question"].str.contains("catalanswhat").any())
        # fields["paragraph_context"] = fields["paragraph_context"].str.replace(regex, " ")
        # fields["answer_text"] = fields["answer_text"].str.replace(regex, " ")
        assert not (fields["paragraph_context"].str.contains("catalanswhat").any())
        fields["article_title"] = fields["article_title"].str.replace("_", " ")
        assert not (fields["article_title"].str.contains("catalanswhat").any())
        return fields, articles

trainingData, trainingArticles = readFile("train-v1.1.json")

from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import ModelCheckpoint

answers_network = Sequential()
answers_network.add(Dense(32, input_shape=(512, 768)))
answers_network.summary()
answers_network.compile("rmsprop", "categorical_crossentropy")
answers_network_checkpoint = ModelCheckpoint('answers_network-rnn-best.h5', verbose=1,
                                             monitor='val_loss', save_best_only=True, mode='auto')
answers_network.fit_generator(batch_generator(trainingData[["question", "paragraph_context", "answer_start"]], 100),
                              steps_per_epoch=8)

This fails with the error:

Tensor Input-Token:0, specified in either feed_devices or fetch_devices was not found in the Graph

Now, Input-Token is the name of one of the BERT model's input layers.

I think TensorFlow is hinting that the BERT model's graph is not the same as my model's graph.

Apparently the BERT model uses custom layers and activation functions, so making a deep copy of the model is probably not the best approach.

What should I do?

EDIT: A copy of my train-v1.1.json dataset is available here: https://drive.google.com/file/d/1qQbrQnH3WkibtXIHFA88gJuGESvyz-Ag/view?usp=sharing

Best Answer

Option 1

Generate the data first and save it. Then later, train with the saved data.

That would look something like this:

features = feature_extraction(text_array)
np.save('features.npy', features) #or not...

Then later, fit on this array:

features = np.load('features.npy')    
new_model.fit(features, targets, ...)

If the data is too big for your memory (the model itself is not a problem, just set a suitable batch_size, both for your model and for the BERT model when generating features), meaning the whole dataset won't fit in memory at once, then you can save each batch separately:

for i in range(batches):
    batch = text_array[i * batch_size : (i + 1) * batch_size]
    features = feature_extraction(batch)
    np.save('batch' + str(i) + '.npy', features)
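
The answer stops at the features, but the generator below also needs labels, so the targets have to be saved batch-by-batch the same way. A minimal sketch of that companion step, assuming a hypothetical targets_array (e.g. the answer_start values) aligned with text_array:

for i in range(batches):
    # targets_array is an assumed name, not from the original answer
    batch_targets = targets_array[i * batch_size : (i + 1) * batch_size]
    np.save('targets' + str(i) + '.npy', batch_targets)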

Then your generator loads these batches:

while True:
    for i in range(batches):
        batch = np.load('batch' + str(i) + '.npy')
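
As written, this loop only loads the features; a runnable generator also has to yield them together with the matching targets. A minimal sketch, assuming the hypothetical per-batch targets files from above were saved:

def saved_batch_generator(batches):
    # Cycle over the saved batches forever, as Keras generators are expected to do
    while True:
        for i in range(batches):
            features = np.load('batch' + str(i) + '.npy')
            targets = np.load('targets' + str(i) + '.npy')  # assumed saved above
            yield features, targets

Training would then use it like any other generator, e.g. new_model.fit_generator(saved_batch_generator(batches), steps_per_epoch=batches).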

Option 2

Make both models use the same graph by creating a single big model:

bertInputs = Input(shape_for_bert_input)
bertOutputs = bert_model(bertInputs)
yourOutputs = your_model(bertOutputs)

fullModel = Model(bertInputs, yourOutputs)

Train with a generator directly from the dataframe; the BERT predictions will happen automatically as part of the model.
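
With the combined model, the generator no longer calls model.predict at all; it only has to yield the raw BERT inputs (token indices and segments) together with the targets. A minimal sketch of such a generator, reusing the tokenize helper from the question (raw_batch_generator is a hypothetical name):

import numpy as np

def raw_batch_generator(dataframe, batch_size):
    # Yields ([token_indices, segments], targets) straight from the dataframe;
    # the BERT forward pass now happens inside fullModel itself.
    while True:
        batch = dataframe.sample(n=batch_size)
        indices, segments = [], []
        for question in batch["question"].values:
            ind, seg = tokenize(question)  # tokenize() as defined in the question
            indices.append(ind)
            segments.append(seg)
        yield [np.array(indices), np.array(segments)], batch["answer_start"].values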


Or, if you want to see all the layers explicitly in the summary:

bertOutputs = bert_model.output
yourOutputs = Dense(....)(bertOutputs)

fullModel = Model(bert_model.input, yourOutputs)
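
Either way, the combined model compiles and trains like any single Keras model. A hedged usage sketch (the optimizer and loss are copied from the question's own compile call, not prescribed by the answer):

fullModel.compile("rmsprop", "categorical_crossentropy")
fullModel.fit_generator(raw_batch_generator(trainingData, batch_size=100),
                        steps_per_epoch=8)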

Regarding "python - I can't use someone else's model to generate my inputs in a generator. How do I fix this?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/57703276/
