python - InvalidArgumentError: Received a label value of 8825 which is outside the valid range of [0, 8825) SEQ2SEQ model

Reposted by: 太空宇宙 | Updated: 2023-11-03 21:48:06

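I have been trying to build an RNN with the Seq2Seq model from the Udemy course DeepLearning_NLP_Chatbot, following the instructor step by step, but training fails with this error: InvalidArgumentError: Received a label value of 8825 which is outside the valid range of [0, 8825). The dataset is here.

Here is the data preprocessing code: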

# Building a Chatbot with Deep NLP.
# Importing the libraries.
import numpy as np
import tensorflow as tf
import re
import time

# ---Data Processing---#
#------------------------#

# Importing the dataset.
lines = open('movie_lines.txt', encoding = 'utf-8', errors = 'ignore').read().split('\n')
conversations = open('movie_conversations.txt', encoding = 'utf-8', errors = 'ignore').read().split('\n')

# Creating a dictionary that maps each line id to its text
id2line = {}
for line in lines:
    _line = line.split(' +++$+++ ')
    if len(_line) == 5:
        id2line[_line[0]] = _line[4]

# Creating a list of all conversations
conversations_ids = []
for conversation in conversations[:-1]:
    _conversation = conversation.split(' +++$+++ ')[-1][1:-1].replace("'","").replace(" ","")
    conversations_ids.append(_conversation.split(','))

# Getting the questions and the answers separately
questions = []
answers = []
for conversation in conversations_ids:
    for i in range( len(conversation) - 1):
        questions.append(id2line[conversation[i]])
        answers.append(id2line[conversation[i+1]])

# Doing a first cleaning of the text
def clean_text(text):
    text = text.lower()
    text = re.sub(r"i'm", "i am", text)
    text = re.sub(r"he's", "he is", text)
    text = re.sub(r"she's", "she is", text)
    text = re.sub(r"that's", "that is", text)
    text = re.sub(r"what's", "what is", text)
    text = re.sub(r"where's", "where is", text)
    text = re.sub(r"\'ll", " will", text)
    text = re.sub(r"\'ve", " have", text)
    text = re.sub(r"\'re", " are", text)
    text = re.sub(r"\'d", " would", text)
    text = re.sub(r"won't", "will not", text)
    text = re.sub(r"can't", "cannot", text)
    text = re.sub(r"[-()\"#/@;:<>{}+=~|.?,]", "", text)
    return text

# Cleaning the questions
clean_questions = []
for question in questions:
    clean_questions.append(clean_text(question))

# Cleaning the answers
clean_answers = []
for answer in answers:
    clean_answers.append(clean_text(answer))

# Creating a dictionary that maps each word to its number of occurrences.
word2count = {}
for question in clean_questions:
    for word in question.split():
        if word not in word2count:
            word2count[word] = 1
        else:
            word2count[word] += 1

for answer in clean_answers:
    for word in answer.split():
        if word not in word2count:
            word2count[word] = 1
        else:
            word2count[word] += 1

# Creating two dictionaries that map the question words and the answer words to a unique integer.
threshold = 20
questionsword2int = {}
word_number = 0
for word, count in word2count.items():
    if count >= threshold:
        questionsword2int[word] = word_number
        word_number += 1

answersword2int = {}
word_number = 0
for word, count in word2count.items():
    if count >= threshold:
        answersword2int[word] = word_number
        word_number += 1

# Adding the last tokens to these two dictionaries.
tokens = ['<PAD>', '<EOS>', '<SOS>', '<OUT>']
for token in tokens:
    questionsword2int[token] = len(questionsword2int) + 1
for token in tokens:
    answersword2int[token] = len(answersword2int) + 1

# Creating the inverse dictionary of the answersword2int dictionary.
answersint2word = {w_i: w for w, w_i in answersword2int.items()}

# Adding the End Of String token to the end of every answer.
for i in range(len(clean_answers)):
    clean_answers[i] += ' <EOS>'

# Translating all the questions and the answers into integers.
# and Replacing all the words that were filtered out to <OUT> token.
questions_into_int = []
for question in clean_questions:
    ints = []
    for word in question.split():
        if word not in questionsword2int:
            ints.append(questionsword2int['<OUT>'])
        else:
            ints.append(questionsword2int[word])
    questions_into_int.append(ints)

answers_into_int = []
for answer in clean_answers:
    ints = []
    for word in answer.split():
        if word not in answersword2int:
            ints.append(answersword2int['<OUT>'])
        else:
            ints.append(answersword2int[word])
    answers_into_int.append(ints)

# Sorting questions and answers by the length of the questions
sorted_clean_questions = []
sorted_clean_answers = []
for length in range(1, 25 + 1):
    for i in enumerate(questions_into_int):
        if length == len(i[1]):
            sorted_clean_questions.append(questions_into_int[i[0]])
            sorted_clean_answers.append(answers_into_int[i[0]])

Here is the code that builds the seq2seq model:

# --- Building SEQ2SEQ Model---#
#------------------------------#

# Creating placeholder for the inputs and the targets:
def model_inputs():
    inputs = tf.placeholder(tf.int32, [None, None], name = 'input')
    targets = tf.placeholder(tf.int32, [None, None], name = 'target')
    lr = tf.placeholder(tf.float32, name = 'learning_rate')
    keep_prob = tf.placeholder(tf.float32, name = 'keep_prob')
    return inputs, targets, lr, keep_prob

# Preprocessing targets:
def preprocess_targets(targets, word2int, batch_size):
    left_side = tf.fill([batch_size, 1], word2int['<SOS>'])
    right_side = tf.strided_slice(targets, [0,0], [batch_size, -1], [1,1])
    preprocessed_targets = tf.concat([left_side, right_side], 1)
    return preprocessed_targets

# Creating the Encoder RNN Layer:
def encoder_rnn_layer(rnn_inputs, rnn_size, num_layers, keep_prob, sequence_length):
    lstm = tf.contrib.rnn.BasicLSTMCell(rnn_size)
    lstm_dropout = tf.contrib.rnn.DropoutWrapper(lstm, input_keep_prob = keep_prob)
    encoder_cell = tf.contrib.rnn.MultiRNNCell([lstm_dropout] * num_layers)
    _, encoder_state = tf.nn.bidirectional_dynamic_rnn(cell_fw= encoder_cell,
                                                       cell_bw= encoder_cell,
                                                       sequence_length= sequence_length,
                                                       inputs= rnn_inputs,
                                                       dtype= tf.float32)
    return encoder_state

# Decoding the Training Set:
def decode_training_set(encoder_state, decoder_cell, decoder_embedded_input, sequence_length, decoding_scope, output_function, keep_prob, batch_size):
    attention_states = tf.zeros([batch_size, 1, decoder_cell.output_size])
    attention_keys, attention_values, attention_score_function, attention_construct_function = tf.contrib.seq2seq.prepare_attention(attention_states, attention_option='bahdanau', num_units=decoder_cell.output_size)
    training_decoder_function = tf.contrib.seq2seq.attention_decoder_fn_train(encoder_state[0],
                                                                              attention_keys,
                                                                              attention_values, 
                                                                              attention_score_function,
                                                                              attention_construct_function,
                                                                              name= "attn_dec_train")
    decoder_output, decoder_final_state, decoder_final_context_state = tf.contrib.seq2seq.dynamic_rnn_decoder(decoder_cell,
                                                                                                              training_decoder_function,
                                                                                                              decoder_embedded_input,
                                                                                                              sequence_length,
                                                                                                              scope=decoding_scope)
    decoder_output_dropout = tf.nn.dropout(decoder_output, keep_prob)
    return output_function(decoder_output_dropout)

# Decoding the Test/Validation Set:
def decode_test_set(encoder_state, decoder_cell, decoder_embeddings_matrix, sos_id, eos_id, maximum_length, num_words, decoding_scope, output_function, keep_prob, batch_size):
    attention_states = tf.zeros([batch_size, 1, decoder_cell.output_size])
    attention_keys, attention_values, attention_score_function, attention_construct_function = tf.contrib.seq2seq.prepare_attention(attention_states, attention_option='bahdanau', num_units=decoder_cell.output_size)
    test_decoder_function = tf.contrib.seq2seq.attention_decoder_fn_inference(output_function,
                                                                              encoder_state[0],
                                                                              attention_keys,
                                                                              attention_values,
                                                                              attention_score_function,
                                                                              attention_construct_function,
                                                                              decoder_embeddings_matrix,
                                                                              sos_id,
                                                                              eos_id,
                                                                              maximum_length,
                                                                              num_words,
                                                                              name= "attn_dec_inf")
    test_predictions, decoder_final_state, decoder_final_context_state = tf.contrib.seq2seq.dynamic_rnn_decoder(decoder_cell,
                                                                                                                test_decoder_function,
                                                                                                                scope=decoding_scope)
    return test_predictions

# Creating the Decoder RNN:
def decoder_rnn(decoder_embedded_input, decoder_embeddings_matrix, encoder_state, num_words,sequence_length,rnn_size, num_layers, word2int, keep_prob, batch_size):
    with tf.variable_scope("decoding") as decoding_scope:
        lstm = tf.contrib.rnn.BasicLSTMCell(rnn_size)
        lstm_dropout = tf.contrib.rnn.DropoutWrapper(lstm, input_keep_prob=keep_prob)
        decoder_cell = tf.contrib.rnn.MultiRNNCell([lstm_dropout] * num_layers)
        weights = tf.truncated_normal_initializer(stddev= 0.1)
        biases = tf.zeros_initializer()
        output_function = lambda x : tf.contrib.layers.fully_connected(x,
                                                                       num_words,
                                                                       None,
                                                                       scope=decoding_scope,
                                                                       weights_initializer= weights,
                                                                       biases_initializer= biases)
        training_predictions = decode_training_set(encoder_state,
                                                   decoder_cell,
                                                   decoder_embedded_input,
                                                   sequence_length,
                                                   decoding_scope,
                                                   output_function,
                                                   keep_prob,
                                                   batch_size)
        decoding_scope.reuse_variables()
        test_predictions = decode_test_set(encoder_state,
                                           decoder_cell,
                                           decoder_embeddings_matrix,
                                           word2int['<SOS>'],
                                           word2int['<EOS>'],
                                           sequence_length - 1,
                                           num_words,
                                           decoding_scope,
                                           output_function, 
                                           keep_prob,
                                           batch_size)
        return training_predictions, test_predictions

# Building SEQ2SEQ Model:
def seq2seq_model(inputs, targets, keep_prob, batch_size, sequence_length, answers_num_words, questions_num_words, encoder_embedding_size, decoder_embedding_size, rnn_size, num_layers, questionsword2int):
    encoder_embedded_input = tf.contrib.layers.embed_sequence(inputs,
                                                              answers_num_words + 1,
                                                              encoder_embedding_size,
                                                              initializer=tf.random_uniform_initializer(0,1))
    encoder_state = encoder_rnn_layer(encoder_embedded_input,
                                      rnn_size,
                                      num_layers,
                                      keep_prob,
                                      sequence_length)
    preprocessed_targets = preprocess_targets(targets, questionsword2int, batch_size)
    decoder_embeddings_matrix = tf.Variable(tf.random_uniform([questions_num_words + 1, decoder_embedding_size], 0, 1))
    decoder_embedded_input = tf.nn.embedding_lookup(decoder_embeddings_matrix, preprocessed_targets)
    training_predictions, test_predictions = decoder_rnn(decoder_embedded_input,
                                                         decoder_embeddings_matrix,
                                                         encoder_state,
                                                         questions_num_words,
                                                         sequence_length,
                                                         rnn_size,
                                                         num_layers,
                                                         questionsword2int,
                                                         keep_prob,
                                                         batch_size)
    return training_predictions, test_predictions

Here is the training code:

# --- Training SEQ2SEQ Model---#
#------------------------------#

# Setting the Hyperparameters:

epochs = 100
batch_size = 64
rnn_size = 512
num_layers = 3
encoding_embedding_size = 512
decoding_embedding_size = 512
learning_rate = 0.01
min_learning_rate = 0.0001
learning_rate_decay = 0.9
keep_probability = 0.5

# Defining a Session:
tf.reset_default_graph()
session = tf.InteractiveSession()

# Loading Model Input Function:
inputs, targets, lr, keep_prob = model_inputs()

# Setting the Sequence Length:
sequence_length = tf.placeholder_with_default(25,None, name='sequence_length')

# Getting the Shape of the Input Tensor:
input_shape = tf.shape(inputs)

# Getting the Test and Training Predictions:
traning_predictions, test_predictions = seq2seq_model(tf.reverse(inputs, [-1]), 
                                                                 targets,
                                                                 keep_prob,
                                                                 batch_size, 
                                                                 sequence_length,
                                                                 len(answersword2int),
                                                                 len(questionsword2int),
                                                                 encoding_embedding_size,
                                                                 decoding_embedding_size,
                                                                 rnn_size,
                                                                 num_layers,
                                                                 questionsword2int)

# Setting Up the Loss Error, The Optimizer and Gradient Clipping.
with tf.name_scope("optimization"):
    loss_error = tf.contrib.seq2seq.sequence_loss(traning_predictions,
                                                  targets,
                                                  tf.ones([input_shape[0], sequence_length]))
    optimizer = tf.train.AdamOptimizer(learning_rate)
    gradients = optimizer.compute_gradients(loss_error)
    clipped_gradients = [(tf.clip_by_value(grad_tensor, -5., 5.), grad_variable) for grad_tensor, grad_variable in gradients if grad_tensor is not None]
    optimizer_gradient_clipping = optimizer.apply_gradients(clipped_gradients)

# Padding the Sequences With the <PAD> Token:
def apply_padding(batch_of_sequences, word2int):
    max_sequence_length = max([len(sequence) for sequence in batch_of_sequences])
    return [sequence + [word2int['<PAD>']] * (max_sequence_length - len(sequence)) for sequence in batch_of_sequences]

# Splitting The Data Into Batches of Questions and Answers:
def split_into_batches (questions, answers, batch_size):
    for batch_index in range(0, len(questions) // batch_size):
        start_index = batch_index * batch_size
        questions_in_batch = questions[start_index: start_index + batch_size]
        answers_in_batch = answers[start_index: start_index + batch_size]
        padded_questions_in_batch = np.array(apply_padding(questions_in_batch, questionsword2int))
        padded_answers_in_batch = np.array(apply_padding(answers_in_batch, answersword2int))
        yield padded_questions_in_batch, padded_answers_in_batch

# Splitting the Questions and Answers into Training and Validation Set:
training_validation_split = int (len(sorted_clean_questions) * 0.15)
training_questions = sorted_clean_questions[training_validation_split:]
training_answers = sorted_clean_answers[training_validation_split:]
validation_questions = sorted_clean_questions[:training_validation_split]
validation_answers = sorted_clean_answers[:training_validation_split]

# Training:
batch_index_check_learning_loss = 100
batch_index_check_validation_loss = ((len(training_questions)) // batch_size // 2) - 1
total_training_loss_error = 0
list_validation_loss_error = []
early_stopping_check = 0
early_stopping_stop = 1000
checkpoint = 'chatbot_weights.ckpt'
session.run(tf.global_variables_initializer())

for epoch in range(1, epochs + 1):
    for batch_index, (padded_questions_in_batch, padded_answers_in_batch) in enumerate(split_into_batches(training_questions, training_answers, batch_size)):
        starting_time = time.time()
        _, batch_training_loss_error = session.run([optimizer_gradient_clipping, loss_error], {inputs: padded_questions_in_batch, targets: padded_answers_in_batch, lr: learning_rate, sequence_length: padded_answers_in_batch.shape[1], keep_prob: keep_probability})
        total_training_loss_error += batch_training_loss_error
        ending_time = time.time()
        batch_time = ending_time - starting_time
        if batch_index % batch_index_check_learning_loss == 0:
            print('Epoch: {:>3}/{}, Batch: {:>4}/{}, Training Loss Error: {:>6.3f}, Training Time on 100 Batches: {:d} seconds'.format(epoch, epochs, batch_index, len(training_questions) // batch_size, total_training_loss_error / batch_index_check_learning_loss, int(batch_time * 100)))
            total_training_loss_error = 0
        if batch_index % batch_index_check_validation_loss == 0 and batch_index > 0:
            total_validation_loss_error = 0
            starting_time = time.time()
            for batch_index_validation, (padded_questions_in_batch, padded_answers_in_batch) in enumerate(split_into_batches(validation_questions, validation_answers, batch_size)):
                # session.run on a single tensor returns a single value, so there is nothing to unpack
                batch_validation_loss_error = session.run(loss_error, {inputs: padded_questions_in_batch, targets: padded_answers_in_batch, lr: learning_rate, sequence_length: padded_answers_in_batch.shape[1], keep_prob: 1})
                total_validation_loss_error += batch_validation_loss_error
            ending_time = time.time()
            batch_time = ending_time - starting_time
            average_validation_loss_error = total_validation_loss_error / (len(validation_questions) / batch_size)
            print('Validation Loss Error: {:>6.3f}, Batch Validation Time: {:d} seconds'.format(average_validation_loss_error, int(batch_time)))
            learning_rate *= learning_rate_decay
            if learning_rate < min_learning_rate:
                learning_rate = min_learning_rate
            list_validation_loss_error.append(average_validation_loss_error)
            if average_validation_loss_error <= min(list_validation_loss_error):
                print('I speak better now :)')
                early_stopping_check = 0
                saver = tf.train.Saver()
                saver.save(session, checkpoint)
            else:
                print('Sorry! I do not speak better, I need to practice more.')
                early_stopping_check += 1
                if early_stopping_check == early_stopping_stop:
                    break
    if early_stopping_check == early_stopping_stop:
        print('My apologies, I cannot speak better anymore, this is best I can do')
        break
print('Game over!')

If you have a solution to this error, I would greatly appreciate it. :)

Best answer

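Suppose your last layer is model.add(Dense(1, activation='softmax')). The 1 here restricts the valid labels to [0, 1). Change the layer's size to cover the largest output label. For example, if your output labels fall in [0, 7), use model.add(Dense(7, activation='softmax')).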
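To make that rule concrete, here is a minimal plain-Python sketch of the range test behind the error message. It is a stand-in for illustration, not TensorFlow's own check, and `check_labels` and `required_output_units` are hypothetical helper names that do not appear in the question or the answer.

```python
def check_labels(labels, num_classes):
    """Raise ValueError if any integer label falls outside [0, num_classes)."""
    for label in labels:
        if not 0 <= label < num_classes:
            raise ValueError(
                "Received a label value of {} which is outside the valid "
                "range of [0, {})".format(label, num_classes)
            )
    return True


def required_output_units(labels):
    """Smallest output-layer width that can represent every integer label."""
    return max(labels) + 1


labels = [0, 3, 6]                      # hypothetical integer class labels
print(required_output_units(labels))    # 7
print(check_labels(labels, 7))          # True

try:
    check_labels([8], 7)                # label 8 does not fit 7 output units
except ValueError as error:
    print(error)
```

The point of the sketch is that a softmax layer with n units can only score classes 0 through n - 1, so the largest label must be strictly smaller than the layer width.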

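import tensorflow as tf
from keras.layers import Input, Lambda, Bidirectional, LSTM, TimeDistributed, Dense, add

# ElmoEmbedding, max_len and n_tags are not defined in this snippet.
input_text = Input(shape=(max_len,), dtype=tf.string)
embedding = Lambda(ElmoEmbedding, output_shape=(max_len, 1024))(input_text)
x = Bidirectional(LSTM(units=512, return_sequences=True,
                   recurrent_dropout=0.2, dropout=0.2))(embedding)
x_rnn = Bidirectional(LSTM(units=512, return_sequences=True,
                       recurrent_dropout=0.2, dropout=0.2))(x)
x = add([x, x_rnn])  # residual connection to the first biLSTM
# The output layer needs one unit per tag, so every label must lie in [0, n_tags)
out = TimeDistributed(Dense(n_tags, activation="softmax"))(x)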

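Here, in the TimeDistributed layer, n_tags is the number of tags I want to classify.

If I predict a different quantity, say q_tag, whose number of labels differs from n_tags (for example 10 while n_tags is 7), and a label of 8 comes through, it raises the invalid argument error: Received a label value of 8 which is outside the valid range of [0, 7).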
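One possible way this applies to the question's preprocessing, offered as an assumption drawn from the posted code rather than something the answer states: the special tokens are assigned with `len(questionsword2int) + 1`, while the words before them are numbered from 0. That skips one id and makes the last token's id equal to the final dictionary size, which is the value later passed as the number of output classes. Since `<OUT>` is added last and does appear in the targets, a label equal to the class count would be expected. The sketch below reproduces the effect on a toy vocabulary; `build_vocab`, `offset` and the word list are hypothetical names and data introduced for illustration.

```python
SPECIAL_TOKENS = ['<PAD>', '<EOS>', '<SOS>', '<OUT>']


def build_vocab(words, offset):
    """Assign ids 0..N-1 to words, then append the special tokens.

    offset=1 mirrors the question's `len(d) + 1`;
    offset=0 keeps the ids contiguous.
    """
    word2int = {word: index for index, word in enumerate(words)}
    for token in SPECIAL_TOKENS:
        word2int[token] = len(word2int) + offset
    return word2int


words = ['hello', 'how', 'are', 'you']   # hypothetical toy vocabulary

shifted = build_vocab(words, offset=1)
contiguous = build_vocab(words, offset=0)

# With the +1 offset the largest id equals the dictionary size,
# so it falls outside [0, len(shifted)).
print(len(shifted), max(shifted.values()))        # 8 8

# Without the offset every id stays inside [0, len(contiguous)).
print(len(contiguous), max(contiguous.values()))  # 8 7
```

If this reading is right, either dropping the `+ 1` when adding the tokens or sizing the output layer as the largest id plus one would keep every label inside the valid range. Comparing `max(answersword2int.values())` with `len(answersword2int)` on the real data is a quick way to confirm or rule it out.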

Regarding "python - InvalidArgumentError: Received a label value of 8825 which is outside the valid range of [0, 8825) SEQ2SEQ model", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/52324322/
