
machine-learning - Error when checking input: expected embedding_1_input to have shape (4,) but got array with shape (1,)


I am using pretrained embedding vectors for my Keras model. Everything worked before I added them; now I get this error:

ValueError: Error when checking input: expected embedding_1_input to have shape (4,) but got array with shape (1,)

Maybe someone can help me see what I am doing wrong here. I am not sure whether my model.fit and model.evaluate calls are correct. Could the problem be there?

import csv
import numpy as np
np.random.seed(42)
from keras.models import Sequential, Model
from keras.layers import *
from random import shuffle
from sklearn.model_selection import train_test_split
from keras import optimizers
from keras.callbacks import EarlyStopping
from itertools import groupby
from numpy import asarray
from numpy import zeros
from numpy import array
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences

# function makes a list of antonyms and synonyms from the files
def preprocessing(filename):
    list_words = []
    with open(filename) as tsv:
        for line in csv.reader(tsv, dialect="excel-tab"):
            list_words.append([line[0], line[1]])
    return list_words

# function makes a list of not relevant pairs by mixing synonyms and antonyms
def notrelevant(filename, filename2):
    list_words = []
    with open(filename) as tsv:
        with open(filename2) as tsv2:
            for lines in zip(csv.reader(tsv, dialect="excel-tab"), csv.reader(tsv2, dialect="excel-tab")):
                list_words.append([lines[0][0], lines[1][1]])
    return list_words

antonyms_list = preprocessing("antonyms.tsv")
synonyms_list = preprocessing("synonyms.tsv")
notrelevant_list = notrelevant("antonyms.tsv", "synonyms.tsv")

# function combines all antonyms, synonyms in one list with labels, shuffles them
def data_prepare(ant, syn, nrel):
    data = []
    for elem1, elem2 in ant:
        data.append([[elem1, elem2], "Antonyms"])
    for elem1, elem2 in syn:
        data.append([[elem1, elem2], "Synonyms"])
    for elem1, elem2 in nrel:
        data.append([[elem1, elem2], "Not relevant"])
    shuffle(data)
    return data


data_with_labels_shuffled = data_prepare(antonyms_list, synonyms_list, notrelevant_list)

def label_to_onehot(labels):
    mapping = {label: i for i, label in enumerate(set(labels))}

    one_hot = np.empty((len(labels), 3))
    for i, label in enumerate(labels):
        entry = [0] * len(mapping)
        entry[mapping[label]] = 1
        one_hot[i] = entry
    return one_hot

def words_to_ids(labels):
    vocabulary = []
    word_to_id = {}
    ids = []
    for word1, word2 in labels:
        vocabulary.append(word1)
        vocabulary.append(word2)
    counter = 0
    for word in vocabulary:
        if word not in word_to_id:
            word_to_id[word] = counter
            counter += 1
    for word1, word2 in labels:
        ids.append([word_to_id[word1], word_to_id[word2]])
    return ids

def split_data(datas):
    data = np.array(datas)
    X, y = data[:, 0], data[:, 1]
    # split the data to get 60% train and 40% test
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=42)
    y_train = label_to_onehot(y_train)
    X_dev, X_test, y_dev, y_test = train_test_split(X_test, y_test, test_size=0.5, random_state=42)
    y_dev = label_to_onehot(y_dev)
    y_test = label_to_onehot(y_test)
    return X_train, y_train, X_dev, y_dev, X_test, y_test

X_train, y_train, X_dev, y_dev, X_test, y_test = split_data(data_with_labels_shuffled)

# prepare tokenizer
t = Tokenizer()
t.fit_on_texts(X_train)
vocab_size = len(t.word_index) + 1
# integer encode the documents
encoded_docs = t.texts_to_sequences(X_train)


# load the whole embedding into memory
embeddings_index = dict()
f = open('glove.6B.50d.txt')
for line in f:
    values = line.split()
    word = values[0]
    coefs = asarray(values[1:], dtype='float32')
    embeddings_index[word] = coefs
f.close()
# create a weight matrix for words in training docs
embedding_matrix = zeros((vocab_size, 50))
for word, i in t.word_index.items():
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None:
        embedding_matrix[i] = embedding_vector



VOCABSIZE = len(data_with_labels_shuffled)
EMBSIZE = 50
HIDDENSIZE = 50
KERNELSIZE = 5
MAXEPOCHS = 5

model = Sequential()
model.add(Embedding(vocab_size, 50, weights=[embedding_matrix],
                    input_length=4, trainable=False))
model.add(Dropout(0.25))
model.add(Bidirectional(GRU(units=HIDDENSIZE // 2)))
#model.add(Flatten())
model.add(Dense(units=3, activation="softmax"))
model.compile(loss='categorical_crossentropy', optimizer="adam",
              metrics=['accuracy'])


earlystop = EarlyStopping(monitor='val_loss', min_delta=0, patience=2, verbose=0, mode='min')
model.fit(X_train, y_train,
          batch_size=64,
          callbacks=[earlystop],
          epochs=100,
          validation_data=(X_dev, y_dev),
          verbose=1)
scores = model.evaluate(X_test, y_test, batch_size=64)

print("Accuracy is: %.2f%%" %(scores[1] * 100))

Best answer

I think the problem is that you should pass encoded_docs to the model.fit() function instead of X_train, because encoded_docs contains the tokenization of your training data, while X_train still only contains lists of words. In addition, you have to make sure that the input_length argument of the Embedding layer matches the length of the tokenized training examples you created in encoded_docs.
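A minimal sketch of that change, reusing the variables from the question (t, X_train, X_dev, X_test, the y_* arrays, earlystop, and the compiled model); the names MAXLEN, padded_train, padded_dev, and padded_test are illustrative, and MAXLEN = 2 assumes every example is a word pair:

# pad the integer-encoded word pairs to a fixed length and feed those
# to the model instead of the raw word lists
MAXLEN = 2  # assumption: each training example is a pair of words

padded_train = pad_sequences(t.texts_to_sequences(X_train), maxlen=MAXLEN, padding='post')
padded_dev = pad_sequences(t.texts_to_sequences(X_dev), maxlen=MAXLEN, padding='post')
padded_test = pad_sequences(t.texts_to_sequences(X_test), maxlen=MAXLEN, padding='post')

# build the model exactly as before, but pass input_length=MAXLEN to the
# Embedding layer so it agrees with the padded sequence length

model.fit(padded_train, y_train,
          batch_size=64,
          callbacks=[earlystop],
          epochs=100,
          validation_data=(padded_dev, y_dev),
          verbose=1)
scores = model.evaluate(padded_test, y_test, batch_size=64)

With the pairs padded to length 2 and input_length set to the same value, the shape expected by embedding_1_input and the shape of the arrays you feed in should agree.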

Regarding machine-learning - Error when checking input: expected embedding_1_input to have shape (4,) but got array with shape (1,), we found a similar question on Stack Overflow: https://stackoverflow.com/questions/54332745/
