
python - Tensorflow/Keras: Input 0 of layer lstm is incompatible with the layer: expected ndim=3, found ndim=2

Reposted. Author: 行者123. Last updated: 2023-12-04 07:40:01

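I am trying to implement a federated Keras/Tensorflow model that detects fake news in text articles, but I am running into a problem with the model. When I run the code, I get the following error: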

 ValueError: Input 0 of layer lstm is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 50]
along with the following warning:
WARNING:tensorflow:Model was constructed with shape (None, 400) for input Tensor("embedding_input:0", shape=(None, 400), dtype=float32), but it was called on an input with incompatible shape (None,).
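Intuitively I understand that the Embedding layer's output should have shape (None, 400, 50), but for some reason the LSTM receives only a 2D input: the layer expects a 3D tensor and is given a 2D one. However, I don't know how to fix this, or how to change the input/output shapes so that they match. I have been stuck on this for several days. I am still new to ML and neural networks. Any advice is appreciated; thanks in advance.
The model used: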
max_words = 2000
max_len = 400
embed_dim = 50
lstm_out = 64
batch_size = 32

def getTextModel():
    model = Sequential()
    model.add(Embedding(max_words, embed_dim, input_length = max_len, input_shape=preprocessed_sample_dataset.element_spec))
    model.add(LSTM(lstm_out))
    model.add(Dense(256))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1, name='out_layer'))
    model.add(Activation('sigmoid'))
    return model
Model summary:
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding (Embedding)        (None, 400, 50)           100000
_________________________________________________________________
lstm (LSTM)                  (None, 64)                29440
_________________________________________________________________
dense (Dense)                (None, 256)               16640
_________________________________________________________________
activation (Activation)      (None, 256)               0
_________________________________________________________________
dropout (Dropout)            (None, 256)               0
_________________________________________________________________
out_layer (Dense)            (None, 1)                 257
_________________________________________________________________
activation_1 (Activation)    (None, 1)                 0
=================================================================
Total params: 146,337
Trainable params: 146,337
Non-trainable params: 0
Additional information:
Data preprocessing:
def preprocess(dataset):

    def batch_format_fn(element):
        """Flatten a batch `pixels` and return the features as an `OrderedDict`."""
        print(element['features'])
        return collections.OrderedDict(
            x=element['features'],
            y=tf.reshape(element['label'], [-1, 1])
        )
    return dataset.repeat(NUM_EPOCHS).shuffle(SHUFFLE_BUFFER).batch(
        BATCH_SIZE).map(batch_format_fn).prefetch(PREFETCH_BUFFER)

preprocessed_sample_dataset = preprocess(sample_dataset)


def make_federated_data(client_data, client_ids):
    return [preprocess(client_data.create_tf_dataset_for_client(x)) for x in client_ids]

federated_train_data = make_federated_data(train_dataset, train_dataset.client_ids)

print('Number of client datasets: {l}'.format(l=len(federated_train_data)))
print('First dataset: {d}'.format(d=federated_train_data[0]))
Dataset format:
Number of client datasets: 4
First dataset: <PrefetchDataset shapes: OrderedDict([(x, (None,)), (y, (None, 1))]), types: OrderedDict([(x, tf.string), (y, tf.int64)])>
Code that calls the function:
def model_fn():

    keras_model = getTextModel() #create_keras_model()
    input_spec_aux = preprocessed_sample_dataset.element_spec
    return tff.learning.from_keras_model(
        keras_model,
        input_spec= input_spec_aux,
        loss=tf.keras.losses.SparseCategoricalCrossentropy(),
        metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])

#Error occurs in iterative_process
iterative_process = tff.learning.build_federated_averaging_process(
    model_fn,
    client_optimizer_fn=lambda: tf.keras.optimizers.Adam(learning_rate=client_lr),
    server_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=server_lr))

print(str(iterative_process.initialize.type_signature))

state = iterative_process.initialize()

Best answer

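The dataset format shows that the input x has shape (None,) (ndim/rank = 1) and dtype tf.string. The None comes from the fact that the dataset may produce batches that are not "full", so the first dimension actually ranges over [1, BATCH_SIZE]. This shape means we have a batch of single scalar strings. That is probably the problem: with an LSTM we usually want a batch of sequences of strings, e.g. shaped like (None, SEQUENCE_LENGTH).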
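To make the source of the None dimension concrete, here is a minimal framework-free sketch of batching. The `batch` helper and the 70 placeholder articles are hypothetical and only stand in for what `dataset.batch(BATCH_SIZE)` does in the question; the point is that the final batch can be smaller than the requested size, so the batch dimension cannot be fixed in advance.

```python
def batch(items, batch_size):
    """Yield consecutive batches; the last one may be smaller than batch_size."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Hypothetical data: 70 article strings, batched 32 at a time.
articles = [f"article {i}" for i in range(70)]
batch_sizes = [len(b) for b in batch(articles, 32)]
print(batch_sizes)  # [32, 32, 6]
```

Each batch here is a flat list of strings, which corresponds to the rank-1 shape (None,) reported for x.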
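The Embedding layer maps each element of its input to a vector of embedding dimension z and appends that as a new last axis: an input of shape (x, y) produces an output of shape (x, y, z). With an input of shape (None,), the output of the Embedding layer is therefore (None, 50) (ndim/rank = 2). Recall that an LSTM wants sequences and Keras wants batches; the error message says the required shape is (None, SEQUENCE_LENGTH, 50) (ndim/rank = 3).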
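The shape arithmetic can be checked with a small plain-Python sketch. The helpers `shape_of` and `embed` are hypothetical and only imitate an embedding lookup with nested lists; they are not the Keras implementation. The sketch shows that a lookup always adds exactly one trailing axis, so a rank-1 batch comes out rank 2 and a rank-2 batch comes out rank 3.

```python
def shape_of(nested):
    """Return the shape of a regular, non-empty nested list as a tuple."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


def embed(ids, table):
    """Replace every integer id with its vector, at any nesting depth."""
    if isinstance(ids, list):
        return [embed(i, table) for i in ids]
    return table[ids]


EMBED_DIM = 50   # embed_dim in the question
VOCAB = 2000     # max_words in the question
table = [[0.0] * EMBED_DIM for _ in range(VOCAB)]

scalar_batch = [3, 7, 1, 9]                                     # shape (4,)
sequence_batch = [[3, 7, 1], [9, 2, 5], [4, 4, 8], [6, 0, 2]]   # shape (4, 3)

print(shape_of(embed(scalar_batch, table)))    # (4, 50)
print(shape_of(embed(sequence_batch, table)))  # (4, 3, 50)
```

The first result matches the [None, 50] shape in the error message; the second is the rank-3 layout the LSTM expects.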
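I would recommend going back to the dataset and determining what format element['features'] is in. In this case it looks like it may be a full sentence that needs to be tokenized into a sequence of words (e.g. for English, by splitting on whitespace).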
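As an illustration of that tokenization step, the following plain-Python sketch splits a sentence on whitespace and then truncates or pads it to a fixed length, so every example in a batch ends up with the same SEQUENCE_LENGTH. The `tokenize` function, the `<pad>` token, and the lowercasing are assumptions made for this example; in the question's pipeline the same step would have to run inside the `tf.data` preprocessing instead.

```python
MAX_LEN = 6  # the question uses max_len = 400; kept small here for readability


def tokenize(sentence, max_len=MAX_LEN, pad_token="<pad>"):
    """Split on whitespace, then truncate or pad to exactly max_len tokens."""
    words = sentence.lower().split()[:max_len]
    return words + [pad_token] * (max_len - len(words))


print(tokenize("Breaking news from the capital"))
# ['breaking', 'news', 'from', 'the', 'capital', '<pad>']
print(tokenize("a b c d e f g h"))
# ['a', 'b', 'c', 'd', 'e', 'f']
```

A batch of such token lists has shape (batch, MAX_LEN), which is the (None, SEQUENCE_LENGTH) layout described above.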
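A word of warning, though: even after the shape is fixed, I suspect Keras will next complain that the tf.string dtype cannot be used with an Embedding layer. The sequences first need to be converted to integer ids, probably using something from tf.lookup or something from tf_text.
Some resources that may help: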
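The string-to-id conversion can be sketched without any framework as follows. Everything here is hypothetical and for illustration only: the `build_vocab` and `to_ids` helpers, the choice of 0 for padding and 1 for out-of-vocabulary words, and the tiny corpus. In the real pipeline the same mapping would be done with TensorFlow ops (the tf.lookup or tf_text utilities mentioned above) so that it can run inside the dataset.

```python
from collections import Counter

PAD_ID, OOV_ID = 0, 1  # assumed reserved ids: padding and unknown words


def build_vocab(token_lists, max_words):
    """Map the most frequent tokens to ids 2..max_words-1."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    most_common = [word for word, _ in counts.most_common(max_words - 2)]
    return {word: i + 2 for i, word in enumerate(most_common)}


def to_ids(tokens, vocab, max_len):
    """Convert tokens to integer ids, truncating or padding to max_len."""
    ids = [vocab.get(tok, OOV_ID) for tok in tokens[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))


# Hypothetical corpus: "news" appears 3 times, "fake" twice, "spreads" once.
corpus = [["fake", "news", "spreads"], ["fake", "news"], ["news"]]
vocab = build_vocab(corpus, max_words=2000)
ids = to_ids(["news", "is", "fake"], vocab, max_len=5)
print(ids)  # [2, 1, 3, 0, 0]
```

The resulting integer lists all have length max_len and every id is below max_words, which is what an Embedding(max_words, embed_dim) layer can index into.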

  • Federated Learning for Text Generation Tutorial, in particular the section on building the dataset.
  • Load text tutorial
  • Regarding "python - Tensorflow/Keras: Input 0 of layer lstm is incompatible with the layer: expected ndim=3, found ndim=2", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/67533039/
