
python - Understanding the shapes of Keras layers


I am working through this link to understand the multi-channel CNN model for text classification.

The code is based on this tutorial.

I have understood most of it, but I cannot figure out how Keras defines the output shapes of certain layers.

Here is the code:

It defines a model with three input channels for processing 4-grams, 6-grams, and 8-grams of movie review text.

# Keras imports (skipped in the original post; a standard Keras 2.x set is assumed here)
from pickle import load
from numpy import array
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.models import Model
from keras.layers import Input, Dense, Flatten, Dropout, Embedding, Conv1D, MaxPooling1D, concatenate
from keras.utils import plot_model

# load a clean dataset
def load_dataset(filename):
    return load(open(filename, 'rb'))

# fit a tokenizer
def create_tokenizer(lines):
    tokenizer = Tokenizer()
    tokenizer.fit_on_texts(lines)
    return tokenizer

# calculate the maximum document length
def max_length(lines):
    return max([len(s.split()) for s in lines])

# encode a list of lines
def encode_text(tokenizer, lines, length):
    # integer encode
    encoded = tokenizer.texts_to_sequences(lines)
    # pad encoded sequences
    padded = pad_sequences(encoded, maxlen=length, padding='post')
    return padded

# define the model
def define_model(length, vocab_size):
    # channel 1
    inputs1 = Input(shape=(length,))
    embedding1 = Embedding(vocab_size, 100)(inputs1)
    conv1 = Conv1D(filters=32, kernel_size=4, activation='relu')(embedding1)
    drop1 = Dropout(0.5)(conv1)
    pool1 = MaxPooling1D(pool_size=2)(drop1)
    flat1 = Flatten()(pool1)
    # channel 2
    inputs2 = Input(shape=(length,))
    embedding2 = Embedding(vocab_size, 100)(inputs2)
    conv2 = Conv1D(filters=32, kernel_size=6, activation='relu')(embedding2)
    drop2 = Dropout(0.5)(conv2)
    pool2 = MaxPooling1D(pool_size=2)(drop2)
    flat2 = Flatten()(pool2)
    # channel 3
    inputs3 = Input(shape=(length,))
    embedding3 = Embedding(vocab_size, 100)(inputs3)
    conv3 = Conv1D(filters=32, kernel_size=8, activation='relu')(embedding3)
    drop3 = Dropout(0.5)(conv3)
    pool3 = MaxPooling1D(pool_size=2)(drop3)
    flat3 = Flatten()(pool3)
    # merge
    merged = concatenate([flat1, flat2, flat3])
    # interpretation
    dense1 = Dense(10, activation='relu')(merged)
    outputs = Dense(1, activation='sigmoid')(dense1)
    model = Model(inputs=[inputs1, inputs2, inputs3], outputs=outputs)
    # compile
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    # summarize
    print(model.summary())
    plot_model(model, show_shapes=True, to_file='multichannel.png')
    return model

# load training dataset
trainLines, trainLabels = load_dataset('train.pkl')
# create tokenizer
tokenizer = create_tokenizer(trainLines)
# calculate max document length
length = max_length(trainLines)
# calculate vocabulary size
vocab_size = len(tokenizer.word_index) + 1
print('Max document length: %d' % length)
print('Vocabulary size: %d' % vocab_size)
# encode data
trainX = encode_text(tokenizer, trainLines, length)
print(trainX.shape)

# define model
model = define_model(length, vocab_size)
# fit model
model.fit([trainX,trainX,trainX], array(trainLabels), epochs=10, batch_size=16)
# save the model
model.save('model.h5')

Running the code:

Running the example first prints a summary of the prepared training dataset:

Max document length: 1380
Vocabulary size: 44277
(1800, 1380)

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
input_1 (InputLayer)             (None, 1380)          0
____________________________________________________________________________________________________
input_2 (InputLayer)             (None, 1380)          0
____________________________________________________________________________________________________
input_3 (InputLayer)             (None, 1380)          0
____________________________________________________________________________________________________
embedding_1 (Embedding)          (None, 1380, 100)     4427700     input_1[0][0]
____________________________________________________________________________________________________
embedding_2 (Embedding)          (None, 1380, 100)     4427700     input_2[0][0]
____________________________________________________________________________________________________
embedding_3 (Embedding)          (None, 1380, 100)     4427700     input_3[0][0]
____________________________________________________________________________________________________
conv1d_1 (Conv1D)                (None, 1377, 32)      12832       embedding_1[0][0]
____________________________________________________________________________________________________
conv1d_2 (Conv1D)                (None, 1375, 32)      19232       embedding_2[0][0]
____________________________________________________________________________________________________
conv1d_3 (Conv1D)                (None, 1373, 32)      25632       embedding_3[0][0]
____________________________________________________________________________________________________
dropout_1 (Dropout)              (None, 1377, 32)      0           conv1d_1[0][0]
____________________________________________________________________________________________________
dropout_2 (Dropout)              (None, 1375, 32)      0           conv1d_2[0][0]
____________________________________________________________________________________________________
dropout_3 (Dropout)              (None, 1373, 32)      0           conv1d_3[0][0]
____________________________________________________________________________________________________
max_pooling1d_1 (MaxPooling1D)   (None, 688, 32)       0           dropout_1[0][0]
____________________________________________________________________________________________________
max_pooling1d_2 (MaxPooling1D)   (None, 687, 32)       0           dropout_2[0][0]
____________________________________________________________________________________________________
max_pooling1d_3 (MaxPooling1D)   (None, 686, 32)       0           dropout_3[0][0]
____________________________________________________________________________________________________
flatten_1 (Flatten)              (None, 22016)         0           max_pooling1d_1[0][0]
____________________________________________________________________________________________________
flatten_2 (Flatten)              (None, 21984)         0           max_pooling1d_2[0][0]
____________________________________________________________________________________________________
flatten_3 (Flatten)              (None, 21952)         0           max_pooling1d_3[0][0]
____________________________________________________________________________________________________
concatenate_1 (Concatenate)      (None, 65952)         0           flatten_1[0][0]
                                                                   flatten_2[0][0]
                                                                   flatten_3[0][0]
____________________________________________________________________________________________________
dense_1 (Dense)                  (None, 10)            659530      concatenate_1[0][0]
____________________________________________________________________________________________________
dense_2 (Dense)                  (None, 1)              11          dense_1[0][0]
====================================================================================================
Total params: 14,000,337
Trainable params: 14,000,337
Non-trainable params: 0
____________________________________________________________________________________________________

And:

Epoch 6/10
1800/1800 [==============================] - 30s - loss: 9.9093e-04 - acc: 1.0000
Epoch 7/10
1800/1800 [==============================] - 29s - loss: 5.1899e-04 - acc: 1.0000
Epoch 8/10
1800/1800 [==============================] - 28s - loss: 3.7958e-04 - acc: 1.0000
Epoch 9/10
1800/1800 [==============================] - 29s - loss: 3.0534e-04 - acc: 1.0000
Epoch 10/10
1800/1800 [==============================] - 29s - loss: 2.6234e-04 - acc: 1.0000

My interpretation of the layers and output shapes is as follows. Please help me check whether it is correct, because I am getting lost in all the dimensions.

input_1 (InputLayer) (None, 1380) : ---> 1380 is the total number of features (that is, 1380 input neurons) per data point; 1800 is the total number of documents or data points.

embedding_1 (Embedding) (None, 1380, 100) 4427700 ----> the embedding layer has 1380 features (words), each a vector of dimension 100.

How does the number of parameters here come to 4427700?

conv1d_1 (Conv1D) (None, 1377, 32) 12832 ------> Conv1D has kernel_size=4. Is it a 1*4 filter used 32 times? Then how does the dimension become (None, 1377, 32) with 12832 parameters?

max_pooling1d_1 (MaxPooling1D) (None, 688, 32) with MaxPooling1D(pool_size=2): how does the dimension become (None, 688, 32)? And flatten_1 (Flatten) (None, 22016): is this just the multiplication of 688 and 32?

Does every epoch train on all 1800 data points at once?

Please tell me how the output dimensions are calculated. Any reference or help would be appreciated.

Best Answer

See the answers to each point below:

input_1 (InputLayer) (None, 1380) : ---> 1380 is the total number of features (that is, 1380 input neurons) per data point; 1800 is the total number of documents or data points.

Yes. model.fit([trainX,trainX,trainX], array(trainLabels), epochs=10, batch_size=16) means that you want the network to fit for 10 epochs over the whole training dataset with a batch size of 16.

This means that after every 16 data points, the backpropagation algorithm runs and the weights are updated. This happens 1800/16 (rounded up to 113) times; one full pass over the data is called an epoch.
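As a quick sanity check of that arithmetic, here is a minimal sketch (not part of the original answer) that computes the number of weight updates per epoch:

import math
n_samples = 1800  # size of the training set, from the question
batch_size = 16   # as passed to model.fit
print(math.ceil(n_samples / batch_size))  # 113 updates per epoch; the last batch holds only the remaining 8 samples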

1380 is the number of neurons in the first layer.

embedding_1 (Embedding) (None, 1380, 100) | 4427700 ----> the embedding layer has 1380 features (words), each a vector of dimension 100.

1380 is the size of the input (the number of neurons in the previous layer), and 100 is the size (length) of each embedding vector.

The number of parameters here is vocabulary_size * 100, because for each word in the vocabulary you need to train 100 parameters. The embedding layer is really a matrix built of vocabulary_size vectors of size 100, where each row holds the vector representation of one word in the vocabulary. With vocabulary_size = 44277, that gives 44277 * 100 = 4427700 parameters.
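For illustration, the parameter count can be reproduced from the numbers printed in the question (a small sketch, not part of the original answer):

vocab_size = 44277   # printed as "Vocabulary size"
embedding_dim = 100  # second argument of Embedding in the model
print(vocab_size * embedding_dim)  # 4427700, matching embedding_1 in the summary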

conv1d_1 (Conv1D) (None, 1377, 32) | 12832 ------> Conv1D has kernel_size=4. Is it a 1*4 filter used 32 times? Then how does the dimension become (None, 1377, 32) with 12832 parameters?

1380 becomes 1377 because of the kernel size. Imagine the following input (of size 10, to keep it simple) with a kernel of size 4:

0123456789 #input
KKKK456789
0KKKK56789
01KKKK6789
012KKKK789
0123KKKK89
01234KKKK9
012345KKKK

See, the kernel cannot move any further to the right, so for an input of size 10 and a kernel of size 4 the output size is 7. In general, for an input of size n and a kernel of size k, the output size is n - k + 1; for n=1380 and k=4, the result is 1377.

The number of parameters equals 12832 because the parameter count is output_channels * (input_channels * window_size + 1), where the +1 is each filter's bias. In your case that is 32 * (100 * 4 + 1) = 12832.
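Both formulas can be checked directly with the question's numbers (a small illustrative sketch):

input_length = 1380   # padded document length
input_channels = 100  # embedding dimension
filters = 32
kernel_size = 4
print(input_length - kernel_size + 1)                # 1377, the conv1d_1 output length
print(filters * (input_channels * kernel_size + 1))  # 12832; the +1 is each filter's bias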

max_pooling1d_1 (MaxPooling1D) (None, 688, 32) with MaxPooling1D(pool_size=2) how the dimension became (None, 688, 32)?

Max pooling takes every two consecutive numbers and replaces them with their maximum, so you end up with floor(original_size / pool_size) values: floor(1377 / 2) = 688.
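In code form (a small sketch; the floor division models the discarded trailing element):

conv_length = 1377
pool_size = 2
print(conv_length // pool_size)  # 688, matching max_pooling1d_1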

flatten_1 (Flatten) (None, 22016) This is just multiplication of 688 and 32?

Yes, it is just the multiplication of 688 and 32, because the flatten operation does the following:

1234
5678 -> 123456789012
9012

So it takes all the values across all dimensions and puts them into a one-dimensional vector.
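The same flattening can be reproduced with NumPy on a single pooled feature map, ignoring the batch dimension (a small illustrative sketch):

import numpy as np
feature_map = np.zeros((688, 32))   # shape of one sample after max_pooling1d_1
print(feature_map.flatten().shape)  # (22016,) = 688 * 32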

Does every epoch train on all 1800 data points at once?

No. As pointed out in the first answer, the data goes through in batches of 16. Each epoch visits all 1800 data points in random order, 16 at a time. An epoch is a term meaning a period of time, after which we start reading the data from the beginning again.

Edit:

Let me clarify how the 1d convolution layer is applied to the output of the embedding layer.

The output of the embedding layer should be interpreted as a vector of width 1380 with 100 channels.

By analogy with a 2d RGB image, whose input has three channels and shape (width, height, 3): when you apply a convolution layer built of 32 filters (the filter size is irrelevant here), the convolution operation is applied to all channels simultaneously, and the output shape is (new_width, new_height, 32). Note that the last dimension of the output equals the number of filters.

Back to your example: treat the output shape of the embedding layer as (width, channels). The 1d convolution layer with 32 filters and kernel size 4 is therefore applied to a vector of width 1380 and depth 100, and as a result you get output of shape (1377, 32).
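To see these shapes directly, here is a minimal standalone sketch of a single channel (assuming the same Keras imports as in the question's code):

inp = Input(shape=(1380,))
emb = Embedding(44277, 100)(inp)  # output shape (None, 1380, 100)
conv = Conv1D(filters=32, kernel_size=4, activation='relu')(emb)  # output shape (None, 1377, 32)
Model(inputs=inp, outputs=conv).summary()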

Regarding python - Understanding the shapes of Keras layers, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/58832191/
