
TensorflowJs conv2d - tensor shape

Reposted. Author: 行者123. Updated: 2023-11-30 09:03:46

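I want to build a machine learning model for audio files. I converted the audio files into (spectrogram) tensors. My feature tensor (the audio files) has the shape [119, 241, 125] (119 files, 241 samples per file, 125 frequencies per sample). By "sample" I mean the data I capture over a fixed time window, e.g. 16 ms. My output shape will be [119, numOptions].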

I followed this tutorial from Tensorflow.js on audio recognition. They build this model:

Model

I reshaped the feature tensor to 4D for the 2D convolution: this.features = this.features.reshape([this.features.shape[0],this.features.shape[1],this.features.shape[2],1])

  buildModel() {
    const inputShape1 = [this.features.shape[1], this.features.shape[2], this.features.shape[3]];
    this.model = tfNode.sequential();
    // filter to the image => feature extractor, edge detector, sharpener (depends on the models understanding)
    this.model.add(tfNode.layers.conv2d(
      {filters: 8, kernelSize: [4, 2], activation: 'relu', inputShape: inputShape1}
    ));

    // see the image at a higher level, generalize it more, prevent overfit
    this.model.add(tfNode.layers.maxPooling2d(
      {poolSize: [2, 2], strides: [2, 2]}
    ));

    // filter to the image => feature extractor, edge detector, sharpener (depends on the models understanding)
    const inputShape2 = [119, 62, 8];
    this.model.add(tfNode.layers.conv2d(
      {filters: 32, kernelSize: [4, 2], activation: 'relu', inputShape: inputShape2}
    ));

    // see the image at a higher level, generalize it more, prevent overfit
    this.model.add(tfNode.layers.maxPooling2d(
      {poolSize: [2, 2], strides: [2, 2]}
    ));

    // filter to the image => feature extractor, edge detector, sharpener (depends on the models understanding)
    const inputShape3 = [58, 30, 32];
    this.model.add(tfNode.layers.conv2d(
      {filters: 32, kernelSize: [4, 2], activation: 'relu', inputShape: inputShape3}
    ));

    // see the image at a higher level, generalize it more, prevent overfit
    this.model.add(tfNode.layers.maxPooling2d(
      {poolSize: [2, 2], strides: [2, 2]}
    ));

    // 1D output, => final output score of labels
    this.model.add(tfNode.layers.flatten({}));

    // prevents overfitting, randomly set 0
    this.model.add(tfNode.layers.dropout({rate: 0.25}));

    // learn anything linear, non linear comb. from conv. and soft pool
    this.model.add(tfNode.layers.dense({units: 2000, activation: 'relu'}));

    this.model.add(tfNode.layers.dropout({rate: 0.25}));

    // give probability for each label
    this.model.add(tfNode.layers.dense({units: this.labels.shape[1], activation: 'softmax'}));

    this.model.summary();

    // compile the model
    this.model.compile({loss: 'meanSquaredError', optimizer: 'adam'});
    this.model.summary()
  };

Model summary:

_________________________________________________________________
Layer (type) Output shape Param #
=================================================================
conv2d_Conv2D1 (Conv2D) [null,238,124,8] 72
_________________________________________________________________
max_pooling2d_MaxPooling2D1 [null,119,62,8] 0
_________________________________________________________________
conv2d_Conv2D2 (Conv2D) [null,116,61,32] 2080
_________________________________________________________________
max_pooling2d_MaxPooling2D2 [null,58,30,32] 0
_________________________________________________________________
conv2d_Conv2D3 (Conv2D) [null,55,29,32] 8224
_________________________________________________________________
max_pooling2d_MaxPooling2D3 [null,27,14,32] 0
_________________________________________________________________
flatten_Flatten1 (Flatten) [null,12096] 0
_________________________________________________________________
dropout_Dropout1 (Dropout) [null,12096] 0
_________________________________________________________________
dense_Dense1 (Dense) [null,2000] 24194000
_________________________________________________________________
dropout_Dropout2 (Dropout) [null,2000] 0
_________________________________________________________________
dense_Dense2 (Dense) [null,2] 4002
=================================================================
Total params: 24208378
Trainable params: 24208378
Non-trainable params: 0
_________________________________________________________________
Epoch 1 / 10
eta=0.0 ======================================>----------------------------------------------------------------------------- loss=0.515 0.51476
eta=0.8 ============================================================================>--------------------------------------- loss=0.442 0.44186
eta=0.0 ===================================================================================================================>
3449ms 32236us/step - loss=0.485 val_loss=0.958
Epoch 2 / 10
eta=0.0 ======================================>----------------------------------------------------------------------------- loss=0.422 0.42188
eta=0.9 ============================================================================>--------------------------------------- loss=0.395 0.39535
eta=0.0 ===================================================================================================================>
3643ms 34043us/step - loss=0.411 val_loss=0.958
Epoch 3 / 10

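1) The first input size is the shape of my feature tensor. The other two inputShapes (inputShape2, inputShape3) were determined from the error messages I received. How can I determine these two input sizes in advance?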

Best answer

How is the inputShape calculated?

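The inputShape is not calculated. Rather, the dataset passed to the model must match the inputShape. When the model is defined, the inputShape is 3D. But the model summary shows a fourth dimension with the value null, which is the batch size. The training data should therefore be 4D. The first dimension, the batch size, can be anything; what matters is that the features and the labels have the same batch size. There is a more detailed answer here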
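The rule above can be sketched as a small shape check in plain JavaScript. This is only an illustration: checkDataShapes is a hypothetical helper, not part of TensorFlow.js, and it works on shapes given as plain arrays (such as the values of tensor.shape) rather than on tensors.

```javascript
// Hypothetical helper (not a TensorFlow.js API): checks that dataset shapes
// are compatible with a layer's inputShape. Shapes are plain number arrays.
function checkDataShapes(inputShape, featuresShape, labelsShape) {
  // The model prepends a batch dimension (shown as null in the summary),
  // so the data needs one more dimension than inputShape.
  const expectedRank = inputShape.length + 1;
  if (featuresShape.length !== expectedRank) {
    return `features must be ${expectedRank}D, got ${featuresShape.length}D`;
  }
  // Every dimension after the batch dimension must equal inputShape.
  const perExample = featuresShape.slice(1);
  if (!perExample.every((dim, i) => dim === inputShape[i])) {
    return `per-example shape [${perExample}] does not match inputShape [${inputShape}]`;
  }
  // Features and labels must agree on the batch dimension.
  if (featuresShape[0] !== labelsShape[0]) {
    return `batch sizes differ: ${featuresShape[0]} features vs ${labelsShape[0]} labels`;
  }
  return 'ok';
}

// Shapes from the question: 119 files, reshaped to [119, 241, 125, 1].
console.log(checkDataShapes([241, 125, 1], [119, 241, 125, 1], [119, 2]));
// The un-reshaped 3D tensor fails the rank check.
console.log(checkDataShapes([241, 125, 1], [119, 241, 125], [119, 2]));
```

With the shapes from the question, the reshaped 4D tensor passes the check, while the original 3D tensor is rejected because it lacks the channel dimension.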

How are the layer shapes calculated?

That depends on the layer used. Layers such as dropout and activation do not change the input shape.

  • A convolutional layer changes the input shape depending on its stride and kernel. This answer explains in detail how it is calculated.

  • The flatten layer just reshapes its input to 1D. In the model summary, the input shape is [null,27,14,32] and the flatten layer's output shape is [null, 12096] (12096 = 27 * 14 * 32).

  • A dense layer also changes the input shape. Its output shape depends on the number of units in the layer.
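To make the list above concrete, here is a minimal sketch in plain JavaScript that propagates the shapes through the model from the question. It assumes the convolution and pooling layers use 'valid' padding (no padding) and that the convolutions use a stride of 1, since the question's code sets neither option; under that assumption the output size along each axis is floor((size - window) / stride) + 1. The helper names are made up for illustration and are not TensorFlow.js functions.

```javascript
// Output size along one axis, assuming 'valid' padding (no padding).
function validOutputSize(size, window, stride) {
  return Math.floor((size - window) / stride) + 1;
}

// conv2d: height and width shrink, channel count becomes `filters`.
function conv2dShape([h, w], kernelSize, filters, strides = [1, 1]) {
  return [
    validOutputSize(h, kernelSize[0], strides[0]),
    validOutputSize(w, kernelSize[1], strides[1]),
    filters,
  ];
}

// maxPooling2d: height and width shrink, channel count is unchanged.
function maxPooling2dShape([h, w, c], poolSize, strides) {
  return [
    validOutputSize(h, poolSize[0], strides[0]),
    validOutputSize(w, poolSize[1], strides[1]),
    c,
  ];
}

// flatten: the product of all per-example dimensions.
function flattenShape(shape) {
  return [shape.reduce((product, dim) => product * dim, 1)];
}

// dense: the last dimension becomes the number of units.
function denseShape(shape, units) {
  return [...shape.slice(0, -1), units];
}

// Per-example shapes (batch dimension omitted), starting from inputShape1.
let shape = [241, 125, 1];
const trace = [];
for (const filters of [8, 32, 32]) {
  shape = conv2dShape(shape, [4, 2], filters);
  trace.push(shape);
  shape = maxPooling2dShape(shape, [2, 2], [2, 2]);
  trace.push(shape);
}
shape = flattenShape(shape);
trace.push(shape);
shape = denseShape(shape, 2000);
trace.push(shape);
shape = denseShape(shape, 2);
trace.push(shape);

// Dropout layers are skipped because they leave the shape unchanged.
trace.forEach((s) => console.log(`[null,${s.join(',')}]`));
```

The printed shapes agree with the model summary in the question: the outputs of the first two pooling steps, [119,62,8] and [58,30,32], are exactly the values the asker put into inputShape2 and inputShape3.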

A similar question about "TensorflowJs conv2d - tensor shape" can be found on Stack Overflow: https://stackoverflow.com/questions/58075896/
