
python - Model accuracy and loss not improving in CNN


I am training an image classification model with the LeNet architecture below, and I notice that neither the training nor the validation accuracy improves from one iteration to the next. Can anyone with expertise in this area explain what might be going wrong?

Training samples: 110 images belonging to 2 classes. Validation: 50 images belonging to 2 classes.

#LeNet

import keras
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense

#import dropout class if needed
from keras.layers import Dropout

from keras import regularizers

model = Sequential()

#Layer 1
#Conv Layer 1
model.add(Conv2D(filters=6,
                 kernel_size=5,
                 strides=1,
                 activation='relu',
                 input_shape=(32, 32, 3)))
#Pooling layer 1
model.add(MaxPooling2D(pool_size=2, strides=2))

#Layer 2
#Conv Layer 2
model.add(Conv2D(filters=16,
                 kernel_size=5,
                 strides=1,
                 activation='relu',
                 input_shape=(14, 14, 6)))
#Pooling Layer 2
model.add(MaxPooling2D(pool_size=2, strides=2))

#Flatten
model.add(Flatten())

#Layer 3
#Fully connected layer 1
model.add(Dense(units=128, activation='relu', kernel_initializer='uniform',
                kernel_regularizer=regularizers.l2(0.01)))
model.add(Dropout(rate=0.2))

#Layer 4
#Fully connected layer 2
model.add(Dense(units=64, activation='relu', kernel_initializer='uniform',
                kernel_regularizer=regularizers.l2(0.01)))
model.add(Dropout(rate=0.2))

#Layer 5
#Fully connected layer 3
model.add(Dense(units=64, activation='relu', kernel_initializer='uniform',
                kernel_regularizer=regularizers.l2(0.01)))
model.add(Dropout(rate=0.2))

#Layer 6
#Fully connected layer 4
model.add(Dense(units=64, activation='relu', kernel_initializer='uniform',
                kernel_regularizer=regularizers.l2(0.01)))
model.add(Dropout(rate=0.2))

#Layer 7
#Output Layer
model.add(Dense(units=2, activation='softmax'))

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

from keras.preprocessing.image import ImageDataGenerator

#Image augmentation
train_datagen = ImageDataGenerator(
    rescale=1./255,    # rescale pixel values to [0, 1]
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

#Just feature scaling, no augmentation
test_datagen = ImageDataGenerator(rescale=1./255)

training_set = train_datagen.flow_from_directory(
    '/Dataset/Skin_cancer/training',
    target_size=(32, 32),
    batch_size=32,
    class_mode='categorical')

test_set = test_datagen.flow_from_directory(
    '/Dataset/Skin_cancer/testing',
    target_size=(32, 32),
    batch_size=32,
    class_mode='categorical')

model.fit_generator(
    training_set,
    steps_per_epoch=50,      # number of batches drawn per epoch
    epochs=25,
    validation_data=test_set,
    validation_steps=10)     # number of validation batches per epoch

Epoch 1/25
50/50 [==============================] - 52s 1s/step - loss: 0.8568 - accuracy: 0.4963 - val_loss: 0.7004 - val_accuracy: 0.5000
Epoch 2/25
50/50 [==============================] - 50s 1s/step - loss: 0.6940 - accuracy: 0.5000 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 3/25
50/50 [==============================] - 48s 967ms/step - loss: 0.6932 - accuracy: 0.5065 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 4/25
50/50 [==============================] - 50s 1s/step - loss: 0.6932 - accuracy: 0.4824 - val_loss: 0.6933 - val_accuracy: 0.5000
Epoch 5/25
50/50 [==============================] - 49s 974ms/step - loss: 0.6932 - accuracy: 0.4949 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 6/25
50/50 [==============================] - 51s 1s/step - loss: 0.6932 - accuracy: 0.4854 - val_loss: 0.6931 - val_accuracy: 0.5000
Epoch 7/25
50/50 [==============================] - 49s 976ms/step - loss: 0.6931 - accuracy: 0.5015 - val_loss: 0.6918 - val_accuracy: 0.5000
Epoch 8/25
50/50 [==============================] - 51s 1s/step - loss: 0.6932 - accuracy: 0.4986 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 9/25
50/50 [==============================] - 49s 973ms/step - loss: 0.6932 - accuracy: 0.5000 - val_loss: 0.6929 - val_accuracy: 0.5000
Epoch 10/25
50/50 [==============================] - 50s 1s/step - loss: 0.6931 - accuracy: 0.5044 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 11/25
50/50 [==============================] - 49s 976ms/step - loss: 0.6931 - accuracy: 0.5022 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 12/25

[Plot: accuracy on each iteration]

[Plot: loss on each iteration]

Best Answer

Most importantly, you are using loss = 'categorical_crossentropy'. Change it to loss = 'binary_crossentropy', since you have only 2 classes, and also change class_mode='categorical' to class_mode='binary' in flow_from_directory.

As @desertnaut rightly pointed out, categorical_crossentropy goes hand in hand with a softmax activation in the last layer; if you change the loss to binary_crossentropy, the final activation should also be changed to sigmoid.
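For illustration, here is a minimal sketch of just the affected lines, reusing the names from the question (only the output layer, the loss, and class_mode change):

    # Output layer: a single sigmoid unit instead of Dense(units=2, activation='softmax')
    model.add(Dense(units=1, activation='sigmoid'))
    model.compile(optimizer='adam',
                  loss='binary_crossentropy',
                  metrics=['accuracy'])

    # ...and in both flow_from_directory calls:
    training_set = train_datagen.flow_from_directory(
        '/Dataset/Skin_cancer/training',
        target_size=(32, 32),
        batch_size=32,
        class_mode='binary')   # labels arrive as 0/1 scalars rather than one-hot vectors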

Other improvements:

  1. Your data is very limited (160 images in total), and you are holding out 50 of those images for validation.
  2. You have only two Conv2D layers but four Dense layers in an image classification model. Dense layers add a huge number of weights to be learnt. Add more Conv2D layers and reduce the Dense layers.
  3. Set batch_size = 1 and remove steps_per_epoch. Since you have very few inputs, let the number of steps per epoch equal the number of input records.
  4. Use the default glorot_uniform kernel initializer.
  5. To tune the model further, build it with several Conv2D layers followed by a GlobalAveragePooling2D layer, fully connected layers, and a final softmax layer (see the sketch after this list).
  6. Use data augmentation techniques such as horizontal_flip, vertical_flip, shear_range, and zoom_range of ImageDataGenerator to increase the number of training and validation images.
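As a rough illustration of points 2-5 combined, the revised model might look like the sketch below; the filter counts and layer sizes are illustrative assumptions, not values prescribed by the answer:

    from keras.models import Sequential
    from keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D, Dense

    model = Sequential()
    # Stack several Conv2D layers (default glorot_uniform initializer) so the
    # network learns spatial features before any fully connected layer.
    model.add(Conv2D(32, kernel_size=3, activation='relu', input_shape=(32, 32, 3)))
    model.add(MaxPooling2D(pool_size=2))
    model.add(Conv2D(64, kernel_size=3, activation='relu'))
    model.add(MaxPooling2D(pool_size=2))
    model.add(Conv2D(128, kernel_size=3, activation='relu'))
    # GlobalAveragePooling2D replaces Flatten + several large Dense layers,
    # cutting the number of weights to be learnt dramatically.
    model.add(GlobalAveragePooling2D())
    model.add(Dense(64, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))  # binary output, per the loss fix above
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

    # Train with batch_size=1 in the generators and without a hard-coded
    # steps_per_epoch, so each epoch walks the whole (small) dataset once.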

Moving the comments to the answer section, as suggested by @desertnaut -

Question - Thanks! Yes, less data is the problem, I figured. One additional question - why does adding more dense layers than conv layers negatively affect the model? Is there any rule to follow when deciding how many conv and dense layers to use? – Arun_Ramji_Shanmugam 2 days ago

Answer - To answer the first part of your question: a Conv2D layer maintains the spatial information of the image, and the weights to be learnt depend on the kernel size and stride specified in the layer, whereas a Dense layer needs the output of Conv2D to be flattened before it can be used, hence losing the spatial information. A dense layer also adds many more weights: for example, two dense layers of 512 units add 512*512 = 262,144 params or weights to the model, all of which have to be learnt. That means you have to train for more epochs, with good hyperparameter settings, to learn all those weights. – Tensorflow Warriors 2 days ago
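A quick, hypothetical way to verify that weight count (not part of the original exchange) is to count parameters directly in Keras:

    from keras.models import Sequential
    from keras.layers import Dense

    demo = Sequential()
    demo.add(Dense(512, activation='relu', input_shape=(512,)))
    demo.add(Dense(512, activation='relu'))
    demo.summary()
    # Each Dense(512) fed by 512 inputs holds 512*512 = 262,144 weights
    # plus 512 biases, i.e. 262,656 trainable parameters per layer.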

Answer - To answer the second part of your question: use systematic experiments to discover what works best for your specific dataset. It also depends on the processing power you have. Remember, deeper networks are generally better, at the cost of more data and increased learning complexity. A conventional approach is to look for similar problems and deep learning architectures that have already been shown to work. We also have the flexibility to use pretrained models like ResNet, VGG, etc., by freezing part of their layers and training the remaining ones. – Tensorflow Warriors 2 days ago

Question - Thank you for the detailed answer!! If you don't mind one more question - when we use an already-trained model (maybe just some of its layers), doesn't it need to have been trained on the same kind of input data as the data we are going to work with? – Arun_Ramji_Shanmugam yesterday

Answer - The intuition behind transfer learning for image classification is that if a model is trained on a large and general enough dataset, it will effectively serve as a generic model of the visual world. You can find a transfer learning example with an explanation here - tensorflow.org/tutorials/images/transfer_learning . – Tensorflow Warriors yesterday
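To make the freezing idea concrete, here is a minimal transfer-learning sketch; the choice of VGG16 and the input size are assumptions for illustration, and any keras.applications model can be swapped in:

    from keras.applications import VGG16
    from keras.models import Sequential
    from keras.layers import GlobalAveragePooling2D, Dense

    # Convolutional base pretrained on ImageNet, without its classifier head.
    base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
    base.trainable = False  # freeze the pretrained layers; only the new head trains

    model = Sequential([
        base,
        GlobalAveragePooling2D(),
        Dense(64, activation='relu'),
        Dense(1, activation='sigmoid'),  # two-class problem
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])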

Regarding "python - Model accuracy and loss not improving in CNN", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/61498304/
