
python - Cannot identify image file when trying to use tensorflow on google colab

Reposted. Author: 行者123. Updated: 2023-12-04 17:25:50

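I am using Google Colaboratory and TensorFlow to train a neural network that classifies images of dogs and cats. I train on my data with model.fit_generator. The data loads fine, but after it has read images for a few epochs and starts iterating over the validation steps, I get the error described in the title: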

PIL.UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7f347160a0f8>

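The cat and dog images I am using were downloaded from Kaggle.

I have seen some solutions for Jupyter Notebook that use PIL on a single image, but given that Google Colab uses PIL implicitly, how would I handle this error for every image on Google Colab?

Here is my code: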

from google.colab import files
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras import backend as K
import numpy as np
from keras.preprocessing import image

from google.colab import drive
drive.mount('/content/drive')

img_width, img_height = 150, 150

train_data_dir = '/content/drive/My Drive/data/train'
validation_data_dir = '/content/drive/My Drive/data/validation'
nb_train_samples = 1000
nb_validation_Samples = 100
epochs = 10
batch_size = 20


if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)

train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True
)

test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary')

model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.summary()


model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))


model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))


model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))


model.summary()

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

# Build the validation generator before it is passed to fit_generator.
validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode="binary")

model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_Samples // batch_size)

The error itself occurs at this point:

model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_Samples // batch_size)

Specifically, on this line:

validation_steps = nb_validation_Samples // batch_size)

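Best answer

If you downloaded the dataset from Microsoft, you can use the script below to clean it. As the comment at the top indicates, the script is mostly adapted from another SO thread.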

#!/usr/bin/env python
# https://stackoverflow.com/questions/63754311/unidentifiedimageerror-cannot-identify-image-file
# 1st in the answers

import os
from PIL import Image

folder_path = r'raw\PetImages'
extensions = []
for fldr in os.listdir(folder_path):
    sub_folder_path = os.path.join(folder_path, fldr)
    for filee in os.listdir(sub_folder_path):
        file_path = os.path.join(sub_folder_path, filee)
        print('** Path: {} **'.format(file_path), end="\r", flush=True)
        try:
            # Opening and converting forces PIL to decode the whole file.
            with Image.open(file_path) as im:
                im.convert('RGB')
            # Record the extension of every file that decoded cleanly.
            ext = filee.rsplit('.', 1)[-1]
            if ext not in extensions:
                extensions.append(ext)
        except Exception:
            print("\nWrong format file: ", file_path, flush=True)

print("\nValid extensions: ", repr(extensions))

'''
** Path: raw\PetImages\Cat\666.jpg **
Wrong format file: raw\PetImages\Cat\666.jpg
** Path: raw\PetImages\Cat\Thumbs.db **
Wrong format file: raw\PetImages\Cat\Thumbs.db
** Path: raw\PetImages\Dog\11702.jpg **
Wrong format file: raw\PetImages\Dog\11702.jpg
** Path: raw\PetImages\Dog\9057.jpg **D:\penv38\lib\site-packages\PIL\TiffImagePlugin.py:811: UserWarning: Truncated File Read
warnings.warn(str(msg))
** Path: raw\PetImages\Dog\Thumbs.db **
Wrong format file: raw\PetImages\Dog\Thumbs.db

Valid extensions: ['jpg']

Thus exclude these files:
Cat\666.jpg
Dog\11702.jpg
Dog\9057.jpg
'''
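
The script above only reports the unreadable files. As a follow-up, here is a minimal sketch of one way to move such files out of a data directory so that flow_from_directory never sees them. The names pil_can_read and quarantine_unreadable are hypothetical helpers written for this illustration, not part of the original answer or of any library. It assumes the quarantine directory lies outside the data directory.

```python
import os
import shutil


def pil_can_read(path):
    """Return True if Pillow can open and fully decode the file at `path`."""
    # Imported lazily so the directory-walking logic below can also be
    # used with a different checker.
    from PIL import Image
    try:
        with Image.open(path) as im:
            im.convert('RGB')  # forces a full decode, as in the script above
        return True
    except Exception:
        return False


def quarantine_unreadable(data_dir, quarantine_dir, is_readable=pil_can_read):
    """Move every file under `data_dir` that fails `is_readable` into
    `quarantine_dir`, keeping its class sub-folder.

    Returns the sorted list of moved paths, relative to `data_dir`.
    Assumption: `quarantine_dir` is not located inside `data_dir`.
    """
    moved = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            src = os.path.join(root, name)
            if is_readable(src):
                continue
            rel = os.path.relpath(src, data_dir)
            dst = os.path.join(quarantine_dir, rel)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.move(src, dst)
            moved.append(rel)
    return sorted(moved)
```

On Colab this could be called once per data directory before building the generators, for example with the validation_data_dir from the question as data_dir and any folder outside it as quarantine_dir. Moving the files instead of deleting them keeps the cleanup reversible.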

Regarding "python - Cannot identify image file when trying to use tensorflow on google colab", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/63425209/
