Why is my model overfitting in the second epoch?

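Problem description

I am a beginner in deep learning, and I am trying to train a deep learning model to classify different ASL hand signs using Mobilenet_v2 and Inception.

Here is my code that sets up an ImageDataGenerator to create the training and validation sets.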

# Reformat images and create batches
import tensorflow as tf

IMAGE_RES = 224
BATCH_SIZE = 32

# base_dir: path to the dataset root, with one subfolder per class

# Rescale pixels to [0, 1] and hold out 40% of the images for validation
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    validation_split=0.4
)

train_generator = datagen.flow_from_directory(
    base_dir,
    target_size=(IMAGE_RES, IMAGE_RES),
    batch_size=BATCH_SIZE,
    subset='training'
)

val_generator = datagen.flow_from_directory(
    base_dir,
    target_size=(IMAGE_RES, IMAGE_RES),
    batch_size=BATCH_SIZE,
    subset='validation'
)

Here is the code that trains the model:

# Do transfer learning with TensorFlow Hub
import tensorflow_hub as hub
from tensorflow.keras import layers

URL = "https://tfhub.dev/google/tf2-preview/mobilenet_v2/feature_vector/4"
feature_extractor = hub.KerasLayer(URL, input_shape=(IMAGE_RES, IMAGE_RES, 3))
# Freeze the pre-trained model
feature_extractor.trainable = False

# Attach a classification head
model = tf.keras.Sequential([
    feature_extractor,
    layers.Dense(5, activation='softmax')
])

model.summary()

# Train the model
model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy'])

EPOCHS = 5

history = model.fit(
    train_generator,
    steps_per_epoch=len(train_generator),
    epochs=EPOCHS,
    validation_data=val_generator,
    validation_steps=len(val_generator)
)

Epoch 1/5
94/94 [==============================] - 19s 199ms/step - loss: 0.7333 - accuracy: 0.7730 - val_loss: 0.6276 - val_accuracy: 0.7705
Epoch 2/5
94/94 [==============================] - 18s 190ms/step - loss: 0.1574 - accuracy: 0.9893 - val_loss: 0.5118 - val_accuracy: 0.8145
Epoch 3/5
94/94 [==============================] - 18s 191ms/step - loss: 0.0783 - accuracy: 0.9980 - val_loss: 0.4850 - val_accuracy: 0.8235
Epoch 4/5
94/94 [==============================] - 18s 196ms/step - loss: 0.0492 - accuracy: 0.9997 - val_loss: 0.4541 - val_accuracy: 0.8395
Epoch 5/5
94/94 [==============================] - 18s 193ms/step - loss: 0.0349 - accuracy: 0.9997 - val_loss: 0.4590 - val_accuracy: 0.8365

I tried using data augmentation, but the model still overfits, so I would like to know whether I did something wrong in my code.

Solution

Your dataset is very small. Try splitting it with a random seed and check whether the problem persists.
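A seeded split could be sketched roughly as follows. This is only an illustration, assuming you have already collected the image paths for each class; the names `seeded_split` and `paths_by_class` are hypothetical and not part of Keras. Each class is shuffled with a fixed seed before the validation share is taken, so the split is random but reproducible.

```python
import random

def seeded_split(paths_by_class, val_fraction=0.4, seed=42):
    """Shuffle each class with a fixed seed, then split it into train and validation."""
    rng = random.Random(seed)  # local generator, so the global random state is untouched
    train, val = [], []
    for label in sorted(paths_by_class):       # fixed class order keeps the result reproducible
        paths = sorted(paths_by_class[label])  # fixed starting order before shuffling
        rng.shuffle(paths)
        n_val = int(len(paths) * val_fraction)
        val.extend((path, label) for path in paths[:n_val])
        train.extend((path, label) for path in paths[n_val:])
    return train, val

# Toy example with made-up file names
example = {
    "A": [f"A/{i}.jpg" for i in range(10)],
    "B": [f"B/{i}.jpg" for i in range(5)],
}
train_items, val_items = seeded_split(example, val_fraction=0.4, seed=0)
print(len(train_items), "training images,", len(val_items), "validation images")
```

Rerunning with a different seed and comparing the validation accuracy gives a rough idea of how much the gap depends on which images happen to land in the validation set.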

If it does, use regularization and reduce the complexity of the neural network.
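Since the classification head in the question is already a single Dense layer on top of a frozen feature extractor, regularization is probably the easier lever. Below is a minimal sketch of a head with dropout and an L2 weight penalty. To stay self-contained it takes feature vectors as input instead of downloading the hub module; `FEATURE_DIM = 1280`, the dropout rate, and the L2 strength are assumptions to tune, not values from the question.

```python
import tensorflow as tf

NUM_CLASSES = 5     # matches the Dense(5) head in the question
FEATURE_DIM = 1280  # assumption: size of the MobileNetV2 feature vector

def build_regularized_head(dropout_rate=0.5, l2_strength=1e-4):
    """Classification head with dropout and an L2 penalty on the Dense weights."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(FEATURE_DIM,)),
        # Randomly zero a fraction of the features during training only
        tf.keras.layers.Dropout(dropout_rate),
        tf.keras.layers.Dense(
            NUM_CLASSES,
            activation="softmax",
            # Adds l2_strength * sum(w ** 2) to the training loss
            kernel_regularizer=tf.keras.regularizers.l2(l2_strength),
        ),
    ])

head = build_regularized_head()
head.summary()
```

In the model from the question, the same Dropout layer and `kernel_regularizer` argument would go between `feature_extractor` and the final `layers.Dense(5, activation='softmax')`.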

Also try a different optimizer and a smaller learning rate (try a learning rate scheduler).
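As a rough illustration of a smaller learning rate with a schedule, here is a step-decay function in plain Python. The starting rate, drop factor, and drop interval are assumed values to experiment with, and `step_decay` is a hypothetical name. Its `(epoch, lr)` signature is chosen so that it can be passed to `tf.keras.callbacks.LearningRateScheduler`.

```python
INITIAL_LR = 1e-4    # assumption: 10x smaller than the Keras Adam default of 1e-3
DROP_FACTOR = 0.5    # assumption: halve the rate at each drop
EPOCHS_PER_DROP = 2  # assumption: drop every 2 epochs

def step_decay(epoch, lr=None):
    """Learning rate for a 0-indexed epoch; the current rate `lr` is ignored."""
    return INITIAL_LR * DROP_FACTOR ** (epoch // EPOCHS_PER_DROP)

for epoch in range(5):
    print(f"epoch {epoch + 1}: lr = {step_decay(epoch):.2e}")

# Hooking it into the training code from the question would look roughly like:
# model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=INITIAL_LR), ...)
# model.fit(..., callbacks=[tf.keras.callbacks.LearningRateScheduler(step_decay)])
```

With these values the rate stays at 1e-4 for the first two epochs, then halves every two epochs after that.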

deep-learning image-classification python tensorflow transfer-learning