Effect of image width and height on the accuracy of a transfer learning model

Problem description

I have nearly 1000 images of 1280x720 pixels, in 4 classes, showing people performing certain gestures. The idea is to use transfer learning.

Below is the code using Inception with a target image size of 640,360:

from keras.applications.inception_v3 import InceptionV3,preprocess_input
from keras.models import Model
from keras.layers import Dense,GlobalAveragePooling2D
from keras.preprocessing.image import ImageDataGenerator
from keras.optimizers import SGD
import os
path = 'E:/build/set_1/training'
# Get count of number of files in this folder and all subfolders
def get_num_files(path):
  if not os.path.exists(path):
    return 0
  return sum([len(files) for r,d,files in os.walk(path)])

# Get count of number of subfolders directly below the folder in path
def get_num_subfolders(path):
  if not os.path.exists(path):
    return 0
  return sum([len(d) for r,d,files in os.walk(path)])  # os.walk yields (root, dirs, files)
print(get_num_files(path))
print(get_num_subfolders(path))
def create_img_generator():
  return ImageDataGenerator(
      preprocessing_function=preprocess_input,
      rotation_range=30,width_shift_range=0.2,height_shift_range=0.2,
      shear_range=0.2,zoom_range=0.2,horizontal_flip=True
  )
Image_width,Image_height = 640,360
Training_Epochs = 7
Batch_Size = 32
Number_FC_Neurons = 1024

train_dir = 'Desktop/Dataset/training'
validate_dir = 'Desktop/Dataset/validation'
num_train_samples = get_num_files(train_dir) 
num_classes = get_num_subfolders(train_dir)
num_validate_samples = get_num_files(validate_dir)
num_epoch = Training_Epochs
batch_size = Batch_Size
train_image_gen = create_img_generator()
test_image_gen = create_img_generator()

#   Connect the image generator to the folder containing the source images that the generator alters.
#   Training image generator
train_generator = train_image_gen.flow_from_directory(
  train_dir,
  target_size=(Image_height,Image_width),   # Keras expects (height, width)
  batch_size=batch_size,seed = 42    #set seed for reproducibility
)
validation_generator = test_image_gen.flow_from_directory(
  validate_dir,
  target_size=(Image_height,Image_width),   # validate at the training resolution
  batch_size=batch_size,seed=42       #set seed for reproducibility
)
InceptionV3_base_model = InceptionV3(weights='imagenet',include_top=False) #include_top=False excludes final FC layer
print('Inception v3 base model without last FC loaded')
#print(InceptionV3_base_model.summary())     # display the Inception V3 model hierarchy

# Define the layers in the new classification prediction 
x = InceptionV3_base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(Number_FC_Neurons,activation='relu')(x)        # new FC layer,random init
predictions = Dense(num_classes,activation='softmax')(x)  # new softmax layer

# Define trainable model which links input from the Inception V3 base model to the new classification prediction layers
model = Model(inputs=InceptionV3_base_model.input,outputs=predictions)

# print model structure diagram
model.summary()   # summary() prints the table itself and returns None
print ('\nPerforming Transfer Learning')
#   Freeze all layers in the Inception V3 base model
for layer in InceptionV3_base_model.layers:
  layer.trainable = False
#   Define model compile for basic Transfer Learning
model.compile(optimizer='adam',loss='categorical_crossentropy',metrics=['accuracy'])

# Fit the transfer learning model to the data from the generators.  
# By using generators we can keep requesting sample images, and the generators will pull images from
# the training or validation folders and alter them slightly
history_transfer_learning = model.fit_generator(
  train_generator,epochs=num_epoch,steps_per_epoch = num_train_samples // batch_size,validation_data=validation_generator,validation_steps = num_validate_samples // batch_size)

# Save transfer learning model
model.save('inceptionv3-original-image-transfer-learning.model')

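The accuracy after 7 epochs is 84%.

With a target image size of 200,113, the accuracy after 7 epochs is 86%.

How does image size affect accuracy, and what image size should I use to make this model more accurate?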

Solution

ImageNet models, whatever the framework, are trained on fairly small image sizes (224x224 to 299x299).

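Now, for object detection and image segmentation you can in principle benefit from a higher resolution, because smaller objects are detected better. There are also specific architectures that tackle this through smarter feature reuse, but that is beside the point of this question.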

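It may well be that, because your network was pretrained on smaller images and you have a classification problem, increasing the resolution does not actually improve the results. In fact, for the gesture problem, the network may have a "harder" time learning the gestures because of the larger feature set and complexity that come with the higher resolution.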
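To make the larger feature set concrete, here is a rough sketch of how the side length of the last InceptionV3 feature map grows with the input side. It assumes the layer layout of the Keras InceptionV3 implementation (unpadded stride-2 convolutions and poolings in the stem and in the two reduction blocks). `valid_out` and `feature_map_side` are hypothetical helpers written for this illustration, not part of Keras.

```python
def valid_out(n, kernel, stride):
    """Output length of an unpadded ('valid') convolution or pooling."""
    return (n - kernel) // stride + 1

def feature_map_side(n):
    """Approximate side length of the last InceptionV3 feature map for an
    input side of n pixels. Assumes the Keras layer layout; only layers
    that change the spatial size are listed."""
    # (kernel, stride) of every size-changing layer, in order
    shrinking_layers = [
        (3, 2),  # stem convolution, stride 2
        (3, 1),  # stem convolution, unpadded
        (3, 2),  # stem max pooling
        (3, 1),  # stem convolution, unpadded
        (3, 2),  # stem max pooling
        (3, 2),  # first reduction block
        (3, 2),  # second reduction block
    ]
    for kernel, stride in shrinking_layers:
        n = valid_out(n, kernel, stride)
    return n

# Compare the sides used in the question with the 299-pixel pretraining size
for side in (113, 200, 299, 360, 640):
    print(side, "->", feature_map_side(side))
```

Under these assumptions a 299-pixel side maps to 8 cells and a 640-pixel side to 18, so the pooled features at the higher resolution are averaged over many more positions than the pretrained filters were tuned for.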

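If you get better results at a smaller resolution, that is not a problem. Just make sure that when you evaluate the model on the test set or in real life you keep the same image distribution: real-life images need the same statistical distribution as your local train, validation and test images.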
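One crude way to check this is to compare simple per-channel statistics of the training images with those of the images seen in deployment. The sketch below is only an illustration under that assumption: `channel_stats` and `max_mean_gap` are hypothetical helpers, images are plain nested lists shaped [height][width][3], and a small gap in channel means does not prove that two distributions match.

```python
from statistics import mean, pstdev

def channel_stats(images):
    """Per-channel (mean, std) over a list of images.
    Each image is a nested list shaped [height][width][3]."""
    stats = []
    for c in range(3):
        values = [px[c] for img in images for row in img for px in row]
        stats.append((mean(values), pstdev(values)))
    return stats

def max_mean_gap(train_images, deploy_images):
    """Largest absolute difference between per-channel means."""
    train = channel_stats(train_images)
    deploy = channel_stats(deploy_images)
    return max(abs(t[0] - d[0]) for t, d in zip(train, deploy))

# Tiny synthetic example: one bright and one dark 1x2 image
bright = [[[(200, 200, 200), (220, 220, 220)]]]
dark = [[[(40, 50, 60), (60, 70, 80)]]]
print(max_mean_gap(bright, dark))
```

A large gap suggests that the deployment images differ from the training data (lighting, camera, resizing) and that the reported accuracy may not carry over.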

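The truth is that you need to iterate over several resolutions and check which one works best for your case. The only thing to keep in mind is to preserve the aspect ratio, so that you do not introduce artifacts or distortion.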
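As a minimal sketch of such a sweep, the helper below derives candidate sizes that keep the 1280x720 aspect ratio of the source images. `candidate_sizes` and the list of widths are assumptions made for illustration. The pairs are returned as (height, width), which is the order Keras expects for `target_size` in `flow_from_directory`.

```python
def candidate_sizes(src_width, src_height, widths):
    """Return (height, width) pairs that keep the source aspect ratio."""
    sizes = []
    for w in widths:
        # Scale the height by the same factor as the width
        h = round(w * src_height / src_width)
        sizes.append((h, w))
    return sizes

# Hypothetical widths to try for 1280x720 source images
for height, width in candidate_sizes(1280, 720, [160, 320, 640]):
    print(f"target_size=({height}, {width})")
```

Each pair can then be used for one training run, with validation accuracy compared across runs on the same data split.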