Applying different data augmentation to part of the training set based on class

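Problem description

I am working on a machine learning pipeline to classify images. My problem is that my dataset is imbalanced: of my 5 image classes, one has about 400 images, while each of the others has about 20.

I would like to balance the training set by applying data augmentation only to certain classes of the training set.

This is the code I use to create the training and validation sets: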

import pathlib
import tensorflow as tf

# Import data
data_dir = pathlib.Path(r"C:\Train set")

# Define train and validation sets (80% - 20%)
batch_size = 32
img_height = 240
img_width = 240

train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir, validation_split=0.2, subset="training", seed=123,
    image_size=(img_height, img_width), batch_size=batch_size)

# Same validation_split, seed and image_size as above, so the two subsets do not overlap
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir, validation_split=0.2, subset="validation", seed=123,
    image_size=(img_height, img_width), batch_size=batch_size)

This is how I apply data augmentation, although this applies it to the whole training set:

# Apply data augmentation
from tensorflow import keras
from tensorflow.keras import layers

data_augmentation = keras.Sequential([
    layers.experimental.preprocessing.RandomFlip("horizontal", input_shape=(img_height, img_width, 3)),
    layers.experimental.preprocessing.RandomRotation(0.1),
    layers.experimental.preprocessing.RandomZoom(0.1),
])

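Is there a way to go into my training set, pull out the classes that have fewer images, and apply data augmentation only to them?

Thanks!

Solution

I would suggest not using ImageDataGenerator, and using a custom tf.data.Dataset instead. In the map operation, you can treat the classes differently, for example: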

def preprocess(filepath):
    # The class name is the first path component, e.g. 'tf_astronauts'
    category = tf.strings.split(filepath, os.sep)[0]
    read_file = tf.io.read_file(filepath)
    decode = tf.image.decode_jpeg(read_file, channels=3)
    resize = tf.image.resize(decode, (200, 200))
    image = tf.expand_dims(resize, 0)
    # Only images from the astronaut class are flipped
    if tf.equal(category, 'tf_astronauts'):
        image = tf.image.flip_up_down(image)
        image = tf.image.flip_left_right(image)
    # image = tf.image.convert_image_dtype(image, tf.float32)
    # category = tf.cast(tf.equal(category, 'tf_astronauts'), tf.int32)
    return image, category

Let me demonstrate. First, let's create a folder of training images for you:

import tensorflow as tf
import matplotlib.pyplot as plt
import cv2
from skimage import data
from glob2 import glob
import os

cat = data.chelsea()
astronaut = data.astronaut()

for category,picture in zip(['tf_cats','tf_astronauts'],[cat,astronaut]):
    os.makedirs(category,exist_ok=True)
    for i in range(5):
        cv2.imwrite(os.path.join(category,category + f'_{i}.jpg'),cv2.cvtColor(picture,cv2.COLOR_RGB2BGR))

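# os.path.join keeps the pattern portable; on Windows it still yields 'tf_*\\*.jpg'
files = glob(os.path.join('tf_*', '*.jpg'))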

You now have the following files:

['tf_astronauts\\tf_astronauts_0.jpg','tf_astronauts\\tf_astronauts_1.jpg','tf_astronauts\\tf_astronauts_2.jpg','tf_astronauts\\tf_astronauts_3.jpg','tf_astronauts\\tf_astronauts_4.jpg','tf_cats\\tf_cats_0.jpg','tf_cats\\tf_cats_1.jpg','tf_cats\\tf_cats_2.jpg','tf_cats\\tf_cats_3.jpg','tf_cats\\tf_cats_4.jpg']

Let's apply the transformations to the astronaut class only, using tf.image transformations.

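def preprocess(filepath):
    # The class name is the first path component, e.g. 'tf_astronauts'
    category = tf.strings.split(filepath, os.sep)[0]
    read_file = tf.io.read_file(filepath)
    decode = tf.image.decode_jpeg(read_file, channels=3)
    resize = tf.image.resize(decode, (200, 200))
    image = tf.expand_dims(resize, 0)
    # Only images from the astronaut class are flipped
    if tf.equal(category, 'tf_astronauts'):
        image = tf.image.flip_up_down(image)
        image = tf.image.flip_left_right(image)
    # image = tf.image.convert_image_dtype(image, tf.float32)
    # category = tf.cast(tf.equal(category, 'tf_astronauts'), tf.int32)
    return image, category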

Then we build the tf.data.Dataset:

train = tf.data.Dataset.from_tensor_slices(files).\
    shuffle(10).take(4).map(preprocess).batch(4)

When you iterate over the train dataset, you will see that only the astronauts have been flipped:

fig = plt.figure()
plt.subplots_adjust(wspace=.1,hspace=.2)
images,labels = next(iter(train))
for index,(image,label) in enumerate(zip(images,labels)):
    ax = plt.subplot(2,2,index + 1)
    ax.set_xticks([])
    ax.set_yticks([])
    ax.set_title(label.numpy().decode())
    ax.imshow(image[0].numpy().astype(int))
plt.show()
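
The flips above change how the astronaut images look, not how many of them there are, so on their own they do not even out the class counts described in the question (about 400 images against about 20). One possible way to do that, sketched here in plain Python, is to repeat the minority-class paths before the dataset is built. The helper names category_of and oversample are hypothetical and not part of the original answer, and the sketch assumes that the parent folder of each file is its class:

```python
from collections import Counter


def category_of(path):
    """Return the parent folder name, assumed here to be the class label."""
    # Accept both "/" and "\\" so the Windows paths from the demo also work.
    return path.replace("\\", "/").split("/")[-2]


def oversample(files):
    """Repeat each path so every class ends up near the size of the largest one."""
    counts = Counter(category_of(f) for f in files)
    target = max(counts.values())
    balanced = []
    for f in files:
        # Whole-number repeat factor, e.g. 400 // 20 == 20 copies per minority image.
        balanced.extend([f] * (target // counts[category_of(f)]))
    return balanced


# Toy stand-in for the question's imbalance: one large class and one small class.
demo_files = [f"tf_cats/{i}.jpg" for i in range(400)]
demo_files += [f"tf_astronauts\\{i}.jpg" for i in range(20)]
balanced = oversample(demo_files)
print(Counter(category_of(f) for f in balanced))
```

The balanced list could then be passed to tf.data.Dataset.from_tensor_slices in place of files. For the repeated paths to produce different images, the flips in preprocess would have to be random, for example tf.image.random_flip_left_right, rather than the fixed flips used in the demo.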

Note that for training you will need to uncomment the two lines in preprocess, so that it returns an array of floats and an integer label.
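
As a rough illustration of what a training-ready variant might look like, here is a minimal sketch. It is not the original author's code: the names preprocess_for_training and MINORITY_CLASS are hypothetical, and reading the class from the parent folder (rather than the first path component), scaling pixels by 255, and using a random flip are all assumptions. It also leaves out the tf.expand_dims call, because batch already adds the leading dimension.

```python
import tensorflow as tf

MINORITY_CLASS = "tf_astronauts"  # assumption: the class to augment, as in the demo


def preprocess_for_training(filepath):
    # Normalise Windows separators, then take the parent folder as the class name.
    parts = tf.strings.split(tf.strings.regex_replace(filepath, r"\\", "/"), "/")
    category = parts[-2]
    image = tf.image.decode_jpeg(tf.io.read_file(filepath), channels=3)
    image = tf.image.resize(image, (200, 200)) / 255.0  # float32, roughly in [0, 1]
    is_minority = tf.equal(category, MINORITY_CLASS)
    # Augment only the minority class; other images pass through unchanged.
    image = tf.cond(
        is_minority,
        lambda: tf.image.random_flip_left_right(image),
        lambda: image,
    )
    label = tf.cast(is_minority, tf.int32)  # 1 for the minority class, 0 otherwise
    return image, label
```

Mapped over a dataset of file paths and batched, this would give image batches of shape (batch, 200, 200, 3) with integer labels. With more than two classes, the label would need a lookup from class name to index instead of the single comparison used here.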