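How to prepare data for deep learning in Python

Question

I have completed the deep learning course on Kaggle Learn and have started writing a model for the MNIST Digit dataset. I like to understand the code I am learning from, and I came across this: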

def data_prep(raw):
    out_y = keras.utils.to_categorical(raw.label,num_classes)

    num_images = raw.shape[0]
    x_as_array = raw.values[:,1:]
    x_shaped_array = x_as_array.reshape(num_images,img_rows,img_cols,1)
    out_x = x_shaped_array / 255
    return out_x,out_y

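This part really confuses me; I don't understand most of it. Could someone explain step by step what each line of code does? And how would this work if I wanted to do the same on color images with multiple channels? I know this is a bit broad. Later on I will be working with color images, but I'm not sure how to go about it, because I can see the black-and-white "parameters" here (the 1 in the array reshape, and the division by 255).

Side note: raw is a pandas DataFrame.

Answer

Here is the function with a comment above each line explaining its purpose: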

# Assumed setup: the original snippet does not show these definitions.
# The values below are the usual ones for MNIST (28x28 images, digits 0-9).
from tensorflow import keras

img_rows, img_cols = 28, 28
num_classes = 10

# input is a 2D dataframe: one row per image, a "label" column followed by the pixel columns
def data_prep(raw):
    # convert the class labels in raw to a binary matrix
    # (one-hot encoding, the usual target format for classification)
    out_y = keras.utils.to_categorical(raw.label, num_classes)

    # first dimension of raw is the number of images; each row in the df represents an image
    num_images = raw.shape[0]

    # drop the first column, which holds the label rather than a pixel,
    # and convert the remaining pixel columns into a NumPy array
    # (models usually expect arrays, not a pandas dataframe)
    x_as_array = raw.values[:, 1:]

    # reshape the flat pixel rows into a 4-dimensional array
    # 1st dim: number of images
    # 2nd dim: height of each image (i.e. rows when represented as an array)
    # 3rd dim: width of each image (i.e. columns when represented as an array)
    # 4th dim: number of color channels, 3 (RGB) for color images and 1 for gray-scale images
    x_shaped_array = x_as_array.reshape(num_images, img_rows, img_cols, 1)

    # scale the pixel values from the 0-255 range down to 0-1
    out_x = x_shaped_array / 255

    return out_x, out_y
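
To see what each step produces, it can help to run the same operations on a tiny made-up dataset. The sketch below is only an illustration: the 2x2 "images" and their values are hypothetical, and the one-hot step uses plain NumPy indexing as a stand-in for `keras.utils.to_categorical` so that it runs without Keras installed.

```python
import numpy as np

# Hypothetical toy data laid out like the Kaggle CSV:
# first column = label, remaining columns = flattened pixel values.
raw_values = np.array([
    [1,   0, 255, 128,  64],
    [0, 255,   0,  32,  16],
])
toy_classes, toy_rows, toy_cols = 2, 2, 2

# One-hot encode the labels (NumPy stand-in for keras.utils.to_categorical).
toy_labels = raw_values[:, 0]
toy_y = np.eye(toy_classes)[toy_labels]

# Drop the label column, reshape to (images, height, width, channels), scale to 0-1.
toy_pixels = raw_values[:, 1:]
toy_x = toy_pixels.reshape(-1, toy_rows, toy_cols, 1) / 255

print(toy_y)        # one row per image, a single 1 in the column of its class
print(toy_x.shape)  # (2, 2, 2, 1)
```

Label 1 becomes `[0, 1]` and label 0 becomes `[1, 0]`, while each row of four pixel values becomes a 2x2 image with a single channel.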

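To handle color images, the 4th dimension of the array should be 3, representing the RGB values. The division by 255 stays the same for standard 8-bit images, since each channel value also ranges from 0 to 255. See this tutorial for more details on CNNs and their inputs.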
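
A color version of the function could look roughly like the sketch below. It rests on assumptions that are not in the original post: the name `data_prep_color` is hypothetical, the input is a plain NumPy array whose first column is the label, and the remaining columns hold the flattened pixels with the three channel values of each pixel stored next to each other (channels-last). If your data is stored in a different order, the reshape would need to change. As above, the one-hot step uses NumPy instead of Keras.

```python
import numpy as np

def data_prep_color(values, num_classes, img_rows, img_cols):
    """values: 2D array, column 0 = label, other columns = flattened RGB pixels."""
    labels = values[:, 0].astype(int)
    # One-hot encode the labels (NumPy stand-in for keras.utils.to_categorical).
    out_y = np.eye(num_classes)[labels]

    num_images = values.shape[0]
    # The last dimension is 3 (R, G, B) instead of 1; assumes channels-last storage.
    x_shaped = values[:, 1:].reshape(num_images, img_rows, img_cols, 3)

    # Each channel of an 8-bit image ranges from 0 to 255, so the scaling is unchanged.
    out_x = x_shaped / 255
    return out_x, out_y

# Usage with random stand-in data: 4 images of 8x8 pixels, 3 classes.
rng = np.random.default_rng(0)
n, rows, cols, classes = 4, 8, 8, 3
fake = np.hstack([
    rng.integers(0, classes, size=(n, 1)),            # label column
    rng.integers(0, 256, size=(n, rows * cols * 3)),  # pixel columns
])
color_x, color_y = data_prep_color(fake, classes, rows, cols)
print(color_x.shape, color_y.shape)  # (4, 8, 8, 3) (4, 3)
```

In practice, color images are more often loaded from image files than from a CSV, in which case they typically already arrive as arrays of shape (height, width, 3) and only the stacking and scaling steps are needed.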