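Problem description
I've finished the deep learning course on Kaggle Learn and have started writing a model for the MNIST Digit dataset. I like to understand the code I'm learning from, and I came across this: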
def data_prep(raw):
    out_y = keras.utils.to_categorical(raw.label, num_classes)
    num_images = raw.shape[0]
    x_as_array = raw.values[:, 1:]
    x_shaped_array = x_as_array.reshape(num_images, img_rows, img_cols, 1)
    out_x = x_shaped_array / 255
    return out_x, out_y
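This part really confuses me; I don't understand most of it. Could someone explain step by step what each line does? And how would this work if I wanted to do it on color images? I know this is a bit broad. Later on I want to do something with color images, but I'm not sure how, because I can see the black-and-white "parameters" (the 1 in the array reshape, the division by 255).
Side note: raw is a pandas DataFrame.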
Solution
Here is the function with a comment above each line explaining its purpose:
# input is a 2D DataFrame: one row per image, label first, then the pixel values
def data_prep(raw):
    # convert the class labels in raw to a binary matrix
    # (one-hot encoding, the usual target format for classification)
    out_y = keras.utils.to_categorical(raw.label, num_classes)
    # the first dimension of raw is the number of images; each row in the DataFrame represents one image
    num_images = raw.shape[0]
    # drop the first column of each row (the label) and convert the remaining pixel columns to a NumPy array
    # ML libraries usually expect arrays rather than a pandas DataFrame
    x_as_array = raw.values[:, 1:]
    # reshape the flat pixel rows into a 4-dimensional array
    # 1st dim: number of images
    # 2nd dim: height of each image (i.e. rows when represented as an array)
    # 3rd dim: width of each image (i.e. columns when represented as an array)
    # 4th dim: number of channels, which is 3 (RGB) for color images and 1 for grayscale images
    x_shaped_array = x_as_array.reshape(num_images, img_rows, img_cols, 1)
    # scale the pixel values, which range from 0 to 255, into the range 0-1
    out_x = x_shaped_array / 255
    return out_x, out_y
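To see what each step produces, here is a small sketch that runs the same operations on a made-up DataFrame. The 2 x 2 image size, the 3 classes, and the pixel column names are assumptions chosen to keep the output readable (MNIST uses 28 x 28 images and 10 classes), and np.eye stands in for keras.utils.to_categorical so the snippet runs without Keras installed.

```python
import numpy as np
import pandas as pd

img_rows, img_cols = 2, 2   # hypothetical tiny images (MNIST uses 28 x 28)
num_classes = 3             # hypothetical (MNIST has 10)

# Three made-up grayscale images: a label column followed by 4 pixel columns.
raw = pd.DataFrame(
    [[0, 0, 255, 0, 255],
     [2, 255, 255, 0, 0],
     [1, 0, 0, 51, 102]],
    columns=["label", "pixel0", "pixel1", "pixel2", "pixel3"],
)

# One-hot encode the labels with NumPy (stand-in for keras.utils.to_categorical).
out_y = np.eye(num_classes)[raw.label.to_numpy()]

num_images = raw.shape[0]                      # 3 rows, so 3 images
x_as_array = raw.values[:, 1:]                 # drop the label column
x_shaped_array = x_as_array.reshape(num_images, img_rows, img_cols, 1)
out_x = x_shaped_array / 255                   # scale 0-255 down to 0-1

print(out_y)          # one row per image, a single 1 in the column of its class
print(out_x.shape)    # (3, 2, 2, 1)
```

The label 2 in the second row becomes [0, 0, 1], and the pixel value 51 in the third image becomes 0.2 after scaling.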
To handle color images, the 4th dimension of the array should be 3, representing the RGB values. See this tutorial for more details on CNNs and their inputs.
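As a rough sketch of the color case, the only change to the reshape is the last dimension. This assumes each flattened row stores its values pixel by pixel with the three channel values of a pixel next to each other (R, G, B, R, G, B, ...); a dataset that stores all red values first, then green, then blue would need a different reshape followed by a transpose. The array sizes and the fake data below are made up for illustration. Dividing by 255 stays the same, since it depends on the 8-bit value range and not on the number of channels.

```python
import numpy as np

img_rows, img_cols, channels = 2, 2, 3   # hypothetical tiny RGB images
num_images = 2

# Made-up flattened RGB data: one row per image, rows * cols * 3 values in 0-255.
flat = np.arange(num_images * img_rows * img_cols * channels) % 256
flat = flat.reshape(num_images, -1)

# Same reshape as for grayscale, with 3 channels instead of 1, then scale to 0-1.
x_color = flat.reshape(num_images, img_rows, img_cols, channels) / 255

print(x_color.shape)      # (2, 2, 2, 3)
print(x_color[0, 0, 0])   # the R, G, B values of the top-left pixel of image 0
```

A model fed with this data would also need its input shape set to (img_rows, img_cols, 3) rather than (img_rows, img_cols, 1).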