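Problem description
I have a dataset in which the image files are provided separately, and the labels for those images are given in a separate CSV file: the first column holds the image file name and the second column holds its label. My code is as follows.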
import pandas as pd
train= pd.read_csv('/content/drive/MyDrive/Colab_Notebooks/label_train.csv',dtype=str)
train.head()
number;label
0 101.jpg;3
1 102.jpg;1
2 103.jpg;3
3 104.jpg;3
4 105.jpg;2
test = pd.read_csv('/content/drive/MyDrive/Colab_Notebooks/label_test.csv',dtype=str)
test.head()
number;label
0 201.jpg;3
1 202.jpg;3
2 203.jpg;1
3 204.jpg;3
4 205.jpg;3
train_folder = '/content/drive/MyDrive/Colab_Notebooks/bilder_train'
test_folder = '/content/drive/MyDrive/Colab_Notebooks/bilder_test'
import os
import numpy as np
import glob
import shutil
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense,Activation,Conv2D,Flatten,Dropout,MaxPooling2D,BatchNormalization
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import regularizers,optimizers
train_gen = ImageDataGenerator(rescale=1./255,rotation_range=45,width_shift_range=.15,height_shift_range=.15,horizontal_flip=True,zoom_range=0.5)
test_gen = ImageDataGenerator(rescale=1./255)
train_data = train_gen.flow_from_dataframe(dataframe = train,directory = train_folder,x_col = 'number',y_col = 'label',seed = 42,batch_size = 10,shuffle = True,class_mode='categorical',target_size = (100,100))
test_data = test_gen.flow_from_dataframe(dataframe = test,directory = test_folder,x_col = 'number',y_col = None,shuffle = False,class_mode = None,target_size = (100,100))
This is the error message:
KeyError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/base.py in get_loc(self,key,method,tolerance)
2897 try:
-> 2898 return self._engine.get_loc(casted_key)
2899 except KeyError as err:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'number'
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
6 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/base.py in get_loc(self,key,method,tolerance)
2898 return self._engine.get_loc(casted_key)
2899 except KeyError as err:
-> 2900 raise KeyError(key) from err
2901
2902 if tolerance is not None:
KeyError: 'number'
I have no idea why this error occurs. Does anyone know what is going on here?
Solution
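You need to add sep=';' (the CSV delimiter) to the pd.read_csv call. Because the default value of sep is ',', pandas interprets number;label as a single column rather than two separate columns. The DataFrame therefore has no column named 'number', and looking it up raises KeyError: 'number'.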
import pandas as pd
train= pd.read_csv('/content/drive/MyDrive/Colab_Notebooks/label_train.csv',dtype=str,sep=';')
train.head()
test = pd.read_csv('/content/drive/MyDrive/Colab_Notebooks/label_test.csv',sep=';')
test.head()
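To see the effect of the delimiter without touching the files on Drive, here is a minimal sketch that parses a small in-memory CSV both ways. The sample text `csv_text` is a hypothetical stand-in for label_train.csv, and the column check at the end is only a suggested sanity test, not part of the original code.

```python
import io
import pandas as pd

# Hypothetical stand-in for label_train.csv (semicolon-delimited).
csv_text = "number;label\n101.jpg;3\n102.jpg;1\n103.jpg;3\n"

# Default sep=',' finds no commas, so each line is read as a single field.
wrong = pd.read_csv(io.StringIO(csv_text), dtype=str)
print(list(wrong.columns))

# With sep=';' the header is split into two separate columns.
right = pd.read_csv(io.StringIO(csv_text), dtype=str, sep=';')
print(list(right.columns))

# Suggested check before passing the frame to flow_from_dataframe:
# fail early if the columns named in x_col / y_col are absent.
missing = {'number', 'label'} - set(right.columns)
assert not missing, f"missing columns: {missing}"
```

Printing `train.columns` right after reading the real file is a quick way to confirm that both 'number' and 'label' are present before building the generators.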