Datawig: KeyError when using SimpleImputer

Problem description

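I am trying to impute missing values in a dataset using deep learning. I have been stuck on this problem for a long time and need help. The following line raises a KeyError: imputer.fit(train_df = sampleDf,num_epochs = 50). Note that the dataset is not split into a training set and a test set. Below is a reproducible example: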

#!pip install datawig
import datawig
import numpy as np
import pandas as pd

columns = np.arange(4).tolist()
print(len(columns))

d = {'col1': [1,2],'col2': [3,4],'col3': [5,6],'col4': [9,np.nan]}
sampleDf = pd.DataFrame(data = d)

#Iterate through each column of the dataset and impute that column. The resulting dataset will be used to impute the subsequent column etc. 

for i in range(len(columns)):
  #No need to impute the column if it has no null values.
  if (sampleDf[sampleDf.columns[i]].isnull().sum() == 0):
    continue

  inputColumns = columns[:i] + columns[i+1:]

  print([str(value) for value in inputColumns])
  print(str(i))

  #Initialize a SimpleImputer model
  imputer = datawig.SimpleImputer(
      input_columns= [str(value) for value in inputColumns],#Column(s) containing information about the column we want to impute
      output_column = str(i),#The column for which we would like to impute values
      output_path = 'imputerModel' #Stores model data and metrics
      )

  #Fit an imputer model on the train data
  imputer.fit(train_df = sampleDf,num_epochs = 50)

  #Impute missing values and return original dataframe with predictions
  imputedDatasetNN = imputer.predict(data_frame = sampleDf)

I have tried many variations of this code, but every version produces the same error. Any help would be greatly appreciated.

Solution

No verified solution to this problem has been found yet.
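One likely cause, offered as an unverified sketch rather than a confirmed fix: the loop in the question builds the column names from np.arange(4), so the imputer is asked for columns named '0', '1', '2' and '3', while the DataFrame's columns are actually named 'col1' through 'col4'. Looking up a column that does not exist is a typical source of a KeyError. The snippet below uses only pandas to show the mismatch and to build the input/output pairs from the real column labels. The helper name imputation_plan is hypothetical and is not part of datawig.

```python
import pandas as pd


def imputation_plan(df):
    """Hypothetical helper (not part of datawig).

    For every column that contains missing values, return a pair
    (input_columns, output_column) built from the DataFrame's real labels.
    """
    plan = []
    for target in df.columns:
        # Skip columns that have nothing to impute.
        if df[target].isnull().sum() == 0:
            continue
        inputs = [str(c) for c in df.columns if c != target]
        plan.append((inputs, str(target)))
    return plan


sample_df = pd.DataFrame(
    {'col1': [1, 2], 'col2': [3, 4], 'col3': [5, 6], 'col4': [9, float('nan')]}
)

# The loop in the question requests these names, none of which exist.
requested = [str(i) for i in range(4)]
missing = [name for name in requested if name not in sample_df.columns]
print(missing)

# Pairs derived from the actual column labels.
plan = imputation_plan(sample_df)
print(plan)
```

Each (inputs, target) pair could then be passed as input_columns and output_column to datawig.SimpleImputer in place of the str(i) values used in the question. This addresses only the column-name mismatch; whether training succeeds on a two-row DataFrame is not verified here.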
