How do I include SMOTE in a ColumnTransformer?

Problem description

I am trying to use SMOTENC inside a ColumnTransformer, but I get an error. The code and the error are given below.

#Create a mask for categorical features
categorical_feature_mask = X_train.dtypes == object
categorical_columns = X_train.columns[categorical_feature_mask].tolist()
print(categorical_columns)

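import numpy as np
from imblearn.over_sampling import SMOTENC
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelBinarizer, StandardScaler
from sklearn.tree import DecisionTreeClassifier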

#Split the features into numeric and categorical sets, then build pipelines to automate the preprocessing steps
num_features= X_train.select_dtypes(include=[np.number]).columns
cat_features = X_train.select_dtypes(exclude=[np.number]).columns

cat_transformer = Pipeline(steps=[('imp_c',SimpleImputer(strategy='most_frequent')),('label_bina',LabelBinarizer())])
scale_transformer=Pipeline(steps=[('imp_m',SimpleImputer(strategy='median')),('std',StandardScaler())])
smote=SMOTENC(categorical_features=categorical_columns,random_state=99)
col_transform = ColumnTransformer(transformers=[
        ('num',scale_transformer,num_features),('cat',cat_transformer,cat_features),('smote',smote )],remainder='passthrough')
#Fit a DecisionTreeClassifier and evaluate the model performance
dt=DecisionTreeClassifier(random_state=99)
pl_dt=Pipeline(steps=[('transform',col_transform),('dt',dt)])
pl_dt.fit(X_train,np.ravel(y_train))

Running this raises the error "not enough values to unpack (expected 3, got 2)". More precisely:


ValueError                                Traceback (most recent call last)
<ipython-input-34-a874d44f98ee> in <module>
      2 dt=DecisionTreeClassifier(random_state=99)
      3 pl_dt=Pipeline(steps=[('transform',col_transform),('dt',dt)])
----> 4 pl_dt.fit(X_train,np.ravel(y_train))
      5 

~/anaconda3/lib/python3.7/site-packages/sklearn/pipeline.py in fit(self,X,y,**fit_params)
    328         """
    329         fit_params_steps = self._check_fit_params(**fit_params)
--> 330         Xt = self._fit(X,**fit_params_steps)
    331         with _print_elapsed_time('Pipeline',
    332                                  self._log_message(len(self.steps) - 1)):

~/anaconda3/lib/python3.7/site-packages/sklearn/pipeline.py in _fit(self,**fit_params_steps)
    294                 message_clsname='Pipeline',
    295                 message=self._log_message(step_idx),
--> 296                 **fit_params_steps[name])
    297             # Replace the transformer of the step with the fitted
    298             # transformer. This is necessary when loading the transformer

~/anaconda3/lib/python3.7/site-packages/joblib/memory.py in __call__(self,*args,**kwargs)
    353 
    354     def __call__(self,**kwargs):
--> 355         return self.func(*args,**kwargs)
    356 
    357     def call_and_shelve(self,**kwargs):

~/anaconda3/lib/python3.7/site-packages/sklearn/pipeline.py in _fit_transform_one(transformer,weight,message_clsname,message,**fit_params)
    738     with _print_elapsed_time(message_clsname,message):
    739         if hasattr(transformer,'fit_transform'):
--> 740             res = transformer.fit_transform(X,**fit_params)
    741         else:
    742             res = transformer.fit(X,**fit_params).transform(X)

~/anaconda3/lib/python3.7/site-packages/sklearn/compose/_column_transformer.py in fit_transform(self,y)
    525         # set n_features_in_ attribute
    526         self._check_n_features(X,reset=True)
--> 527         self._validate_transformers()
    528         self._validate_column_callables(X)
    529         self._validate_remainder(X)

~/anaconda3/lib/python3.7/site-packages/sklearn/compose/_column_transformer.py in _validate_transformers(self)
    274             return
    275 
--> 276         names,transformers,_ = zip(*self.transformers)
    277 
    278         # validate names

ValueError: not enough values to unpack (expected 3,got 2)
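
The last frame of the traceback suggests where the message comes from: ColumnTransformer unpacks every entry of `transformers` as a (name, transformer, columns) triple, while the `('smote',smote )` entry in the code above has only two elements. The following is a minimal sketch in plain Python that reproduces the same unpacking failure. The strings and the column names `num_col` and `cat_col` are hypothetical stand-ins for the real estimators and columns.

```python
# Stand-in for the `transformers` list from the question. Strings replace the
# real estimators because only the tuple lengths matter here.
transformers = [
    ("num", "scale_transformer", ["num_col"]),  # (name, transformer, columns)
    ("cat", "cat_transformer", ["cat_col"]),    # (name, transformer, columns)
    ("smote", "smote"),                         # no columns element
]

error_message = None
try:
    # Same unpacking pattern as the line shown in the last traceback frame.
    # zip stops at the shortest tuple, so only two groups come out.
    names, steps, _ = zip(*transformers)
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Adding a third element to the `smote` entry would probably not be enough on its own, because a sampler such as SMOTENC changes the number of rows and exposes `fit_resample` rather than `transform`, so it is not a column-wise transformer.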

How can I resolve the above error?

Solution
