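Unpacking error when using StratifiedKFold with BayesSearchCV

Problem description

I am trying to stratify my samples by a multiclass numpy.ndarray (named stratify_group, shape (n_samples,)) that has the same n_samples as X (shape (n_samples, n_features)), and then run nested cross-validation to search for hyperparameters. Similar code works well with GridSearchCV and RandomizedSearchCV, but not with skopt.BayesSearchCV.

With GridSearchCV, this code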

PipeLine = Pipe_Lasso
Param_Grid = {'lasso__alpha': np.arange(0.01, 0.1, 0.01)}

# Outer folds, stratified on stratify_group rather than on y
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
# Inner folds for the hyperparameter search
skf_cv = StratifiedKFold(n_splits=5, random_state=0)

for train_index, test_index in skf.split(X, stratify_group):
    X_train = X[train_index]
    X_test = X[test_index]
    y_train = y[train_index]
    y_test = y[test_index]
    groups = stratify_group[train_index]

    gs = GridSearchCV(estimator=PipeLine, param_grid=[Param_Grid],
                      cv=skf_cv.split(X=X_train, y=groups), scoring=Scoring)

    gs.fit(X_train, y_train)

works fine.

But when I try

PipeLine = Pipe_Lasso
Param_Grid = {'lasso__alpha': Real(0.001, 10, prior='log-uniform')}

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
skf_cv = StratifiedKFold(n_splits=5, random_state=0)

for train_index, test_index in skf.split(X, stratify_group):
    X_train = X[train_index]
    X_test = X[test_index]
    y_train = y[train_index]
    y_test = y[test_index]
    groups = stratify_group[train_index]

    bs = BayesSearchCV(estimator=PipeLine, search_spaces=[Param_Grid], n_iter=32,
                       cv=skf_cv.split(X_train, groups), scoring=Scoring)

    bs.fit(X_train, y_train)

it raises this error:

\Anaconda3\lib\site-packages\skopt\searchcv.py in fit(self,X,y,groups,callback)
    678                 optim_result = self._step(
    679                     X,search_space,optimizer,
--> 680                     groups=groups,n_points=n_points_adjusted
    681                 )
    682                 n_iter -= n_points

~\Anaconda3\lib\site-packages\skopt\searchcv.py in _step(self,n_points)
    564         refit = self.refit
    565         self.refit = False
--> 566         self._fit(X,params_dict)
    567         self.refit = refit
    568 

~\Anaconda3\lib\site-packages\skopt\searchcv.py in _fit(self,parameter_iterable)
    421         else:
    422             (test_scores,test_sample_counts,
--> 423              fit_time,score_time,parameters) = zip(*out)
    424 
    425         candidate_params = parameters[::n_splits]

ValueError: not enough values to unpack (expected 5,got 0)

If I set cv to an integer, the search runs, but the results suggest that the samples are not stratified.
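One possible reason an integer cv does not stratify here, sketched under the assumption that BayesSearchCV resolves cv through scikit-learn's check_cv in the same way GridSearchCV does: for a regressor such as Lasso, an integer becomes a plain KFold, and StratifiedKFold is only chosen for classifiers with binary or multiclass targets. The arrays below are made up for illustration.

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold, check_cv

# Hypothetical continuous target, standing in for y_train.
rng = np.random.RandomState(0)
y_continuous = rng.normal(size=60)

# A regressor corresponds to classifier=False, so an integer becomes KFold.
cv_for_regressor = check_cv(5, y_continuous, classifier=False)
print(type(cv_for_regressor).__name__)

# Only a classifier with discrete labels gets StratifiedKFold from an integer.
y_labels = np.repeat([0, 1, 2], 20)
cv_for_classifier = check_cv(5, y_labels, classifier=True)
print(type(cv_for_classifier).__name__)
```

If that assumption holds, an integer cv never sees stratify_group at all, which would match the unstratified results.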

Or, if I set cv to StratifiedKFold(n_splits= 5,random_state= 0), then

ValueError: Supported target types are: ('binary','multiclass'). Got 'continuous' instead. 

appears. My guess is that without the explicit .split() call, cv stratifies X_train by y_train, which in my case is a continuous array.
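That guess can be checked with a small sketch on made-up data (X_demo, y_demo and groups_demo are hypothetical stand-ins for X_train, y_train and groups). Splitting on a continuous target fails with the same kind of message, while passing the class labels as the second argument works.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

rng = np.random.RandomState(0)
X_demo = rng.normal(size=(60, 4))
y_demo = rng.normal(size=60)              # continuous target
groups_demo = np.repeat([0, 1, 2], 20)    # class labels to stratify on

skf_demo = StratifiedKFold(n_splits=5)

# A bare splitter handed to a search object is called as split(X, y).
try:
    list(skf_demo.split(X_demo, y_demo))
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Passing the labels explicitly stratifies on them instead of on y.
folds = list(skf_demo.split(X_demo, groups_demo))
print(len(folds))
```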

I am stuck and cannot figure out why this does not work with BayesSearchCV.
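A likely explanation, offered as a hypothesis rather than a confirmed diagnosis: skf_cv.split(...) returns a generator, and a generator can be consumed only once. The traceback shows BayesSearchCV calling _fit from _step on every iteration, so any call after the first may receive an empty sequence of splits, leaving zip(*out) with nothing to unpack. The sketch below uses made-up data to show the exhaustion and how list() makes the splits reusable.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

rng = np.random.RandomState(0)
X_demo = rng.normal(size=(60, 4))
groups_demo = np.repeat([0, 1, 2], 20)    # hypothetical stratification labels

skf_cv = StratifiedKFold(n_splits=5)

# split() hands back a one-shot generator: a second pass yields nothing.
split_gen = skf_cv.split(X_demo, groups_demo)
first_pass = list(split_gen)
second_pass = list(split_gen)
print(len(first_pass), len(second_pass))

# A materialized list of (train, test) index pairs can be iterated repeatedly.
cv_splits = list(skf_cv.split(X_demo, groups_demo))
sizes_a = [len(test_idx) for _, test_idx in cv_splits]
sizes_b = [len(test_idx) for _, test_idx in cv_splits]
print(sizes_a == sizes_b)
```

Under this assumption, passing cv=list(skf_cv.split(X_train, groups)) to BayesSearchCV instead of the bare generator should let every iteration reuse the same stratified folds.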
