How to use scikit-optimize inside a class, in particular with the use_named_args decorator?

Problem description

I am using the scikit-optimize package to tune the hyperparameters of my model. For performance and readability reasons (I use the same procedure to train several models), I would like to structure the whole hyperparameter tuning inside a class:

...
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.preprocessing import sequence
from tensorflow.keras.models import Sequential,load_model
from tensorflow.keras.layers import InputLayer,Input,Dense,Embedding,BatchNormalization,Dropout
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import TensorBoard,EarlyStopping
from sklearn.preprocessing import MinMaxScaler,OneHotEncoder
from sklearn.model_selection import train_test_split

import skopt
from skopt import gp_minimize
from skopt.space import Real,Categorical,Integer
from skopt.plots import plot_convergence
from skopt.plots import plot_objective,plot_evaluations
from skopt.utils import use_named_args

class hptuning:
   def __init__(self,input_df):
         self.inp_df = input_df
         self.X_train,self.X_test,self.y_train,self.y_test = train_test_split(...)
         self.param_space = self.dim_hptuning()
         self.best_loss = 10000

   def dim_hptuning(self):
         dim_layers = Integer(low=0,high=7,name='layers')
         dim_nodes = Integer(low=2,high=90,name='num_nodes')
         dimensions = [dim_layers,dim_nodes]
         return dimensions

   def create_model(self,layers,nodes):
         model = Sequential()
         for layer in range(layers):
             model.add(Dense(nodes))
         model.add(Dense(1,activation='sigmoid'))
          optimizer = Adam()
         model.compile(loss='mean_absolute_error',optimizer=optimizer,metrics=['mae','mse'])
         return model
         
   @use_named_args(dimensions=self.param_space)
   def fitness(self,num_nodes,layers):
          model = self.create_model(layers=layers,nodes=num_nodes)
         history = model.fit(x=self.X_train.values,y=self.y_train.values,epochs=200,batch_size=200,verbose=0)
         loss = history.history['val_loss'][-1]
         if loss < self.best_loss:
             model.save('model.h5')
             self.best_loss = loss
         del model
         K.clear_session()
         return loss

   def find_best_model(self):
          search_result = gp_minimize(func=self.fitness,dimensions=self.param_space,acq_func='EI',n_calls=10)
         return search_result
hptun = hptuning(input_df=df)
search_result = hptun.find_best_model()
print(search_result.fun)

Now I run into the problem that the @use_named_args decorator does not work inside a class the way it does in the example code of scikit-optimize. I get the error message

Traceback (most recent call last):
  File "main.py", line 138, in <module>
    class hptuning:
  File "main.py", line 220, in hptuning
    @use_named_args(dimensions=self.param_space)
NameError: name 'self' is not defined

which is obviously due to me misusing the decorator in this situation.

It is probably because I do not fully understand how this kind of decorator works that I cannot get it to run. Could someone help me out?

Thanks in advance for your support.

Solution

The problem of self not being defined has nothing to do with scikit-optimize. You cannot use self in the decorator expression, because decorators in a class body are evaluated while the class itself is being defined, before any instance (and therefore any self) exists; self is only defined inside the method you are decorating. But even if you worked around that (for example by making param_space a global variable), I would expect the next problem to be that self gets passed on to the function wrapped by use_named_args, which expects to receive only the parameters being optimized.
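
For reference, here is a minimal sketch of how use_named_args is meant to be used on a plain, module-level function, in the spirit of the scikit-optimize examples; the search space and the toy objective are made up for illustration. The decorated function is called with the single list of values that gp_minimize passes in, and the decorator unpacks it into keyword arguments named after the dimensions, so there is no slot left over for self:

from skopt.space import Integer
from skopt.utils import use_named_args

# Illustrative search space; the names must match the parameter names below.
param_space = [Integer(low=0, high=7, name='layers'),
               Integer(low=2, high=90, name='num_nodes')]

@use_named_args(dimensions=param_space)
def fitness(layers, num_nodes):
    # Toy objective standing in for training a model and returning its loss.
    return float(layers * num_nodes)

# gp_minimize would call fitness([2, 10]); the decorator turns that into
# fitness(layers=2, num_nodes=10).
print(fitness([2, 10]))   # -> 20.0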

The most obvious solution is not to put the decorator on the fitness method at all, but instead to define a decorated wrapper function inside the find_best_model method that calls fitness, as shown below:

   def find_best_model(self):
         @use_named_args(dimensions=self.param_space)
         def fitness_wrapper(*args,**kwargs):
             return self.fitness(*args,**kwargs)
          search_result = gp_minimize(func=fitness_wrapper,dimensions=self.param_space,acq_func='EI',n_calls=10)
         return search_result
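
For completeness, here is a self-contained sketch of that pattern with a dummy fitness method standing in for the Keras model, so it can be run on its own; the toy objective is made up for illustration. Note that use_named_args passes keyword arguments named after the dimensions ('layers' and 'num_nodes' here), so the parameters of fitness have to carry exactly those names:

from skopt import gp_minimize
from skopt.space import Integer
from skopt.utils import use_named_args

class hptuning:
    def __init__(self):
        self.param_space = [Integer(low=0, high=7, name='layers'),
                            Integer(low=2, high=90, name='num_nodes')]

    def fitness(self, layers, num_nodes):
        # Stand-in for building, training and evaluating the Keras model.
        return float((layers - 3) ** 2 + (num_nodes - 30) ** 2)

    def find_best_model(self):
        # The decorated wrapper is an ordinary closure: it captures self,
        # while use_named_args only ever sees the named search dimensions.
        @use_named_args(dimensions=self.param_space)
        def fitness_wrapper(*args, **kwargs):
            return self.fitness(*args, **kwargs)
        return gp_minimize(func=fitness_wrapper,
                           dimensions=self.param_space,
                           acq_func='EI', n_calls=10)

hptun = hptuning()
search_result = hptun.find_best_model()
print(search_result.x, search_result.fun)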