How to use PyTorch's DataLoader together with skorch's GridSearchCV

Problem Description

I am running a PyTorch ANN model (for a classification task) and am using skorch's GridSearchCV to search for the optimal hyperparameters.

When I run GridSearchCV with n_jobs=1 (i.e. trying one hyperparameter combination at a time), it runs very slowly.

When I set n_jobs to greater than 1, I get an out-of-memory error. So I am now trying to see whether I can use PyTorch's DataLoader to split the dataset into batches and avoid the out-of-memory problem. According to another PyTorch forum question (https://discuss.pytorch.org/t/how-to-use-skorch-for-data-that-does-not-fit-into-memory/70081/2), it appears we can use SliceDataset. My code is as follows:

# Setting up the artificial neural net model
import torch
import torch.nn as nn

class TabularModel(nn.Module):

    # Initialize parameters embeds, emb_drop, bn_cont and layers
    def __init__(self, emb_szs, n_cont, out_sz, layers, p=0.5):
        super().__init__()
        self.embeds = nn.ModuleList([nn.Embedding(ni, nf) for ni, nf in emb_szs])
        self.emb_drop = nn.Dropout(p)
        self.bn_cont = nn.BatchNorm1d(n_cont)

        # Create empty list for each layer in the neural net
        layerlist = []
        # Number of all embedded columns for categorical features
        n_emb = sum((nf for ni, nf in emb_szs))
        # Number of inputs for each layer
        n_in = n_emb + n_cont

        for i in layers:
            # Set the linear function for the weights and biases, wX + b
            layerlist.append(nn.Linear(n_in, i))
            # Using ReLU activation function
            layerlist.append(nn.ReLU(inplace=True))
            # Normalise all the activation function output values
            layerlist.append(nn.BatchNorm1d(i))
            # Set some of the normalised activation function output values to zero
            layerlist.append(nn.Dropout(p))
            # Reassign number of inputs for the next layer
            n_in = i

        # Append last layer
        layerlist.append(nn.Linear(layers[-1], out_sz))

        # Create sequential layers
        self.layers = nn.Sequential(*layerlist)

    # Function for feedforward
    def forward(self, x_cat_cont):
        # Split the combined input back into categorical and continuous columns
        x_cat = x_cat_cont[:, 0:cat_train.shape[1]].type(torch.int64)
        x_cont = x_cat_cont[:, cat_train.shape[1]:].type(torch.float32)

        # Create empty list for embedded categorical features
        embeddings = []
        # Embed categorical features
        for i, e in enumerate(self.embeds):
            embeddings.append(e(x_cat[:, i]))

        # Concatenate embedded categorical features
        x = torch.cat(embeddings, 1)
        # Apply dropout rates to categorical features
        x = self.emb_drop(x)

        # Batch normalize continuous features
        x_cont = self.bn_cont(x_cont)

        # Concatenate categorical and continuous features
        x = torch.cat([x, x_cont], 1)
        # Feed categorical and continuous features into neural net layers
        x = self.layers(x)
        return x

# Use cross entropy loss function since this is a classification problem
# Assign class weights to the loss function
criterion_skorch = nn.CrossEntropyLoss
# Use Adam solver with learning rate 0.001
optimizer_skorch = torch.optim.Adam

from skorch import NeuralNetClassifier

# Random seed chosen to ensure results are reproducible by using the same initial random weights and biases,
# and applying dropout rates to the same random embedded categorical features and neurons in the hidden layers
torch.manual_seed(0)
net = NeuralNetClassifier(module=TabularModel,
                          module__emb_szs=emb_szs,
                          module__n_cont=con_train.shape[1],
                          module__out_sz=2,
                          module__layers=[30],
                          module__p=0.0,
                          criterion=criterion_skorch,
                          criterion__weight=cls_wgt,
                          optimizer=optimizer_skorch,
                          optimizer__lr=0.001,
                          max_epochs=150,
                          device='cuda')

from sklearn.model_selection import GridSearchCV

param_grid = {'module__layers': [[30], [50, 20]],
              'module__p': [0.0],
              'max_epochs': [150, 175]}

from torch.utils.data import TensorDataset, DataLoader
from skorch.helper import SliceDataset

# cat_con_train and y_train are PyTorch tensors
tsr_ds = TensorDataset(cat_con_train.cpu(), y_train.cpu())

torch.manual_seed(0)  # Set random seed for shuffling results to be reproducible
d_loader = DataLoader(tsr_ds, batch_size=100000, shuffle=True)

d_loader_slice_X = SliceDataset(d_loader, idx=0)
d_loader_slice_y = SliceDataset(d_loader, idx=1)

models = GridSearchCV(net, param_grid, scoring='roc_auc', n_jobs=2).fit(d_loader_slice_X, d_loader_slice_y)

However, when I run this code, I get the following error message:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-47-df3fc792ad5e> in <module>()
    104 
--> 105 models = GridSearchCV(net, param_grid, scoring='roc_auc', n_jobs=2).fit(d_loader_slice_X, d_loader_slice_y)
    106 

6 frames
/usr/local/lib/python3.6/dist-packages/skorch/helper.py in __getitem__(self, i)
    230     def __getitem__(self, i):
    231         if isinstance(i, (int, np.integer)):
--> 232             Xn = self.dataset[self.indices_[i]]
    233             Xi = self._select_item(Xn)
    234             return self.transform(Xi)

TypeError: 'DataLoader' object does not support indexing

How do I fix this? Is there a way to use PyTorch's DataLoader together with skorch's GridSearchCV (i.e. is there a way to load the data into skorch's GridSearchCV in batches, so as to avoid the out-of-memory problem when I set n_jobs to greater than 1 in GridSearchCV)?

Many thanks in advance!

Solution

So the first thing is to work out where you are running out of memory. Your batch size is huge, and presumably you only have one GPU. If you have multiple GPUs, you are already set: you can follow these steps to run the grid search in parallel over multiple GPUs using skorch + dask.
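For reference, that skorch + dask combination typically runs GridSearchCV under joblib's dask backend. Below is a minimal sketch, not a definitive recipe: it assumes a dask.distributed cluster is available, reuses net and param_grid from the question, and X and y stand in for whatever training data you pass to fit:

import joblib
from dask.distributed import Client
from sklearn.model_selection import GridSearchCV

# Start a local dask cluster; in a multi-GPU setup you would
# typically run one worker per GPU (worker counts are illustrative).
client = Client(n_workers=2, threads_per_worker=1)

gs = GridSearchCV(net, param_grid, scoring='roc_auc', n_jobs=2)

# Route joblib's parallelism through the dask scheduler, so each
# hyperparameter fit can be dispatched to a separate worker.
with joblib.parallel_backend('dask'):
    gs.fit(X, y)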

If you only have one GPU, then the GPU's RAM is evidently the bottleneck: it cannot hold two instances of the model at once. You can:

  • Reduce the model size (fewer parameters)
  • Reduce the batch size (the data takes up less space)

Which of these routes you take is up to you, though.
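If you take the smaller-batch route, note that skorch already batches internally through its own DataLoader, so there is no need to wrap a DataLoader in SliceDataset at all; that is also what the TypeError above is complaining about, since SliceDataset expects an indexable Dataset. A minimal sketch under that assumption, reusing the tensors and net from the question (the batch_size value is illustrative):

from torch.utils.data import TensorDataset
from skorch.helper import SliceDataset
from sklearn.model_selection import GridSearchCV

# Wrap the Dataset itself, not a DataLoader: SliceDataset needs an
# indexable dataset, which is why wrapping the DataLoader raised
# "TypeError: 'DataLoader' object does not support indexing".
tsr_ds = TensorDataset(cat_con_train.cpu(), y_train.cpu())
X_sl = SliceDataset(tsr_ds, idx=0)
y_sl = SliceDataset(tsr_ds, idx=1)

# skorch batches internally, so a smaller batch_size directly lowers
# the peak GPU memory needed by each fit.
net.set_params(batch_size=1024)  # illustrative value

models = GridSearchCV(net, param_grid, scoring='roc_auc', n_jobs=2).fit(X_sl, y_sl)

If the scorer complains about the label slice, passing y_train.cpu().numpy() as y is a common alternative.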