Problem description
I am confused about which metric GridSearchCV uses in its parameter search. My understanding was that my model object feeds it a metric, and that is what is used to determine best_params_. But that doesn't appear to be the case. I thought that scoring=None is the default and that, in that case, the first metric given in the metrics option of model.compile() is the one used. So in my case the scoring function used should be mean_squared_error. My interpretation of the problem follows below.
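As a quick check of that assumption (a sketch using plain scikit-learn estimators rather than the Keras wrapper, so none of the Keras setup below is involved): with scoring=None, GridSearchCV falls back to the estimator's own .score() method. For sklearn regressors that is R², while KerasRegressor's score() returns the negative of the compiled loss; the compiled metrics list is never consulted by sklearn.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# toy regression data (fixed seed so the comparison is reproducible)
X, y = make_regression(n_samples=200, n_features=5, noise=3, random_state=0)

# scoring=None -> GridSearchCV uses estimator.score(), which is R^2 for Ridge
grid_default = GridSearchCV(Ridge(), {"alpha": [0.1, 1.0]}, cv=3, scoring=None)
grid_default.fit(X, y)

# explicitly requesting negative MSE puts best_score_ on a different scale
grid_mse = GridSearchCV(Ridge(), {"alpha": [0.1, 1.0]}, cv=3,
                        scoring="neg_mean_squared_error")
grid_mse.fit(X, y)

print(grid_default.best_score_)  # close to 1.0 (R^2)
print(grid_mse.best_score_)      # a negative number (-MSE)
```

Passing an explicit scoring string is the reliable way to control which metric drives best_params_.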
Here is what I am doing. I simulate some regression data using sklearn (10,000 observations, 10 features). I am playing around with keras because I have typically used pytorch in the past and never really dabbled with keras until now. I noticed a difference in the loss function output from my GridSearchCV call vs. the model.fit() call after I set my parameters to the optimized values. Now I know I could just use refit=True and not refit the model again, but I wanted to get a feel for the output of both keras and sklearn's GridSearchCV.
To be more explicit, here is the discrepancy I am seeing. I simulate some data with sklearn as follows:
# Setting some data basics
N = 10000
feats = 10
# generate regression dataset
X,y = make_regression(n_samples=N,n_features=feats,n_informative=2,noise=3)
# training data and testing data #
X_train = X[:int(N * 0.8)]
y_train = y[:int(N * 0.8)]
X_test = X[int(N * 0.8):]
y_test = y[int(N * 0.8):]
I have created a create_model function that looks to tune which activation function I use (again, this is a simple example for proof of concept).
def create_model(activation_fn):
# create model
model = Sequential()
model.add(Dense(30,input_dim=feats,activation=activation_fn,kernel_initializer='normal'))
model.add(Dropout(0.2))
model.add(Dense(10,activation=activation_fn))
model.add(Dropout(0.2))
model.add(Dense(1,activation='linear'))
# Compile model
model.compile(loss='mean_squared_error',optimizer='adam',metrics=['mean_squared_error','mae'])
return model
model = KerasRegressor(build_fn=create_model,epochs=50,batch_size=200,verbose=0)
activations = ['linear','relu']
param_grid = dict(activation_fn = activations)
grid = GridSearchCV(estimator=model,param_grid=param_grid,n_jobs=1,cv=3)
grid_result = grid.fit(X_train,y_train,verbose=1)
print("Best: %f using %s" % (grid_result.best_score_,grid_result.best_params_))
Best: -21.163454 using {'activation_fn': 'linear'}
OK, so the best metric is a mean squared error of 21.16 (I understand they flip the sign to create a maximization problem). However, when I fit the model with activation_fn='linear', the MSE I get is totally different.
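That sign flip can be seen directly with sklearn's scorer machinery (a minimal sketch, independent of the Keras model in this question): the neg_mean_squared_error scorer returns exactly -1 times mean_squared_error, so that every scorer obeys the "higher is better" convention GridSearchCV maximizes.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, get_scorer

# tiny dataset with one outlier so the MSE is clearly nonzero
X = np.arange(10, dtype=float).reshape(-1, 1)
y = np.array([0., 1., 2., 3., 4., 5., 6., 7., 8., 100.])

model = LinearRegression().fit(X, y)

mse = mean_squared_error(y, model.predict(X))          # plain metric, positive
neg_mse = get_scorer("neg_mean_squared_error")(model, X, y)  # scorer, negative

print(mse, neg_mse)  # same magnitude, opposite sign
```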
best_model = create_model('linear')
history = best_model.fit(X_train,y_train,epochs=50,batch_size=200,verbose=1)
.....
.....
Epoch 49/50
8000/8000 [==============================] - 0s 48us/step - loss: 344.1636 - mean_squared_error: 344.1636 - mean_absolute_error: 12.2109
Epoch 50/50
8000/8000 [==============================] - 0s 48us/step - loss: 326.4524 - mean_squared_error: 326.4524 - mean_absolute_error: 11.9250
history.history['mean_squared_error']
Out[723]:
[10053.778002929688,9826.66806640625,......
......
344.16363830566405,326.45237121582034]
The discrepancy is 326.45 vs. 21.16. Any insight into what I am misunderstanding would be much appreciated. I would be more comfortable if they were within a reasonable neighborhood of each other, with the error attributable to one fold vs. the entire training dataset. But 21 is nowhere near 326. Thanks! The full code is below.
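Part of the answer is that the two numbers measure different things: 326 is an error on data the model trained on, while best_score_ is averaged over held-out CV folds, and those can legitimately sit far apart in either direction. A hedged sklearn sketch of that gap, using an overfitting decision tree in place of the network above:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=10, n_informative=2,
                       noise=3, random_state=0)

# an unconstrained tree memorizes the training set...
tree = DecisionTreeRegressor(random_state=0).fit(X, y)
train_mse = mean_squared_error(y, tree.predict(X))   # essentially 0

# ...but its cross-validated MSE on held-out folds is much larger
cv_mse = -cross_val_score(DecisionTreeRegressor(random_state=0), X, y,
                          cv=3, scoring="neg_mean_squared_error").mean()

print(train_mse, cv_mse)  # training error far below the CV estimate
```

In the question the gap runs the other way (training loss above the CV score), which the accepted answer below attributes to dropout being active only during training.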
import pandas as pd
import numpy as np
from keras import Sequential
from keras.layers import Dense,Dropout,Activation,Flatten
from keras.layers import Convolution2D,MaxPooling2D
from keras.utils import np_utils
from sklearn.model_selection import GridSearchCV
from keras.wrappers.scikit_learn import KerasClassifier,KerasRegressor
from keras.constraints import maxnorm
from sklearn import preprocessing
from sklearn.preprocessing import scale
from sklearn.datasets import make_regression
from matplotlib import pyplot as plt
# fix random seed for reproducibility
seed = 7
np.random.seed(seed)
# Setting some data basics
N = 10000
feats = 10
# generate regression dataset
X,y = make_regression(n_samples=N,n_features=feats,n_informative=2,noise=3)
# training data and testing data #
X_train = X[:int(N * 0.8)]
y_train = y[:int(N * 0.8)]
X_test = X[int(N * 0.8):]
y_test = y[int(N * 0.8):]
def create_model(activation_fn):
    # create model
    model = Sequential()
    model.add(Dense(30,input_dim=feats,activation=activation_fn,kernel_initializer='normal'))
    model.add(Dropout(0.2))
    model.add(Dense(10,activation=activation_fn))
    model.add(Dropout(0.2))
    model.add(Dense(1,activation='linear'))
    # Compile model
    model.compile(loss='mean_squared_error',optimizer='adam',metrics=['mean_squared_error','mae'])
    return model
# create model
model = KerasRegressor(build_fn=create_model,epochs=50,batch_size=200,verbose=0)
# define the grid search parameters
activations = ['linear','relu']
param_grid = dict(activation_fn = activations)
grid = GridSearchCV(estimator=model,param_grid=param_grid,n_jobs=1,cv=3)
grid_result = grid.fit(X_train,y_train,verbose=1)
# refit the best configuration and inspect its training history
best_model = create_model('linear')
history = best_model.fit(X_train,y_train,epochs=50,batch_size=200,verbose=1)
history.history.keys()
plt.plot(history.history['mean_absolute_error'])
# summarize results
grid_result.cv_results_
print("Best: %f using %s" % (grid_result.best_score_,grid_result.best_params_))
Solution
The large loss reported in your output (326.45237121582034) is the training loss. If you want a metric to compare against grid_result.best_score_ (from GridSearchCV) and the MSE (from best_model.fit), you have to request the validation loss (see the code below).
Now to the question: why is the validation loss lower than the training loss? In your case, it is mostly because of dropout, which is applied during training but not during validation/testing; that is why the difference between training and validation loss disappears when you remove dropout. You can find a detailed explanation of the possible reasons for a lower validation loss here.
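To see why dropout alone shifts the training loss, here is a minimal numpy sketch of inverted dropout (the variant Keras uses; the function name and seed are illustrative, not Keras API): during training a random fraction of activations is zeroed and the survivors are scaled up, while at inference the layer is an identity, so the training-time loss is computed on a noisier network than the one being evaluated.

```python
import numpy as np

def dropout(x, rate, training, rng=None):
    """Inverted dropout: zero a fraction `rate` of units during training
    and rescale survivors by 1/(1-rate); pass through unchanged at inference."""
    if not training or rate == 0.0:
        return x                        # inference: identity, no noise
    rng = rng or np.random.default_rng(0)
    mask = rng.random(x.shape) >= rate  # keep each unit with prob 1-rate
    return x * mask / (1.0 - rate)      # rescale so the expected output matches

x = np.ones(10)
train_out = dropout(x, rate=0.2, training=True)   # zeros mixed with 1.25s
test_out = dropout(x, rate=0.2, training=False)   # identical to x

print(train_out)
print(test_out)
```

The rescaling keeps the expected activation equal between the two modes, but any individual training batch sees a perturbed network, which inflates the per-batch training loss relative to evaluation.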
In short, the performance (MSE) of your model is given by grid_result.best_score_ (21.163454 in your example).
import numpy as np
from keras import Sequential
from keras.layers import Dense,Dropout
from sklearn.model_selection import GridSearchCV
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.datasets import make_regression
import tensorflow as tf
# fix random seed for reproducibility
seed = 7
np.random.seed(seed)
tf.random.set_seed(42)
# Setting some data basics
N = 10000
feats = 10
# generate regression dataset
X,y = make_regression(n_samples=N,n_features=feats,n_informative=2,noise=3)
# training data and testing data #
X_train = X[:int(N * 0.8)]
y_train = y[:int(N * 0.8)]
X_test = X[int(N * 0.8):]
y_test = y[int(N * 0.8):]
def create_model(activation_fn):
# create model
model = Sequential()
model.add(Dense(30,input_dim=feats,activation=activation_fn,kernel_initializer='normal'))
model.add(Dropout(0.2))
model.add(Dense(10,activation=activation_fn))
model.add(Dropout(0.2))
model.add(Dense(1,activation='linear'))
# Compile model
model.compile(loss='mean_squared_error',optimizer='adam',metrics=['mean_squared_error','mae'])
return model
# create model
model = KerasRegressor(build_fn=create_model,epochs=50,batch_size=200,verbose=0)
# define the grid search parameters
activations = ['linear','relu']
param_grid = dict(activation_fn = activations)
grid = GridSearchCV(estimator=model,param_grid=param_grid,n_jobs=1,cv=3)
grid_result = grid.fit(X_train,y_train,verbose=1,validation_data=(X_test,y_test))
best_model = create_model('linear')
history = best_model.fit(X_train,y_train,epochs=50,batch_size=200,verbose=1,validation_data=(X_test,y_test))
history.history.keys()
# plt.plot(history.history['mae'])
# summarize results
print(grid_result.cv_results_)
print("Best: %f using %s" % (grid_result.best_score_,grid_result.best_params_))