问题描述
我有一个预测太阳能发电量的数据集,该数据集包含20个独立变量和1个相关变量。我的模型的精度停留在60%。我已经尝试了几种模型,但是这种精度是最好的,我可以得到更多的效果。 这是我的代码:
data_path = r'drive/My Drive/Proj/S.P.F./solarpowergeneration.csv'
dts = pd.read_csv('solarpowergeneration.csv')
dts.head()
X = dts.iloc[:,:-1].values
y = dts.iloc[:,-1].values
print(X.shape,y.shape)
y = np.reshape(y,(-1,1))
y.shape
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.25,random_state=42)
from sklearn.preprocessing import StandardScaler
sc= StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
y_train = sc.fit_transform(y_train)
y_test = sc.transform(y_test)
import keras.backend as K
def calc_accu(y_true,y_pred):
return K.mean(K.equal(K.round(y_true),K.round(y_pred)))
def get_spfnet():
ann = tf.keras.models.Sequential()
ann.add(Dense(X_train.shape[1],activation='relu'))
# ann.add(Batchnormalization())
ann.add(Dropout(0.3))
ann.add(Dense(32,activation='relu',kernel_regularizer=regularizers.l2(0.01)))
# ann.add(Batchnormalization())
ann.add(Dropout(0.3))
ann.add(Dense(32,kernel_regularizer=regularizers.l2(0.01)))
# ann.add(Batchnormalization())
ann.add(Dropout(0.3))
ann.add(Dense(1))
ann.compile(loss='mse',optimizer='adam',metrics=[tf.keras.metrics.RootMeanSquaredError(),calc_accu])
return ann
spfnet = get_spfnet()
#spfnet.summary()
hist = spfnet.fit(X_train,batch_size=32,epochs=250,verbose=2)
精度图和损耗图是
plt.plot(hist.history['calc_accu'])
plt.title('Model Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.show()
plt.plot(hist.history['root_mean_squared_error'])
plt.title('Model error')
plt.xlabel('Epochs')
plt.ylabel('error')
plt.show()
50个时期后,似乎没有任何改善,数据似乎都没有过拟合 我尝试了其他模型,例如使用
减少层数和删除内核正则化kernel_initlizers='normal' and 'he-normal'
但是他们表现不佳,停留在20%。
解决方法
最常见的原因是接近零的梯度。您可能会停留在局部最小值或鞍点处。请尝试增加batch_size
(https://www.tensorflow.org/api_docs/python/tf/keras/Sequential/#evaluate)