Difference between Ridge regression and an SVM regressor (SVR) with a polynomial kernel of degree 1

Problem description

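I am trying to build a model for an application. I used sklearn's Ridge regression and SVR, and although I tried to keep the parameters the same, the two models give different results.

I set the regularization parameter to 1 in both models (both use L2 regularization). The polynomial kernel has one additional parameter (coef0), which I set to zero.

The data is standardized.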

import numpy as np
from sklearn import svm
from sklearn.linear_model import Ridge

linear_ridge = Ridge(alpha=1.0)  # L2 regularization
linear_ridge.fit(np.array(X_train), np.array(y_train))

model_SVR_poly = svm.SVR(kernel='poly', coef0=0.0, degree=1, C=1.0, epsilon=0.1)  # L2 regularization
model_SVR_poly.fit(np.array(X_train), np.array(y_train))

# Predict on the test data and undo the standardization of the target
Linear_ridge_pred = linear_ridge.predict(test_data[start_data:]) * Y_std[0] + Y_mean[0]
svr_poly_pred = model_SVR_poly.predict(test_data[start_data:]) * Y_std[0] + Y_mean[0]

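If I reduce epsilon to 0.0, SVR undershoots even more relative to Ridge; if I increase it, SVR overshoots more.

In the test phase, Ridge seems to overshoot while SVR seems to undershoot.

What is the difference between these two implementations, in my case or in general?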

Solution

It seems to me that, as you point out, there may well be some differences between the implementations of Ridge() and SVR().

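For one thing, the loss functions differ: the scikit-learn documentation describes the epsilon-insensitive loss and the squared epsilon-insensitive loss for the support vector regressors, and a squared-error loss for Ridge. The same point is made in the scikit-learn documentation example that compares kernel ridge regression and SVR with a nonlinear kernel.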
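To make the difference in loss functions concrete, here is a minimal NumPy sketch that evaluates the three losses on a few residuals. The function names are my own and are not part of scikit-learn; epsilon=0.1 is chosen only because it is the value used in the question.

```python
import numpy as np


def squared_loss(residual):
    """Squared-error loss, the data-fit term used by Ridge."""
    return residual ** 2


def epsilon_insensitive_loss(residual, epsilon=0.1):
    """Zero inside the epsilon tube, growing linearly outside it."""
    return np.maximum(0.0, np.abs(residual) - epsilon)


def squared_epsilon_insensitive_loss(residual, epsilon=0.1):
    """Zero inside the epsilon tube, growing quadratically outside it."""
    return epsilon_insensitive_loss(residual, epsilon) ** 2


# Hypothetical residuals (y_true - y_pred), chosen for illustration only
residuals = np.array([-2.0, -0.05, 0.0, 0.05, 0.5, 2.0])

print("squared:                    ", squared_loss(residuals))
print("epsilon-insensitive:        ", epsilon_insensitive_loss(residuals))
print("squared epsilon-insensitive:", squared_epsilon_insensitive_loss(residuals))
```

Residuals smaller than epsilon cost nothing under the epsilon-insensitive losses, while the squared loss penalizes every nonzero residual and weights large ones much more heavily. That alone is enough for the two models to settle on different fits, even with matching regularization.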

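Beyond that, the fact that you use SVR with a polynomial kernel of degree 1 adds a further difference: as the documentation for SVR and for the LibSVM library it is built on shows, there is one more parameter to consider, gamma. It defaults to 'scale'; for convenience, you can set it to 1.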
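The role of gamma can be checked with a small NumPy sketch. It assumes the polynomial kernel has the form (gamma * <x, x'> + coef0) ** degree and that gamma='scale' resolves to 1 / (n_features * X.var()), as stated in the scikit-learn documentation; the helper functions and the random data are my own, for illustration only.

```python
import numpy as np


def linear_kernel(A, B):
    """Plain inner-product kernel <a, b>."""
    return A @ B.T


def poly_kernel(A, B, degree=1, gamma=1.0, coef0=0.0):
    """Polynomial kernel, assumed form: (gamma * <a, b> + coef0) ** degree."""
    return (gamma * (A @ B.T) + coef0) ** degree


# Hypothetical data: 5 samples with 3 features
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))

K_lin = linear_kernel(X, X)

# With degree=1, coef0=0 and gamma=1 the polynomial kernel is the linear kernel
K_poly_gamma1 = poly_kernel(X, X, degree=1, gamma=1.0, coef0=0.0)
print(np.allclose(K_poly_gamma1, K_lin))

# With gamma='scale' the kernel is the linear kernel rescaled by gamma
gamma_scale = 1.0 / (X.shape[1] * X.var())
K_poly_scale = poly_kernel(X, X, degree=1, gamma=gamma_scale, coef0=0.0)
print(np.allclose(K_poly_scale, gamma_scale * K_lin))
```

So with the default gamma, a degree-1 polynomial kernel with coef0=0 is a rescaled linear kernel rather than the linear kernel itself, which changes how strongly a given C regularizes the fit.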

Here is the difference in fits that I get by adapting a toy example (with untuned parameters). I also included LinearSVR(), which has some further differences from SVR().

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.svm import LinearSVR,SVR
import matplotlib.pyplot as plt
np.random.seed(42)

# #############################################################################
# Generate sample data
X = np.sort(5 * np.random.rand(40,1),axis=0)
y = np.sin(X).ravel()

# #############################################################################
# Add noise to targets
y[::5] += 3 * (0.5 - np.random.rand(8))

# #############################################################################
# Fit regression model
svr_lin = SVR(kernel='linear',C=1,tol=1e-5)
svr_lins = LinearSVR(loss='squared_epsilon_insensitive',tol=1e-5,random_state=42)
svr_poly = SVR(kernel='poly',degree=1,gamma=1,coef0=0.0)
ridge = Ridge(alpha=1,random_state=42)
y_lin = svr_lin.fit(X,y).predict(X)
y_lins = svr_lins.fit(X,y).predict(X)
y_poly = svr_poly.fit(X,y).predict(X)
y_ridge = ridge.fit(X,y).predict(X)

coef_y_lin,intercept_y_lin = svr_lin.coef_,svr_lin.intercept_
coef_y_lins,intercept_y_lins = svr_lins.coef_,svr_lins.intercept_
coef_y_ridge,intercept_y_ridge = ridge.coef_,ridge.intercept_

# #############################################################################
# Look at the results
lw = 2
plt.figure(figsize=(10,5))
plt.scatter(X,y,color='darkorange',label='data')
plt.plot(X,y_lins,color='navy',lw=lw,label='Linear model (LinearSVR) %s,%s' % 
(coef_y_lins,intercept_y_lins))
plt.plot(X,y_lin,color='red',label='Linear model (SVR) %s,%s' % (coef_y_lin,intercept_y_lin))
plt.plot(X,y_poly,color='cornflowerblue',label='Polynomial model of degree 1 (SVR)')
plt.plot(X,y_ridge,color='g',label='Ridge %s,%s' % (coef_y_ridge,intercept_y_ridge))
plt.xlabel('data')
plt.ylabel('target')
plt.title('Support Vector Regression')
plt.legend()
plt.axis([0,5,-1,1.5])
plt.show()
