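Problem description
I am trying to implement mini-batch gradient descent for logistic regression. However, when I test it on my dataset with labels {-1, 1}, my predictions are almost always all 1 or all -1. That leaves me with a test score of about 50%, since the true labels are split roughly 50/50 between -1 and 1, whereas the goal is above 95%.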
```python
import numpy as np


def logistic(z):
    """
    Helper function
    Computes the logistic function 1/(1+e^{-x}) for each entry in input vector z.
    Args:
        z: numpy array shape (d,)
    Returns:
        logi: numpy array shape (d,) each entry transformed by the logistic function
    """
    # Vectorized, so it accepts both arrays and NumPy scalars.
    logi = 1 / (1 + np.exp(-z))
    assert logi.shape == z.shape
    return logi


class LogisticRegressionClassifier():
    def __init__(self):
        self.w = None

    def fit(self, X, y, w=None, lr=0.1, batch_size=16, epochs=10):
        """
        Run mini-batch Gradient Descent for logistic regression
        use batch_size data points to compute gradient in each step.
        Args:
            X: np.array shape (n,d) dtype float32 - Features
            y: np.array shape (n,) dtype int32 - Labels
            w: np.array shape (d,) dtype float32 - Initial parameter vector
            lr: scalar - learning rate for gradient descent
            batch_size: number of elements to use in minibatch
            epochs: Number of scans through the data
        sets:
            w: numpy array shape (d,) learned weight vector w
            history: list/np.array len epochs
        """
        if w is None: w = np.zeros(X.shape[1])
        history = []
        n = np.size(X, 0)
        for i in range(epochs):
            b = batch_size
            X_ = np.copy(X)
            X_shuf = np.take(X_, np.random.permutation(X_.shape[0]), axis=0, out=X_)
            for i in range(n//b):
                sample = X_shuf[b*i:(i+1)*b]
                g = (1/b)*sum([-y[i]*sample[i,:]*logistic(-y[i]*np.dot(w,sample[i,:])) for i in range(b)])
                w = np.array(w - lr*g)
            # One entry per epoch, as the docstring describes.
            history.append(w)
        self.w = w
        self.history = history
        return w

    def predict(self, X):
        """ Compute the logistic output for each data element in X
        Args:
            X: np.array shape (n,d) dtype float - Features
        Returns:
            p: numpy array shape (n,) dtype float, logistic outputs in (0,1)
        """
        z = np.dot(X, self.w.T)
        print(z)
        out = logistic(z)
        return out

    def score(self, X, y):
        """ Compute model accuracy on Data X with labels y
        Args:
            X: np.array shape (n,d) dtype float - Features
            y: np.array shape (n,) dtype int - Labels
        Returns:
            s: float, number of correct predictions divided by n.
        """
        s = 0
        n = np.size(X, 0)
        pred = self.predict(X)
        pred_labels = []
        for i in range(n):
            if pred[i] > 0.5:
                pred_labels += [1]
            if pred[i] <= 0.5:
                pred_labels += [-1]
        for i in range(n):
            if pred_labels[i] == y[i]:
                s += 1
        return s / n
```
Solution
You forgot to shuffle the labels together with the training data. Suppose you have
[3,1] [-1]
[2,3] [ 1]
After shuffling only the training data, the labels no longer match their rows:
[2,3] [-1]
[3,1] [ 1]
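
One way to apply this, as a minimal sketch rather than a drop-in replacement for the class in the question: draw a single permutation per epoch, index both `X` and `y` with it, and take each batch's labels from the shuffled label array instead of from the start of `y`. The helper name `fit_minibatch`, its `seed` parameter, and the synthetic data are assumptions made up for illustration. The gradient is the same expression the question uses for labels in {-1, 1}.

```python
import numpy as np


def logistic(z):
    """Elementwise logistic function 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_minibatch(X, y, lr=0.1, batch_size=16, epochs=10, seed=0):
    """Mini-batch gradient descent for logistic regression, labels in {-1, 1}.

    Hypothetical helper (not the asker's class): minimizes the mean of
    log(1 + exp(-y_i * w.x_i)) and returns the learned weight vector.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        # One permutation indexes BOTH arrays, so row k of X_shuf
        # still belongs to label k of y_shuf.
        perm = rng.permutation(n)
        X_shuf, y_shuf = X[perm], y[perm]
        # Leftover rows that do not fill a whole batch are skipped,
        # matching the n // batch_size steps in the question.
        for start in range(0, n - batch_size + 1, batch_size):
            Xb = X_shuf[start:start + batch_size]
            yb = y_shuf[start:start + batch_size]
            margins = yb * (Xb @ w)
            # Gradient of the mean logistic loss over the batch.
            g = -(Xb * (yb * logistic(-margins))[:, None]).mean(axis=0)
            w = w - lr * g
    return w


# The two-row example from above: rows and labels move together.
X_toy = np.array([[3.0, 1.0], [2.0, 3.0]])
y_toy = np.array([-1, 1])
perm = np.array([1, 0])
X_toy_shuf, y_toy_shuf = X_toy[perm], y_toy[perm]
# X_toy_shuf is [[2, 3], [3, 1]] and y_toy_shuf is [1, -1]

# Synthetic, linearly separable data (made up for this sketch).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(400, 2))
y_demo = np.where(X_demo[:, 0] + X_demo[:, 1] > 0, 1, -1)
w_demo = fit_minibatch(X_demo, y_demo)
accuracy = np.mean(np.where(X_demo @ w_demo > 0, 1, -1) == y_demo)
```

Indexing with a shared permutation also avoids the in-place `np.take(..., out=X_)` call, since `X[perm]` already returns a shuffled copy.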