Wrong predictions from mini-batch gradient descent for logistic regression?

Problem description

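I am trying to implement mini-batch gradient descent for logistic regression. However, when I test it on my dataset with labels {-1, 1}, my predictions are almost always all 1 or all -1, which leaves me with a test score of about 50% (the true labels are split roughly 50/50 between -1 and 1). The target is a score above 95%.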

Can anyone spot the error in the code below?

```python
import numpy as np

def logistic(z):
    """ 
    Helper function
    Applies the logistic function 1/(1+e^{-z}) to each entry of the input vector z.
    
    Args:
        z: numpy array shape (d,) 
    Returns:
       logi: numpy array shape (d,) each entry transformed by the logistic function 
    """
    # np.exp is applied elementwise, so no explicit loop is needed
    logi = 1 / (1 + np.exp(-z))
    assert logi.shape == z.shape
    return logi

class LogisticRegressionClassifier():

    def __init__(self):
        self.w = None


    def fit(self,X,y,w=None,lr=0.1,batch_size=16,epochs=10):
        """
        Run mini-batch Gradient Descent for logistic regression 
        use batch_size data points to compute gradient in each step.
   
        Args:
           X: np.array shape (n,d) dtype float32 - Features 
           y: np.array shape (n,) dtype int32 - Labels 
           w: np.array shape (d,) dtype float32 - Initial parameter vector
           lr: scalar - learning rate for gradient descent
           batch_size: number of elements to use in minibatch
           epochs: Number of scans through the data

        sets: 
           w: numpy array shape (d,) learned weight vector w
           history: list/np.array len epochs
        """
        if w is None: w = np.zeros(X.shape[1])
        history = []        
        n = np.size(X,0)
        for i in range(epochs):
            b = batch_size
            X_ = np.copy(X)
            X_shuf = np.take(X_,np.random.permutation(X_.shape[0]),axis=0,out=X_)
            for i in range(n//b):
                sample = X_shuf[b*i:(i+1)*b]
                g = (1/b)*sum([-y[i]*sample[i,:]*logistic(-y[i]*np.dot(w,sample[i,:])) for i in range(b)])
                w = np.array(w - lr*g)
            history.append(w)
        self.w = w
        self.history = history
        return w


    def predict(self,X):
        """ Classify each data element in X

        Args:
            X: np.array shape (n,d) dtype float - Features 
        
        Returns: 
           p: numpy array shape (n,) dtype float, predicted probability of label 1 for each row of X

        """
        z = np.dot(X,self.w.T)
        print(z)
        out = logistic(z)
        return out
    
    def score(self,X,y):
        """ Compute model accuracy  on Data X with labels y

        Args:
            X: np.array shape (n,d) dtype float - Features 
            y: np.array shape (n,) dtype int - Labels 

        Returns: 
           s: float, number of correct predictions divided by n.

        """
        s = 0
        n = np.size(X,0)
        pred = self.predict(X)
        pred_labels = []
        for i in range(n):
            if pred[i] > 0.5:
                pred_labels += [1]
            if pred[i] <= 0.5:
                pred_labels += [-1]
        for i in range(n):
            if pred_labels[i] == y[i]:
                s += 1
        return s / n
```

Solution

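You forgot to shuffle the labels together with the training data. If you have

[3,1] [-1] 
[2,3] [ 1]

then after shuffling only the training data, the rows and labels no longer match:

[2,3] [-1] 
[3,1] [ 1]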
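
One way to keep the rows and labels aligned is to draw a single permutation per epoch and index both `X` and `y` with it, then slice each batch from both shuffled arrays. The following is a minimal sketch of that idea, not a drop-in replacement for the class above. The helper names `shuffled_batches` and `fit_minibatch` are made up for illustration. The gradient is the same one used in the question, written in vectorized form.

```python
import numpy as np

def shuffled_batches(X, y, batch_size, rng):
    """Yield (X_batch, y_batch) pairs that stay row-aligned after shuffling."""
    perm = rng.permutation(X.shape[0])      # one permutation for BOTH arrays
    X_shuf, y_shuf = X[perm], y[perm]
    # Like the original loop, this drops a trailing partial batch.
    for start in range(0, X.shape[0] - batch_size + 1, batch_size):
        stop = start + batch_size
        yield X_shuf[start:stop], y_shuf[start:stop]

def fit_minibatch(X, y, lr=0.1, batch_size=16, epochs=10, seed=0):
    """Mini-batch gradient descent for logistic regression with labels in {-1, 1}."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for Xb, yb in shuffled_batches(X, y, batch_size, rng):
            margins = yb * (Xb @ w)
            # -y_i * sigmoid(-y_i * w.x_i), with sigmoid(-m) = 1 / (1 + exp(m))
            coeff = -yb / (1.0 + np.exp(margins))
            g = Xb.T @ coeff / len(yb)      # average gradient over the batch
            w = w - lr * g
    return w
```

Note that the batch labels are taken from `y_shuf[start:stop]`, so each label belongs to the row it is multiplied with. In the question's code, `y[i]` always reads the first `batch_size` labels of the unshuffled array, so that indexing would need to change as well.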

gradient-descent machine-learning python regression