How to add L1 or L2 regularization to the weights in PyTorch

Problem description

In TensorFlow, we can add L1 or L2 regularization to a sequential model. I can't find an equivalent in PyTorch. How do we add regularization to the weights in a PyTorch network definition:

import torch
import torch.nn.functional as F

class Net(torch.nn.Module):
    def __init__(self, n_feature, n_hidden, n_output):
        super(Net, self).__init__()
        self.hidden = torch.nn.Linear(n_feature, n_hidden)   # hidden layer
        """ How to add an L1 regularization after a certain hidden layer?? """
        """ OR how to add an L2 regularization after a certain hidden layer?? """
        self.predict = torch.nn.Linear(n_hidden, n_output)   # output layer

    def forward(self, x):
        x = F.relu(self.hidden(x))      # activation function for hidden layer
        x = self.predict(x)             # linear output
        return x

net = Net(n_feature=1, n_hidden=10, n_output=1)     # define the network
# print(net)  # print the net architecture
optimizer = torch.optim.SGD(net.parameters(), lr=0.2)
loss_func = torch.nn.MSELoss()  # mean squared error loss for regression

Solution

L2 regularization is usually handled through the weight_decay argument of PyTorch's optimizers (and you can assign different arguments to different parameter groups, e.g. per layer). However, this mechanism does not allow for L1 regularization without extending the existing optimizers or writing a custom optimizer.
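
For instance, a minimal sketch of the built-in weight_decay route, using the Net defined above (the decay values and the per-group split here are illustrative, not from the question):

# L2 regularization on all parameters via weight_decay (illustrative value)
optimizer = torch.optim.SGD(net.parameters(), lr=0.2, weight_decay=1e-4)

# or assign a different weight_decay to each parameter group
optimizer = torch.optim.SGD([
    {'params': net.hidden.parameters(), 'weight_decay': 1e-4},
    {'params': net.predict.parameters(), 'weight_decay': 0.0},
], lr=0.2)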

According to the TensorFlow docs, it uses a reduce_sum(abs(x)) penalty for L1 regularization and a reduce_sum(square(x)) penalty for L2 regularization. Probably the simplest approach is to add these penalty terms directly to the loss function used for gradient computation during training.

# set l1_weight and l2_weight to non-zero values to enable the penalties
l1_weight = 0.3  # example value
l2_weight = 0.7  # example value

# inside the training loop (given input x and target y)
...
pred = net(x)
loss = loss_func(pred, y)

# compute the penalties only over net.hidden's parameters
l1_penalty = l1_weight * sum(p.abs().sum() for p in net.hidden.parameters())
l2_penalty = l2_weight * sum(p.pow(2).sum() for p in net.hidden.parameters())
loss_with_penalty = loss + l1_penalty + l2_penalty

optimizer.zero_grad()
loss_with_penalty.backward()
optimizer.step()

# the pre-penalty loss is the one we ultimately care about
print('loss:', loss.item())
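
As a side note, if you want to penalize only the weight matrices and leave the biases unregularized (a common convention), one variant is to filter by parameter name via named_parameters(); the sketch below assumes the same net and l1_weight as above:

# L1 penalty over net.hidden's weights only, skipping biases
l1_penalty = l1_weight * sum(
    p.abs().sum()
    for name, p in net.hidden.named_parameters()
    if 'weight' in name
)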