PyTorch - Object localization loss

Problem description

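I am trying to perform an object localization task on MNIST, based on Andrew Ng's lecture. I place each MNIST digit at a random position in a 90x90 image and predict the digit and its center point. When I train, I get poor results, and my question is whether my loss function is set up correctly. I basically take the cross-entropy for the digit and the MSE for the coordinates, and then add them all up. Is this right? I don't get any errors, but the performance is very bad.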

My dataset is defined as follows (it returns the label and the x, y coordinates of the digit's center):

import numpy as np
from random import randrange
from torch.utils.data import Dataset

class CustomMnistDataset_OL(Dataset):

    def __init__(self,df,test=False):
        '''
        df is a pandas dataframe with 28x28 columns for each pixel value in MNIST
        '''
        self.df = df
        self.test = test

    def __len__(self):
        return len(self.df)

    def __getitem__(self,idx):
        if self.test:
            image = np.reshape(np.array(self.df.iloc[idx,:]),(28,28)) / 255.
        else:
            # column 0 holds the label, the remaining 784 columns hold the pixels
            image = np.reshape(np.array(self.df.iloc[idx,1:]),(28,28)) / 255.

        # create the new image
        new_img = np.zeros((90,90)) # images will be 90x90
        # randomly select the corner with the smallest indices for the digit;
        # note that the first array index ("x" here) is the row of the image
        x_min,y_min = randrange(90 - image.shape[0]),randrange(90 - image.shape[0])
        x_max,y_max = x_min + image.shape[0],y_min + image.shape[0]

        x_center = x_min + (x_max-x_min)/2
        y_center = y_min + (y_max-y_min)/2

        new_img[x_min:x_max,y_min:y_max] = image

        # the label consists of the digit and the center of the number
        # (in test mode there is no label column, so entry 0 is a pixel value, not a digit)
        label = [int(self.df.iloc[idx,0]),x_center,y_center]
        sample = {"image": new_img,"label": label}

        return sample['image'],sample['label']
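
To sanity-check the center labels, the placement step can be pulled out into a small helper and compared against the pixels that were actually written. This is only a sketch: `place_patch` is a hypothetical name that is not part of the dataset above, and it uses `row`/`col` instead of `x`/`y` to make explicit that the first NumPy index is the row.

```python
import numpy as np
from random import randrange

def place_patch(patch, canvas_size=90):
    """Paste a square patch at a random position on an empty canvas.

    Returns the canvas and the (row, col) center of the pasted patch.
    """
    size = patch.shape[0]
    canvas = np.zeros((canvas_size, canvas_size))
    # +1 so the patch may also sit flush against the far edge
    row_min = randrange(canvas_size - size + 1)
    col_min = randrange(canvas_size - size + 1)
    canvas[row_min:row_min + size, col_min:col_min + size] = patch
    center = (row_min + size / 2, col_min + size / 2)
    return canvas, center

# Use an all-ones patch so every pasted pixel is easy to find again.
canvas, (row_c, col_c) = place_patch(np.ones((28, 28)))
rows, cols = np.nonzero(canvas)
# The mean pixel index is half a pixel smaller than the edge-based center.
print(rows.mean() + 0.5, row_c)
print(cols.mean() + 0.5, col_c)
```

If the two printed pairs agree, the label matches where the digit was drawn. Whether the network should see (row, col) or (x, y) ordering is a convention; it only has to be consistent between the labels and any later plotting.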

My training function is set up as follows:

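import torch
from torch import nn

# model and device are assumed to be defined elsewhere
loss_fn = nn.CrossEntropyLoss()
loss_mse = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(),lr=0.001)

def train(dataloader,model,loss_fn,loss_mse,optimizer):
    model.train() # very important... This turns the model back to training mode
    size = len(dataloader.dataset)
    for batch,(X,y) in enumerate(dataloader):
        # y is a list [digit, x_center, y_center] of batched tensors
        X,y0,y1,y2 = X.to(device),y[0].to(device),y[1].to(device),y[2].to(device)

        # judging by the indexing below, the model returns (class logits, x prediction, y prediction)
        pred = model(X.float())
        # DEFINE LOSS HERE -------
        # pred[1] and pred[2] need the same shape as y1 and y2:
        # MSELoss broadcasts mismatched shapes such as (N,1) vs (N,)
        loss = loss_fn(pred[0],y0) + loss_mse(pred[1],y1.float()) + loss_mse(pred[2],y2.float())

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if batch % 100 == 0:
            loss_value,current = loss.item(),batch*len(X)
            print(f"loss: {loss_value:>7f}  [{current:>5d}/{size:>5d}]")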
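
One thing that may be worth checking is the scale of the two terms. The centers are in pixels (roughly 14 to 75 for the dataset above), so their squared errors can be far larger than the cross-entropy term and dominate the sum. A common variant, sketched below under assumptions, divides the coordinates by the image size and puts a weight on the regression term. The names `localization_loss`, `coord_weight`, and `IMG_SIZE` are hypothetical, and the sketch assumes the two coordinates are predicted as one `(N, 2)` tensor rather than as two separate outputs.

```python
import torch
from torch import nn

IMG_SIZE = 90  # canvas size used by the dataset (assumed constant)

ce_loss = nn.CrossEntropyLoss()
mse_loss = nn.MSELoss()

def localization_loss(logits, xy_pred, digit, xy_pixels, coord_weight=1.0):
    """Cross-entropy on the digit plus weighted MSE on normalized centers.

    logits:    (N, 10) raw class scores
    xy_pred:   (N, 2) predicted centers, expected on a 0..1 scale
    digit:     (N,) integer class labels
    xy_pixels: (N, 2) true centers in pixels
    """
    xy_target = xy_pixels.float() / IMG_SIZE  # rescale pixels to 0..1
    # Fail loudly instead of letting MSELoss broadcast mismatched shapes.
    assert xy_pred.shape == xy_target.shape
    return ce_loss(logits, digit) + coord_weight * mse_loss(xy_pred, xy_target)

# Usage with random stand-in tensors for a batch of 4
torch.manual_seed(0)
logits = torch.randn(4, 10)
xy_pred = torch.rand(4, 2)
digit = torch.tensor([3, 1, 4, 1])
xy_pixels = torch.tensor([[14.0, 14.0], [45.0, 30.0], [75.0, 75.0], [20.0, 60.0]])
loss = localization_loss(logits, xy_pred, digit, xy_pixels)
print(loss.item())
```

With normalized targets both terms start out at a comparable order of magnitude, and `coord_weight` can then be tuned if one task lags behind the other. This does not guarantee better results on its own; the model architecture and output shapes matter as well.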

