Problem description
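I am trying to perform an object localization task on MNIST, following Andrew Ng's lecture. I place each MNIST digit at a random position inside a 90x90 image and predict both the digit and its center point. When I train, the results are poor, and my question is whether my loss function is set up correctly. I basically take the cross-entropy for the digit and the MSE for each coordinate, and then add them all together. Is that right? I don't get any errors, but the performance is terrible.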
My dataset is defined as follows (it returns the label and the x, y coordinates of the digit's center):
import numpy as np
from random import randrange
from torch.utils.data import Dataset

class CustomMnistDataset_OL(Dataset):
    def __init__(self, df, test=False):
        '''
        df is a pandas dataframe with 28x28 columns for each pixel value in MNIST
        '''
        self.df = df
        self.test = test

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        if self.test:
            image = np.reshape(np.array(self.df.iloc[idx, :]), (28, 28)) / 255.
        else:
            # column 0 holds the digit label, so the pixels start at column 1
            image = np.reshape(np.array(self.df.iloc[idx, 1:]), (28, 28)) / 255.
        # create the new image
        new_img = np.zeros((90, 90))  # images will be 90x90
        # randomly select the corner with the smallest indices for the digit
        # (x indexes axis 0 of new_img, y indexes axis 1)
        x_min, y_min = randrange(90 - image.shape[0]), randrange(90 - image.shape[0])
        x_max, y_max = x_min + image.shape[0], y_min + image.shape[0]
        x_center = x_min + (x_max - x_min) / 2
        y_center = y_min + (y_max - y_min) / 2
        new_img[x_min:x_max, y_min:y_max] = image
        # the label consists of the digit and the center of the number
        label = [int(self.df.iloc[idx, 0]), x_center, y_center]
        sample = {"image": new_img, "label": label}
        return sample['image'], sample['label']
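Because the label is returned as a plain Python list, it may help to check what the default `DataLoader` collation does with it before it reaches the training loop. The sketch below uses a hypothetical `ToyLocalizationDataset` (not part of the original code) that returns samples of the same shape as the dataset above, and inspects one batch.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class ToyLocalizationDataset(Dataset):
    """Hypothetical stand-in returning (image, [digit, x_center, y_center])."""

    def __len__(self):
        return 8

    def __getitem__(self, idx):
        image = np.zeros((90, 90))                    # float64 array, no channel axis
        label = [idx % 10, 14.0 + idx, 20.0 + idx]    # int digit, float centers
        return image, label


loader = DataLoader(ToyLocalizationDataset(), batch_size=4)
X, y = next(iter(loader))

# The default collate function transposes the per-sample lists, so y is a
# list of three tensors, each with shape (batch_size,).
print(X.shape, X.dtype)
print(len(y), [t.shape for t in y], [t.dtype for t in y])
```

Under the default collation, the digit comes out as an integer tensor and the two centers as double-precision tensors of shape `(batch_size,)`, which is why the training loop can index `y[0]`, `y[1]`, `y[2]` and has to cast the coordinates with `.float()`.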
My training function is set up as follows:
import torch
from torch import nn

# `model` and `device` are defined elsewhere (not shown)
loss_fn = nn.CrossEntropyLoss()
loss_mse = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

def train(dataloader, model, loss_fn, loss_mse, optimizer):
    model.train()  # very important... This turns the model back to training mode
    size = len(dataloader.dataset)
    for batch, (X, y) in enumerate(dataloader):
        # y is a list of three tensors: digit, x_center, y_center
        X, y0, y1, y2 = X.to(device), y[0].to(device), y[1].to(device), y[2].to(device)
        pred = model(X.float())
        # DEFINE LOSS HERE -------
        loss = loss_fn(pred[0], y0) + loss_mse(pred[1], y1.float()) + loss_mse(pred[2], y2.float())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if batch % 100 == 0:
            loss, current = loss.item(), batch * len(X)
            print(f"loss: {loss:>7f} [{current:>5d}/{size:>5d}]")
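Adding a classification loss and regression losses together is a common pattern, but two details can quietly hurt training with a setup like this one. First, the coordinates range up to 90, so their squared error can be far larger than the cross-entropy term. Second, `nn.MSELoss` broadcasts when the prediction and target shapes differ, for example `(B, 1)` against `(B,)`, which produces a loss over a `(B, B)` grid instead of an error. The following is a minimal sketch of one way to guard against both, not a confirmed fix for the problem above. It assumes a model that returns digit logits of shape `(B, 10)` and a single coordinate tensor of shape `(B, 2)` in normalized units. The names `localization_loss`, `coord_weight`, and `IMG_SIZE` are hypothetical.

```python
import torch
from torch import nn

IMG_SIZE = 90.0  # side length of the canvas used by the dataset (assumed constant)

ce_loss = nn.CrossEntropyLoss()
mse_loss = nn.MSELoss()


def localization_loss(logits, xy_pred, digit, x_center, y_center, coord_weight=1.0):
    """Cross-entropy on the digit plus weighted MSE on centers scaled to [0, 1]."""
    # Stack the two coordinate targets into shape (B, 2) and normalize them
    xy_true = torch.stack([x_center, y_center], dim=1).float() / IMG_SIZE
    # Refuse mismatched shapes rather than letting MSELoss broadcast them
    if xy_pred.shape != xy_true.shape:
        raise ValueError(f"shape mismatch: {tuple(xy_pred.shape)} vs {tuple(xy_true.shape)}")
    return ce_loss(logits, digit) + coord_weight * mse_loss(xy_pred, xy_true)


# Usage with dummy tensors standing in for one batch of size 4
logits = torch.zeros(4, 10)                       # uniform class scores
xy_pred = torch.full((4, 2), 0.5)                 # predicted centers, normalized
digit = torch.tensor([0, 1, 2, 3])
x_center = torch.full((4,), 45.0, dtype=torch.float64)
y_center = torch.full((4,), 45.0, dtype=torch.float64)

loss = localization_loss(logits, xy_pred, digit, x_center, y_center)
print(loss.item())
```

The weight `coord_weight` is a tunable assumption. Whether normalization or reweighting actually improves results here depends on the model, which is not shown in the question.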