Problem description
How do I find the values to pass to the transforms.Normalize function? Also, where exactly in my code should I perform the normalization?
Since normalizing a dataset is such a well-known task, I expected there would be some script to automate it, but at least I could not find one on the PyTorch forums.
transformed_dataset = MothLandmarksDataset(csv_file='moth_gt.csv', root_dir='.',
                                           transform=transforms.Compose([
                                               Rescale(256),
                                               RandomCrop(224),
                                               transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                                                    std=[0.229, 0.224, 0.225]),
                                               ToTensor()
                                           ]))
for i in range(len(transformed_dataset)):
    sample = transformed_dataset[i]
    print(i, sample['image'].size(), sample['landmarks'].size())
    if i == 3:
        break
I know these particular values do not pertain to my dataset but to ImageNet, yet even with them I actually get an error:
TypeError Traceback (most recent call last)
<ipython-input-81-eb8dc46e0284> in <module>
10
11 for i in range(len(transformed_dataset)):
---> 12 sample = transformed_dataset[i]
13
14 print(i,sample['landmarks'].size())
<ipython-input-48-9d04158922fb> in __getitem__(self,idx)
30
31 if self.transform:
---> 32 sample = self.transform(sample)
33
34 return sample
~/anaconda3/lib/python3.7/site-packages/torchvision/transforms/transforms.py in __call__(self,img)
59 def __call__(self,img):
60 for t in self.transforms:
---> 61 img = t(img)
62 return img
63
~/anaconda3/lib/python3.7/site-packages/torchvision/transforms/transforms.py in __call__(self,tensor)
210 Tensor: normalized Tensor image.
211 """
--> 212 return F.normalize(tensor,self.mean,self.std,self.inplace)
213
214 def __repr__(self):
~/anaconda3/lib/python3.7/site-packages/torchvision/transforms/functional.py in normalize(tensor,mean,std,inplace)
278 """
279 if not torch.is_tensor(tensor):
--> 280 raise TypeError('tensor should be a torch tensor. Got {}.'.format(type(tensor)))
281
282 if tensor.ndimension() != 3:
TypeError: tensor should be a torch tensor. Got <class 'dict'>.
So basically three questions:
- How do I find values similar to the ImageNet mean and std for my own custom dataset?
- How and where do I pass those values? I assume I should do it in the transforms.Compose method, but I might be wrong.
- I assume I should apply Normalize to my entire dataset, not just the training set, am I right?
Update:
Trying the solution suggested here did not work for me: https://discuss.pytorch.org/t/about-normalization-using-pre-trained-vgg16-networks/23560/6?u=mona_jalal
mean = 0.
std = 0.
nb_samples = 0.
for data in DataLoader:
    print(type(data))
    batch_samples = data.size(0)
    data.shape(0)
    data = data.view(batch_samples, data.size(1), -1)
    mean += data.mean(2).sum(0)
    std += data.std(2).sum(0)
    nb_samples += batch_samples
mean /= nb_samples
std /= nb_samples
The error is:
<class 'dict'>
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-51-e8ba3c8718bb> in <module>
5 for data in DataLoader:
6 print(type(data))
----> 7 batch_samples = data.size(0)
8
9 data.shape(0)
AttributeError: 'dict' object has no attribute 'size'
Here is the result of print(data):
{'image': tensor([[[[0.2961, 0.2941,  ..., 0.2456, 0.2431],
                    ...]]], dtype=torch.float64),
 'landmarks': tensor([[[160.2964,  98.7339],
                       [223.0788,  72.5067],
                       [ 82.4163,  70.3733],
                       [152.3213, 137.7867]],
                      ...], dtype=torch.float64)}
DataLoader = DataLoader(transformed_dataset, batch_size=3, shuffle=True, num_workers=4)
and
transformed_dataset = MothLandmarksDataset(csv_file='moth_gt.csv',
                                           transform=transforms.Compose([
                                               Rescale(256),
                                               ToTensor()
                                               # transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                               #                      std=[0.229, 0.224, 0.225])
                                           ]))
and
class MothLandmarksDataset(Dataset):
    """Face Landmarks dataset."""

    def __init__(self, csv_file, root_dir, transform=None):
        """
        Args:
            csv_file (string): Path to the csv file with annotations.
            root_dir (string): Directory with all the images.
            transform (callable, optional): Optional transform to be applied
                on a sample.
        """
        self.landmarks_frame = pd.read_csv(csv_file)
        self.root_dir = root_dir
        self.transform = transform

    def __len__(self):
        return len(self.landmarks_frame)

    def __getitem__(self, idx):
        if torch.is_tensor(idx):
            idx = idx.tolist()
        img_name = os.path.join(self.root_dir, self.landmarks_frame.iloc[idx, 0])
        image = io.imread(img_name)
        landmarks = self.landmarks_frame.iloc[idx, 1:]
        landmarks = np.array([landmarks])
        landmarks = landmarks.astype('float').reshape(-1, 2)
        sample = {'image': image, 'landmarks': landmarks}
        if self.transform:
            sample = self.transform(sample)
        return sample
Solution
Error in the source code
"How and where do I pass those values? I assume I should do it in the transforms.Compose method, but I might be wrong."

No wonder it does not work: inside MothLandmarksDataset you are trying to pass a dict (the sample) to torchvision.transforms, which require either a torch.Tensor or a PIL.Image as input. Exactly here:

sample = {'image': image, 'landmarks': landmarks}
if self.transform:
    sample = self.transform(sample)

You could pass sample["image"] into the transform, although you shouldn't: applying the operation only to sample["image"] would break its relation to the landmarks. What you should be after is something like the albumentations library (see here), which can transform the image and landmarks in the same way and so keep the relation between them.

Also, there is no Rescale transform in torchvision; maybe you meant Resize?
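If only pixel-value normalization (not geometric transforms) is needed, one workaround is a dict-aware wrapper that normalizes sample['image'] and passes the landmarks through unchanged. The sketch below uses a hypothetical SampleNormalize helper name, not anything from torchvision, and implements the standard (x - mean) / std per-channel normalization by hand:

```python
import torch

class SampleNormalize:
    """Hypothetical helper: apply per-channel (x - mean) / std to
    sample['image'] only, leaving sample['landmarks'] untouched."""

    def __init__(self, mean, std):
        # reshape to (C, 1, 1) so broadcasting works on a (C, H, W) image
        self.mean = torch.tensor(mean).view(-1, 1, 1)
        self.std = torch.tensor(std).view(-1, 1, 1)

    def __call__(self, sample):
        image, landmarks = sample['image'], sample['landmarks']
        image = (image - self.mean) / self.std
        return {'image': image, 'landmarks': landmarks}

# toy 3-channel image of ones and dummy landmarks
sample = {'image': torch.ones(3, 2, 2), 'landmarks': torch.zeros(4, 2)}
norm = SampleNormalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
out = norm(sample)
print(out['image'][0, 0, 0].item())  # (1 - 0.5) / 0.5 = 1.0
```

Note this only fixes the dict-vs-tensor type error; it still does nothing for geometric transforms like cropping, where the landmarks would have to be moved together with the image.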
Mean and std for normalization
The code you provided is fine, but you have to unpack the data into torch.Tensor first, like this:
mean = 0.0
std = 0.0
nb_samples = 0.0
for data in dataloader:
    images, landmarks = data["image"], data["landmarks"]
    batch_samples = images.size(0)
    images_data = images.view(batch_samples, images.size(1), -1)
    mean += images_data.mean(2).sum(0)
    std += images_data.std(2).sum(0)
    nb_samples += batch_samples
mean /= nb_samples
std /= nb_samples
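To make that loop concrete, here is a self-contained run on synthetic data (ToySamples is a hypothetical stand-in for MothLandmarksDataset). For pixels drawn uniformly from [0, 1), the per-channel mean should land near 0.5 and the std near 1/sqrt(12) ≈ 0.289:

```python
import torch
from torch.utils.data import DataLoader, Dataset

class ToySamples(Dataset):
    """Hypothetical stand-in yielding the same dict layout as the real dataset."""
    def __len__(self):
        return 8
    def __getitem__(self, idx):
        g = torch.Generator().manual_seed(idx)  # deterministic per-item data
        return {'image': torch.rand(3, 16, 16, generator=g),
                'landmarks': torch.zeros(4, 2)}

loader = DataLoader(ToySamples(), batch_size=4)
mean = torch.zeros(3)
std = torch.zeros(3)
nb_samples = 0
for data in loader:
    images = data['image']                                # (B, C, H, W)
    batch_samples = images.size(0)
    images = images.view(batch_samples, images.size(1), -1)  # (B, C, H*W)
    mean += images.mean(2).sum(0)  # sum of per-image channel means
    std += images.std(2).sum(0)    # sum of per-image channel stds
    nb_samples += batch_samples
mean /= nb_samples
std /= nb_samples
print(mean, std)
```

Note this averages per-image statistics, which is a common approximation; for an exact dataset-wide std you would accumulate sums and squared sums over all pixels instead.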
"How and where do I pass those values? I assume I should do it in the transforms.Compose method, but I might be wrong."

Those values should be passed only to torchvision.transforms.Normalize, applied only to sample["image"].
"I assume I should apply Normalize to my entire dataset, not just the training set, am I right?"

You should calculate the normalization values across the training dataset, and apply those computed values to the validation and test sets as well.