Problem description
I am trying to train a Mask R-CNN image segmentation model on a custom dataset in MS-COCO format.
I am trying to use polygon masks as input, but I cannot get them into a format my model accepts.
My data looks like this:
{"id": 145010, "image_id": 101953, "category_id": 1040,
"segmentation": [[140.0,352.5,131.0,351.5,118.0,344.5,101.50000000000001,323.0,94.5,303.0,86.5,292.0,52.0,263.5,35.0,255.5,20.5,240.0,11.5,214.0,14.5,190.0,22.0,179.5,53.99999999999999,170.5,76.0,158.5,88.5,129.0,100.5,111.0,152.0,70.5,175.0,65.5,217.0,64.5,272.0,48.5,296.0,56.49999999999999,320.5,82.0,350.5,135.0,374.5,163.0,382.5,190.0,381.5,205.99999999999997,376.5,217.0,371.0,221.5,330.0,229.50000000000003,312.5,240.0,310.5,291.0,302.5,310.0,288.0,326.5,259.0,337.5,208.0,339.5,171.0,349.5]],
"area": 73578.0,
"bbox": [11.5, 11.5, 341.0, 371.0],
"iscrowd": 0}
There is a single object in this image, so there is one entry each for segmentation and bbox. The segmentation values are the pixel coordinates of the polygon, so the list has a different length for each object.
Can anyone help me with this?
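(For reference, in COCO the bbox field is [x, y, width, height], normally the tight axis-aligned box around the segmentation polygon. A small made-up helper illustrates the relationship between the flat [x0, y0, x1, y1, ...] polygon list and the box:)

```python
def polygon_to_bbox(polygon):
    """Compute a COCO-style [x, y, width, height] box from a flat
    [x0, y0, x1, y1, ...] polygon list. Illustrative helper only."""
    xs = polygon[0::2]  # even indices hold x coordinates
    ys = polygon[1::2]  # odd indices hold y coordinates
    x, y = min(xs), min(ys)
    return [x, y, max(xs) - x, max(ys) - y]

print(polygon_to_bbox([10.0, 20.0, 30.0, 25.0, 15.0, 40.0]))
# [10.0, 20.0, 20.0, 20.0]
```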
Solution
To work with COCO-format datasets you can use this repo. It provides classes that can be instantiated from the annotation file, making it very easy to use and to access the data.
I don't know which implementation you are using, but if it is similar to this tutorial, this code should at least give you some ideas about how to solve your problem:
import os

import torch
from PIL import Image
from pycocotools.coco import COCO


class CocoDataset(torch.utils.data.Dataset):
    def __init__(self, dataset_dir, subset, transforms):
        dataset_path = os.path.join(dataset_dir, subset)
        ann_file = os.path.join(dataset_path, "annotation.json")
        self.imgs_dir = os.path.join(dataset_path, "images")
        self.coco = COCO(ann_file)
        self.img_ids = self.coco.getImgIds()
        self.transforms = transforms

    def __getitem__(self, idx):
        '''
        Args:
            idx: index of the sample to be fed
        Returns:
            a tuple of:
            - PIL Image of shape (H, W)
            - target (dict) containing:
                - boxes: FloatTensor[N, 4], N being the number of instances,
                  with bounding box coordinates in [x0, y0, x1, y1] format,
                  ranging from 0 to W and 0 to H;
                - labels: Int64Tensor[N], class label (0 is background);
                - image_id: Int64Tensor[1], unique id for each image;
                - area: Tensor[N], area of each bbox;
                - iscrowd: UInt8Tensor[N], True or False;
                - masks: UInt8Tensor[N, H, W], segmentation maps;
        '''
        img_id = self.img_ids[idx]
        img_obj = self.coco.loadImgs(img_id)[0]
        anns_obj = self.coco.loadAnns(self.coco.getAnnIds(imgIds=img_id))
        img = Image.open(os.path.join(self.imgs_dir, img_obj['file_name']))
        # list comprehensions may be slow here; consider vectorizing if needed
        # COCO boxes are [x, y, w, h]; convert to the [x0, y0, x1, y1]
        # format the model expects
        bboxes = [[b[0], b[1], b[0] + b[2], b[1] + b[3]]
                  for b in (ann['bbox'] for ann in anns_obj)]
        masks = [self.coco.annToMask(ann) for ann in anns_obj]
        areas = [ann['area'] for ann in anns_obj]
        boxes = torch.as_tensor(bboxes, dtype=torch.float32)
        # every instance is labelled as class 1 here; use ann['category_id']
        # instead if you have more than one class
        labels = torch.ones(len(anns_obj), dtype=torch.int64)
        masks = torch.as_tensor(masks, dtype=torch.uint8)
        image_id = torch.tensor([idx])
        area = torch.as_tensor(areas)
        iscrowd = torch.zeros(len(anns_obj), dtype=torch.int64)
        target = {}
        target["boxes"] = boxes
        target["labels"] = labels
        target["masks"] = masks
        target["image_id"] = image_id
        target["area"] = area
        target["iscrowd"] = iscrowd
        if self.transforms is not None:
            img, target = self.transforms(img, target)
        return img, target

    def __len__(self):
        return len(self.img_ids)
Again, this is only a draft meant to give you some hints.
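One more sketch, under the assumption you are training a torchvision-style Mask R-CNN: since images and targets in a batch have different shapes and cannot be stacked into one tensor, you need a custom collate function when wrapping the dataset in a DataLoader. The dataset paths below are hypothetical:

```python
import torch
from torch.utils.data import DataLoader

def collate_fn(batch):
    # keep images and targets as tuples instead of stacking them into tensors
    return tuple(zip(*batch))

# Hypothetical usage; "data"/"train" stand in for your real paths:
# dataset = CocoDataset("data", "train", transforms=None)
# loader = DataLoader(dataset, batch_size=2, shuffle=True, collate_fn=collate_fn)
# images, targets = next(iter(loader))  # tuples of images and target dicts
```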