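Mask R-CNN produces no output on a custom dataset

Problem description

I am running Mask R-CNN (https://github.com/matterport/Mask_RCNN) on a custom dataset, which I annotated with COCO Annotator and exported as a .json file.

The .py script runs, but no model output (.h5 files in the logs directory) is produced. I can't tell what I'm doing wrong. I also tried this in a Google Colab notebook and hit the same problem. My output is below, followed by my code. Any help would be great, thanks!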

Using TensorFlow backend.

Weights:  coco

Dataset:  /samples/lithic/dataset

Logs:  /Users/Mask_RCNN/logs

Configurations:
BACKBONE resnet101
BACKBONE_STRIDES [4, 8, 16, 32, 64]
BATCH_SIZE 2
BBOX_STD_DEV [0.1 0.1 0.2 0.2]
COMPUTE_BACKBONE_SHAPE None
DETECTION_MAX_INSTANCES 100
DETECTION_MIN_CONFIDENCE 0.9
DETECTION_NMS_THRESHOLD 0.3
FPN_CLASSIF_FC_LAYERS_SIZE 1024
GPU_COUNT 1
GRADIENT_CLIP_NORM 5.0
IMAGES_PER_GPU 2
IMAGE_CHANNEL_COUNT 3
IMAGE_MAX_DIM 1024
IMAGE_META_SIZE 14
IMAGE_MIN_DIM 800
IMAGE_MIN_SCALE 0
IMAGE_RESIZE_MODE square
IMAGE_SHAPE [1024 1024 3]
LEARNING_MOMENTUM 0.9
LEARNING_RATE 0.001
LOSS_WEIGHTS {'rpn_class_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_bbox_loss': 1.0, 'mrcnn_mask_loss': 1.0}
MASK_POOL_SIZE 14
MASK_SHAPE [28, 28]
MAX_GT_INSTANCES 100
MEAN_PIXEL [123.7 116.8 103.9]
MINI_MASK_SHAPE (56, 56)
NAME scar
NUM_CLASSES 2
POOL_SIZE 7
POST_NMS_ROIS_INFERENCE 1000
POST_NMS_ROIS_TRAINING 2000
PRE_NMS_LIMIT 6000
ROI_POSITIVE_RATIO 0.33
RPN_ANCHOR_RATIOS [0.5, 1, 2]
RPN_ANCHOR_SCALES (32, 64, 128, 256, 512)
RPN_ANCHOR_STRIDE 1
RPN_BBOX_STD_DEV [0.1 0.1 0.2 0.2]
RPN_NMS_THRESHOLD 0.7
RPN_TRAIN_ANCHORS_PER_IMAGE 256
STEPS_PER_EPOCH 100
TOP_DOWN_PYRAMID_SIZE 256
TRAIN_BN False
TRAIN_ROIS_PER_IMAGE 200
USE_MINI_MASK True
USE_RPN_ROIS True
VALIDATION_STEPS 50
WEIGHT_DECAY 0.0001

WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:442: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:58: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.

WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:3543: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.

WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:3386: The name tf.nn.max_pool is deprecated. Please use tf.nn.max_pool2d instead.

WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:1768: The name tf.image.resize_nearest_neighbor is deprecated. Please use tf.compat.v1.image.resize_nearest_neighbor instead.

WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:1154: calling reduce_max_v1 (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version. Instructions for updating: keep_dims is deprecated, use keepdims instead

WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:1188: calling reduce_sum_v1 (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version. Instructions for updating: keep_dims is deprecated, use keepdims instead

WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/tensorflow_core/python/ops/array_ops.py:1475: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.where in 2.0, which has the same broadcast rule as np.where

WARNING:tensorflow:From /Users/Mask_RCNN/samples/lithic/mrcnn/model.py:553: The name tf.random_shuffle is deprecated. Please use tf.random.shuffle instead.

WARNING:tensorflow:From /Users/Mask_RCNN/samples/lithic/mrcnn/utils.py:202: The name tf.log is deprecated. Please use tf.math.log instead.

WARNING:tensorflow:From /Users/Mask_RCNN/samples/lithic/mrcnn/model.py:600: calling crop_and_resize_v1 (from tensorflow.python.ops.image_ops_impl) with box_ind is deprecated and will be removed in a future version. Instructions for updating: box_ind is deprecated, use box_indices instead

Loading weights  /Users/Mask_RCNN/mask_rcnn_coco.h5

WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:153: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.

WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:158: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:163: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

2021-03-05 19:07:53.441166: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2021-03-05 19:07:53.535883: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fe9f5f26f20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-03-05 19:07:53.535913: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version

WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:333: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:341: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.

My code:

"""
Mask R-CNN


Copyright (c) 2018 Matterport, Inc.
Licensed under the MIT License (see LICENSE for details)
Written by Waleed Abdulla

------------------------------------------------------------


    # Train a new model starting from pre-trained COCO weights
    python3 lithicRCNN.py train --dataset=/samples/lithic/dataset --weights=coco

    # Resume training a model that you had trained earlier
    python3 lithicRCNN.py train --dataset=/path/to/lithic/dataset --weights=last

    # Train a new model starting from ImageNet weights
    python3 lithicRCNN.py train --dataset=/path/to/lithic/dataset --weights=imagenet

    # Apply color splash to an image
    python3 lithicRCNN.py splash --weights=/path/to/weights/file.h5 --image=<URL or path to file>

    # Apply color splash to video using the last weights you trained
    python3 lithicRCNN.py splash --weights=last --video=<URL or path to file>
"""

import os
import sys
import json
import datetime
import numpy as np
import skimage.color
import skimage.draw
import skimage.io

# Root directory of the project
ROOT_DIR = os.path.abspath("/Users/Mask_RCNN/")

# Import Mask RCNN
sys.path.append(ROOT_DIR)  # To find local version of the library
from mrcnn.config import Config
from mrcnn import model as modellib, utils

# Path to trained weights file
COCO_WEIGHTS_PATH = os.path.join(ROOT_DIR, "mask_rcnn_coco.h5")

# Directory to save logs and model checkpoints, if not provided
# through the command line argument --logs
DEFAULT_LOGS_DIR = os.path.join(ROOT_DIR, "logs")


############################################################
#  Configurations
############################################################


class CustomConfig(Config):
    """Configuration for training on the custom lithic dataset.
    Derives from the base Config class and overrides some values.
    """
    # Give the configuration a recognizable name
    NAME = "scar"

    # We use a GPU with 12GB memory,which can fit two images.
    # Adjust down if you use a smaller GPU.
    IMAGES_PER_GPU = 2

    # Number of classes (including background)
    NUM_CLASSES = 1 + 1  # Background + lithic

    # Number of training steps per epoch
    STEPS_PER_EPOCH = 100

    # Skip detections with < 90% confidence
    DETECTION_MIN_CONFIDENCE = 0.9


############################################################
#  Dataset
############################################################

class CustomDataset(utils.Dataset):

    def load_custom(self, dataset_dir, subset):
        """Load a subset of the lithic dataset.
        dataset_dir: Root directory of the dataset.
        subset: Subset to load: train or val
        """
        # Add classes. We have only one class to add.
        # utils.Dataset.add_class takes (source, class_id, class_name).
        self.add_class("scar", 1, "scar")

        # Train or validation dataset?
        assert subset in ["train", "val"]
        dataset_dir = os.path.join(dataset_dir, subset)

        # Load annotations
        # VGG Image Annotator (up to version 1.6) saves each image in the form:
        # { 'filename': '28503151_5b5b7ec140_b.jpg',
        #   'regions': {
        #       '0': {
        #           'region_attributes': {},
        #           'shape_attributes': {
        #               'all_points_x': [...],
        #               'all_points_y': [...],
        #               'name': 'polygon'}},
        #       ... more regions ...
        #   },
        #   'size': 100202
        # }
        # We mostly care about the x and y coordinates of each region
        # Note: In VIA 2.0, regions was changed from a dict to a list.
        with open(os.path.join(dataset_dir, "via_region_data.json")) as f:
            annotations = json.load(f)
        annotations = list(annotations.values())  # don't need the dict keys

        # The VIA tool saves images in the JSON even if they don't have any
        # annotations. Skip unannotated images.
        annotations = [a for a in annotations if a['regions']]

        # Add images
        for a in annotations:
            # Get the x, y coordinates of points of the polygons that make up
            # the outline of each object instance. These are stored in the
            # shape_attributes (see json format above)
            # The if condition is needed to support VIA versions 1.x and 2.x.
            if type(a['regions']) is dict:
                polygons = [r['shape_attributes'] for r in a['regions'].values()]
            else:
                polygons = [r['shape_attributes'] for r in a['regions']]

            # load_mask() needs the image size to convert polygons to masks.
            # Unfortunately, VIA doesn't include it in JSON, so we must read
            # the image. This is only manageable since the dataset is tiny.
            image_path = os.path.join(dataset_dir, a['filename'])
            image = skimage.io.imread(image_path)
            height, width = image.shape[:2]

            self.add_image(
                "scar",
                image_id=a['filename'],  # use file name as a unique image id
                path=image_path,
                width=width, height=height,
                polygons=polygons)

    def load_mask(self, image_id):
        """Generate instance masks for an image.
        Returns:
        masks: A bool array of shape [height, width, instance count] with
            one mask per instance.
        class_ids: a 1D array of class IDs of the instance masks.
        """
        # If not a lithic dataset image, delegate to parent class.
        image_info = self.image_info[image_id]
        if image_info["source"] != "scar":
            return super(self.__class__, self).load_mask(image_id)

        # Convert polygons to a bitmap mask of shape
        # [height, width, instance_count]
        info = self.image_info[image_id]
        mask = np.zeros([info["height"], info["width"], len(info["polygons"])],
                        dtype=np.uint8)
        for i, p in enumerate(info["polygons"]):
            # Get indexes of pixels inside the polygon and set them to 1
            rr, cc = skimage.draw.polygon(p['all_points_y'], p['all_points_x'])
            mask[rr, cc, i] = 1

        # Return mask, and array of class IDs of each instance. Since we have
        # one class ID only, we return an array of 1s.
        # The builtin bool is used because the np.bool alias was removed
        # in newer NumPy releases.
        return mask.astype(bool), np.ones([mask.shape[-1]], dtype=np.int32)

    def image_reference(self, image_id):
        """Return the path of the image."""
        info = self.image_info[image_id]
        if info["source"] == "scar":
            return info["path"]
        else:
            return super(self.__class__, self).image_reference(image_id)


def train(model):
    """Train the model."""
    # Training dataset.
    dataset_train = CustomDataset()
    dataset_train.load_custom(args.dataset,"train")
    dataset_train.prepare()

    # Validation dataset
    dataset_val = CustomDataset()
    dataset_val.load_custom(args.dataset,"val")
    dataset_val.prepare()

    # *** This training schedule is an example. Update to your needs ***
    # Since we're using a very small dataset, and starting from
    # COCO trained weights, we don't need to train too long. Also,
    # no need to train all layers, just the heads should do it.
    print("Training network heads")
    model.train(dataset_train, dataset_val,
                learning_rate=config.LEARNING_RATE,
                epochs=30,
                layers='heads')


def color_splash(image, mask):
    """Apply color splash effect.
    image: RGB image [height, width, 3]
    mask: instance segmentation mask [height, width, instance count]

    Returns result image.
    """
    # Make a grayscale copy of the image. The grayscale copy still
    # has 3 RGB channels,though.
    gray = skimage.color.gray2rgb(skimage.color.rgb2gray(image)) * 255
    # copy color pixels from the original color image where mask is set
    if mask.shape[-1] > 0:
        # We're treating all instances as one,so collapse the mask into one layer
        mask = (np.sum(mask,-1,keepdims=True) >= 1)
        splash = np.where(mask,image,gray).astype(np.uint8)
    else:
        splash = gray.astype(np.uint8)
    return splash


def detect_and_color_splash(model,image_path=None,video_path=None):
    assert image_path or video_path

    # Image or video?
    if image_path:
        # Run model detection and generate the color splash effect
        print("Running on {}".format(args.image))
        # Read image
        image = skimage.io.imread(args.image)
        # Detect objects
        r = model.detect([image],verbose=1)[0]
        # Color splash
        splash = color_splash(image,r['masks'])
        # Save output
        file_name = "splash_{:%Y%m%dT%H%M%S}.png".format(datetime.datetime.now())
        skimage.io.imsave(file_name,splash)
    elif video_path:
        import cv2
        # Video capture
        vcapture = cv2.VideoCapture(video_path)
        width = int(vcapture.get(cv2.CAP_PROP_FRAME_WIDTH))
        height = int(vcapture.get(cv2.CAP_PROP_FRAME_HEIGHT))
        fps = vcapture.get(cv2.CAP_PROP_FPS)

        # Define codec and create video writer
        file_name = "splash_{:%Y%m%dT%H%M%S}.avi".format(datetime.datetime.now())
        vwriter = cv2.VideoWriter(file_name,cv2.VideoWriter_fourcc(*'MJPG'),fps,(width,height))

        count = 0
        success = True
        while success:
            print("frame: ",count)
            # Read next image
            success,image = vcapture.read()
            if success:
                # OpenCV returns images as BGR,convert to RGB
                image = image[...,::-1]
                # Detect objects
                r = model.detect([image],verbose=0)[0]
                # Color splash
                splash = color_splash(image,r['masks'])
                # RGB -> BGR to save image to video
                splash = splash[...,::-1]
                # Add image to video writer
                vwriter.write(splash)
                count += 1
        vwriter.release()
    print("Saved to ",file_name)


############################################################
#  Training
############################################################

if __name__ == '__main__':
    import argparse

    # Parse command line arguments
    parser = argparse.ArgumentParser(
        description='Train Mask R-CNN to detect lithic scars.')
    parser.add_argument("command", metavar="<command>", help="'train' or 'splash'")
    parser.add_argument('--dataset', required=False, metavar="/path/to/lithic/dataset/", help='Directory of the Lithic dataset')
    parser.add_argument('--weights', required=True, metavar="/path/to/weights.h5", help="Path to weights .h5 file or 'coco'")
    parser.add_argument('--logs', default=DEFAULT_LOGS_DIR, metavar="/path/to/logs/", help='Logs and checkpoints directory (default=logs/)')
    parser.add_argument('--image', metavar="path or URL to image", help='Image to apply the color splash effect on')
    parser.add_argument('--video', metavar="path or URL to video", help='Video to apply the color splash effect on')
    args = parser.parse_args()

    # Validate arguments
    if args.command == "train":
        assert args.dataset,"Argument --dataset is required for training"
    elif args.command == "splash":
        assert args.image or args.video,\
               "Provide --image or --video to apply color splash"

    print("Weights: ",args.weights)
    print("Dataset: ",args.dataset)
    print("Logs: ",args.logs)

    # Configurations
    if args.command == "train":
        config = CustomConfig()
    else:
        class InferenceConfig(CustomConfig):
            # Set batch size to 1 since we'll be running inference on
            # one image at a time. Batch size = GPU_COUNT * IMAGES_PER_GPU
            GPU_COUNT = 1
            IMAGES_PER_GPU = 1
        config = InferenceConfig()
    config.display()

    # Create model
    if args.command == "train":
        model = modellib.MaskRCNN(mode="training",config=config,model_dir=args.logs)
    else:
        model = modellib.MaskRCNN(mode="inference", config=config, model_dir=args.logs)

    # Select weights file to load
    if args.weights.lower() == "coco":
        weights_path = COCO_WEIGHTS_PATH
        # Download weights file
        if not os.path.exists(weights_path):
            utils.download_trained_weights(weights_path)
    elif args.weights.lower() == "last":
        # Find last trained weights
        weights_path = model.find_last()
    elif args.weights.lower() == "imagenet":
        # Start from ImageNet trained weights
        weights_path = model.get_imagenet_weights()
    else:
        weights_path = args.weights

    # Load weights
    print("Loading weights ",weights_path)
    if args.weights.lower() == "coco":
        # Exclude the last layers because they require a matching
        # number of classes
        model.load_weights(weights_path, by_name=True, exclude=[
            "mrcnn_class_logits", "mrcnn_bbox_fc", "mrcnn_bbox", "mrcnn_mask"])
    else:
        model.load_weights(weights_path,by_name=True)

    # # Train or evaluate
    # if args.command == "train":
    #     train(model)
    # elif args.command == "splash":
    #     detect_and_color_splash(model, image_path=args.image,
    #                             video_path=args.video)
    # else:
    #     print("'{}' is not recognized. "
    #           "Use 'train' or 'splash'".format(args.command))

Solution

No confirmed solution to this problem has been found yet.
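
One thing worth checking in the script as posted: the final "Train or evaluate" block is commented out, so after the weights are loaded nothing calls train() and the script simply exits. That would be consistent with the log above, which ends right after the weight-loading messages. The sketch below shows the dispatch that block is meant to perform. It is only an illustration: dispatch, train_fn and splash_fn are hypothetical names introduced here so the example runs without Mask R-CNN installed.

```python
import argparse


def dispatch(args, model, train_fn, splash_fn):
    """Run the requested command; return its name, or None if unrecognized.

    train_fn and splash_fn stand in for the script's train() and
    detect_and_color_splash(); they are passed in so the sketch runs alone.
    """
    if args.command == "train":
        train_fn(model)
        return "train"
    if args.command == "splash":
        splash_fn(model, image_path=args.image, video_path=args.video)
        return "splash"
    print("'{}' is not recognized. Use 'train' or 'splash'".format(args.command))
    return None


# Stand-in objects (assumptions for the demo, not part of the original script).
calls = []
demo_args = argparse.Namespace(command="train", image=None, video=None)
result = dispatch(
    demo_args,
    model="dummy-model",
    train_fn=lambda m: calls.append(("train", m)),
    splash_fn=lambda m, image_path=None, video_path=None: calls.append(("splash", m)),
)
print(result, calls)
```

In the actual script, uncommenting the original block should have the same effect. Separately, load_custom() reads a VIA-style via_region_data.json, while the question mentions COCO Annotator, so the annotation format may also need checking once training actually starts.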

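
To confirm whether a training run wrote anything, it can help to search the logs directory recursively, since the checkpoints appear to be written into a per-run subdirectory rather than directly into logs/. The helper below is a minimal sketch using only the standard library; list_checkpoints is a hypothetical name, and the directory and file names in the demo are made up for illustration.

```python
import tempfile
from pathlib import Path


def list_checkpoints(logs_dir):
    """Return every .h5 file found anywhere under logs_dir, sorted by path."""
    return sorted(Path(logs_dir).rglob("*.h5"))


# Self-contained demo on a throwaway directory (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp) / "scar20210305T1907"
    run_dir.mkdir()
    (run_dir / "mask_rcnn_scar_0001.h5").touch()
    (run_dir / "events.out.tfevents.demo").touch()
    found = [p.name for p in list_checkpoints(tmp)]

print(found)
```

Against the real setup, the call would be list_checkpoints("/Users/Mask_RCNN/logs"); an empty list after a run would suggest that training never started or never finished an epoch.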

conv-neural-network faster-rcnn object-detection python
