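Problem description
I am running Mask R-CNN (https://github.com/matterport/Mask_RCNN) on a custom dataset that I annotated with COCO Annotator and exported as a .json file.
The .py script runs, but no model is written (no .h5 file appears in the logs directory). I can't tell what I am doing wrong. I also tried this in a Google Colab notebook and hit the same problem. My output is below, followed by my code. Any help would be great, thanks!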
Using TensorFlow backend.
Weights:  coco
Dataset:  /samples/lithic/dataset
Logs:  /Users/Mask_RCNN/logs
Configurations:
BACKBONE                       resnet101
BACKBONE_STRIDES               [4, 8, 16, 32, 64]
BATCH_SIZE                     2
BBOX_STD_DEV                   [0.1 0.1 0.2 0.2]
COMPUTE_BACKBONE_SHAPE         None
DETECTION_MAX_INSTANCES        100
DETECTION_MIN_CONFIDENCE       0.9
DETECTION_NMS_THRESHOLD        0.3
FPN_CLASSIF_FC_LAYERS_SIZE     1024
GPU_COUNT                      1
GRADIENT_CLIP_NORM             5.0
IMAGES_PER_GPU                 2
IMAGE_CHANNEL_COUNT            3
IMAGE_MAX_DIM                  1024
IMAGE_META_SIZE                14
IMAGE_MIN_DIM                  800
IMAGE_MIN_SCALE                0
IMAGE_RESIZE_MODE              square
IMAGE_SHAPE                    [1024 1024 3]
LEARNING_MOMENTUM              0.9
LEARNING_RATE                  0.001
LOSS_WEIGHTS                   {'rpn_class_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_bbox_loss': 1.0, 'mrcnn_mask_loss': 1.0}
MASK_POOL_SIZE                 14
MASK_SHAPE                     [28, 28]
MAX_GT_INSTANCES               100
MEAN_PIXEL                     [123.7 116.8 103.9]
MINI_MASK_SHAPE                (56, 56)
NAME                           scar
NUM_CLASSES                    2
POOL_SIZE                      7
POST_NMS_ROIS_INFERENCE        1000
POST_NMS_ROIS_TRAINING         2000
PRE_NMS_LIMIT                  6000
ROI_POSITIVE_RATIO             0.33
RPN_ANCHOR_RATIOS              [0.5, 1, 2]
RPN_ANCHOR_SCALES              (32, 64, 128, 256, 512)
RPN_ANCHOR_STRIDE              1
RPN_BBOX_STD_DEV               [0.1 0.1 0.2 0.2]
RPN_NMS_THRESHOLD              0.7
RPN_TRAIN_ANCHORS_PER_IMAGE    256
STEPS_PER_EPOCH                100
TOP_DOWN_PYRAMID_SIZE          256
TRAIN_BN                       False
TRAIN_ROIS_PER_IMAGE           200
USE_MINI_MASK                  True
USE_RPN_ROIS                   True
VALIDATION_STEPS               50
WEIGHT_DECAY                   0.0001
WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:442: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:58: The name tf.get_default_graph is deprecated. Please use tf.compat.v1.get_default_graph instead.
WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:3543: The name tf.random_uniform is deprecated. Please use tf.random.uniform instead.
WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:3386: The name tf.nn.max_pool is deprecated. Please use tf.nn.max_pool2d instead.
WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:1768: The name tf.image.resize_nearest_neighbor is deprecated. Please use tf.compat.v1.image.resize_nearest_neighbor instead.
WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:1154: calling reduce_max_v1 (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:1188: calling reduce_sum_v1 (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/tensorflow_core/python/ops/array_ops.py:1475: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From /Users/Mask_RCNN/samples/lithic/mrcnn/model.py:553: The name tf.random_shuffle is deprecated. Please use tf.random.shuffle instead.
WARNING:tensorflow:From /Users/Mask_RCNN/samples/lithic/mrcnn/utils.py:202: The name tf.log is deprecated. Please use tf.math.log instead.
WARNING:tensorflow:From /Users/Mask_RCNN/samples/lithic/mrcnn/model.py:600: calling crop_and_resize_v1 (from tensorflow.python.ops.image_ops_impl) with box_ind is deprecated and will be removed in a future version.
Instructions for updating:
box_ind is deprecated, use box_indices instead
Loading weights  /Users/Mask_RCNN/mask_rcnn_coco.h5
WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:153: The name tf.get_default_session is deprecated. Please use tf.compat.v1.get_default_session instead.
WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:158: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.
WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:163: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.
2021-03-05 19:07:53.441166: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2021-03-05 19:07:53.535883: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fe9f5f26f20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-03-05 19:07:53.535913: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:333: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.
WARNING:tensorflow:From /Users/opt/anaconda3/envs/Mask_RCNN/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:341: The name tf.variables_initializer is deprecated. Please use tf.compat.v1.variables_initializer instead.
My code:
"""
Mask R-CNN
Copyright (c) 2018 Matterport, Inc.
Licensed under the MIT License (see LICENSE for details)
Written by Waleed Abdulla
------------------------------------------------------------
    # Train a new model starting from pre-trained COCO weights
    python3 lithicRCNN.py train --dataset=/samples/lithic/dataset --weights=coco
    # Resume training a model that you had trained earlier
    python3 lithicRCNN.py train --dataset=/path/to/lithic/dataset --weights=last
    # Train a new model starting from ImageNet weights
    python3 lithicRCNN.py train --dataset=/path/to/lithic/dataset --weights=imagenet
    # Apply color splash to an image
    python3 lithicRCNN.py splash --weights=/path/to/weights/file.h5 --image=<URL or path to file>
    # Apply color splash to video using the last weights you trained
    python3 lithicRCNN.py splash --weights=last --video=<URL or path to file>
"""
import os
import sys
import json
import datetime
import numpy as np
# skimage submodules are imported explicitly because the code below
# uses skimage.io and skimage.color as well as skimage.draw.
import skimage.color
import skimage.draw
import skimage.io

# Root directory of the project
ROOT_DIR = os.path.abspath("/Users/Mask_RCNN/")

# Import Mask RCNN
sys.path.append(ROOT_DIR)  # To find local version of the library
from mrcnn.config import Config
from mrcnn import model as modellib, utils

# Path to trained weights file
COCO_WEIGHTS_PATH = os.path.join(ROOT_DIR, "mask_rcnn_coco.h5")

# Directory to save logs and model checkpoints, if not provided
# through the command line argument --logs
DEFAULT_LOGS_DIR = os.path.join(ROOT_DIR, "logs")
############################################################
# Configurations
############################################################
class CustomConfig(Config):
    """Configuration for training on the toy dataset.
    Derives from the base Config class and overrides some values.
    """
    # Give the configuration a recognizable name
    NAME = "scar"

    # We use a GPU with 12GB memory, which can fit two images.
    # Adjust down if you use a smaller GPU.
    IMAGES_PER_GPU = 2

    # Number of classes (including background)
    NUM_CLASSES = 1 + 1  # Background + scar

    # Number of training steps per epoch
    STEPS_PER_EPOCH = 100

    # Skip detections with < 90% confidence
    DETECTION_MIN_CONFIDENCE = 0.9
############################################################
# Dataset
############################################################
class CustomDataset(utils.Dataset):

    def load_custom(self, dataset_dir, subset):
        """Load a subset of the lithic dataset.
        dataset_dir: Root directory of the dataset.
        subset: Subset to load: train or val
        """
        # Add classes. We have only one class to add.
        # add_class takes (source, class_id, class_name).
        self.add_class("scar", 1, "scar")

        # Train or validation dataset?
        assert subset in ["train", "val"]
        dataset_dir = os.path.join(dataset_dir, subset)

        # Load annotations
        # VGG Image Annotator (up to version 1.6) saves each image in the form:
        # { 'filename': '28503151_5b5b7ec140_b.jpg',
        #   'regions': {
        #       '0': {
        #           'region_attributes': {},
        #           'shape_attributes': {
        #               'all_points_x': [...],
        #               'all_points_y': [...],
        #               'name': 'polygon'}},
        #       ... more regions ...
        #   },
        #   'size': 100202
        # }
        # We mostly care about the x and y coordinates of each region
        # Note: In VIA 2.0, regions was changed from a dict to a list.
        with open(os.path.join(dataset_dir, "via_region_data.json")) as f:
            annotations = json.load(f)
        annotations = list(annotations.values())  # don't need the dict keys

        # The VIA tool saves images in the JSON even if they don't have any
        # annotations. Skip unannotated images.
        annotations = [a for a in annotations if a['regions']]

        # Add images
        for a in annotations:
            # Get the x, y coordinates of points of the polygons that make up
            # the outline of each object instance. These are stored in the
            # shape_attributes (see json format above)
            # The if condition is needed to support VIA versions 1.x and 2.x.
            if type(a['regions']) is dict:
                polygons = [r['shape_attributes'] for r in a['regions'].values()]
            else:
                polygons = [r['shape_attributes'] for r in a['regions']]

            # load_mask() needs the image size to convert polygons to masks.
            # Unfortunately, VIA doesn't include it in JSON, so we must read
            # the image. This is only manageable since the dataset is tiny.
            image_path = os.path.join(dataset_dir, a['filename'])
            image = skimage.io.imread(image_path)
            height, width = image.shape[:2]

            self.add_image(
                "scar",
                image_id=a['filename'],  # use file name as a unique image id
                path=image_path,
                width=width, height=height,
                polygons=polygons)

    def load_mask(self, image_id):
        """Generate instance masks for an image.
        Returns:
        masks: A bool array of shape [height, width, instance count] with
            one mask per instance.
        class_ids: a 1D array of class IDs of the instance masks.
        """
        # If not a lithic dataset image, delegate to parent class.
        image_info = self.image_info[image_id]
        if image_info["source"] != "scar":
            return super().load_mask(image_id)

        # Convert polygons to a bitmap mask of shape
        # [height, width, instance_count]
        info = self.image_info[image_id]
        mask = np.zeros([info["height"], info["width"], len(info["polygons"])],
                        dtype=np.uint8)
        for i, p in enumerate(info["polygons"]):
            # Get indexes of pixels inside the polygon and set them to 1
            rr, cc = skimage.draw.polygon(p['all_points_y'], p['all_points_x'])
            mask[rr, cc, i] = 1

        # Return mask, and array of class IDs of each instance. Since we have
        # one class ID only, we return an array of 1s
        return mask.astype(bool), np.ones([mask.shape[-1]], dtype=np.int32)

    def image_reference(self, image_id):
        """Return the path of the image."""
        info = self.image_info[image_id]
        if info["source"] == "scar":
            return info["path"]
        else:
            return super().image_reference(image_id)
def train(model):
    """Train the model."""
    # Training dataset.
    dataset_train = CustomDataset()
    dataset_train.load_custom(args.dataset, "train")
    dataset_train.prepare()

    # Validation dataset
    dataset_val = CustomDataset()
    dataset_val.load_custom(args.dataset, "val")
    dataset_val.prepare()

    # *** This training schedule is an example. Update to your needs ***
    # Since we're using a very small dataset, and starting from
    # COCO trained weights, we don't need to train too long. Also,
    # no need to train all layers, just the heads should do it.
    print("Training network heads")
    model.train(dataset_train, dataset_val,
                learning_rate=config.LEARNING_RATE,
                epochs=30,
                layers='heads')
def color_splash(image, mask):
    """Apply color splash effect.
    image: RGB image [height, width, 3]
    mask: instance segmentation mask [height, width, instance count]
    Returns result image.
    """
    # Make a grayscale copy of the image. The grayscale copy still
    # has 3 RGB channels, though.
    gray = skimage.color.gray2rgb(skimage.color.rgb2gray(image)) * 255
    # Copy color pixels from the original color image where mask is set
    if mask.shape[-1] > 0:
        # We're treating all instances as one, so collapse the mask into one layer
        mask = (np.sum(mask, -1, keepdims=True) >= 1)
        splash = np.where(mask, image, gray).astype(np.uint8)
    else:
        splash = gray.astype(np.uint8)
    return splash
def detect_and_color_splash(model, image_path=None, video_path=None):
    assert image_path or video_path

    # Image or video?
    if image_path:
        # Run model detection and generate the color splash effect
        print("Running on {}".format(image_path))
        # Read image
        image = skimage.io.imread(image_path)
        # Detect objects
        r = model.detect([image], verbose=1)[0]
        # Color splash
        splash = color_splash(image, r['masks'])
        # Save output (datetime.now() and %S for seconds)
        file_name = "splash_{:%Y%m%dT%H%M%S}.png".format(datetime.datetime.now())
        skimage.io.imsave(file_name, splash)
    elif video_path:
        import cv2
        # Video capture
        vcapture = cv2.VideoCapture(video_path)
        width = int(vcapture.get(cv2.CAP_PROP_FRAME_WIDTH))
        height = int(vcapture.get(cv2.CAP_PROP_FRAME_HEIGHT))
        fps = vcapture.get(cv2.CAP_PROP_FPS)

        # Define codec and create video writer
        file_name = "splash_{:%Y%m%dT%H%M%S}.avi".format(datetime.datetime.now())
        vwriter = cv2.VideoWriter(file_name,
                                  cv2.VideoWriter_fourcc(*'MJPG'),
                                  fps, (width, height))

        count = 0
        success = True
        while success:
            print("frame: ", count)
            # Read next image
            success, image = vcapture.read()
            if success:
                # OpenCV returns images as BGR, convert to RGB
                image = image[..., ::-1]
                # Detect objects
                r = model.detect([image], verbose=0)[0]
                # Color splash
                splash = color_splash(image, r['masks'])
                # RGB -> BGR to save image to video
                splash = splash[..., ::-1]
                # Add image to video writer
                vwriter.write(splash)
                count += 1
        vwriter.release()
    print("Saved to ", file_name)
############################################################
# Training
############################################################
if __name__ == '__main__':
    import argparse

    # Parse command line arguments
    parser = argparse.ArgumentParser(
        description='Train Mask R-CNN to detect lithic scars.')
    parser.add_argument("command", metavar="<command>",
                        help="'train' or 'splash'")
    parser.add_argument('--dataset', required=False,
                        metavar="/path/to/lithic/dataset/",
                        help='Directory of the Lithic dataset')
    parser.add_argument('--weights', required=True,
                        metavar="/path/to/weights.h5",
                        help="Path to weights .h5 file or 'coco'")
    parser.add_argument('--logs', required=False, default=DEFAULT_LOGS_DIR,
                        metavar="/path/to/logs/",
                        help='Logs and checkpoints directory (default=logs/)')
    parser.add_argument('--image', required=False,
                        metavar="path or URL to image",
                        help='Image to apply the color splash effect on')
    parser.add_argument('--video', required=False,
                        metavar="path or URL to video",
                        help='Video to apply the color splash effect on')
    args = parser.parse_args()

    # Validate arguments
    if args.command == "train":
        assert args.dataset, "Argument --dataset is required for training"
    elif args.command == "splash":
        assert args.image or args.video,\
            "Provide --image or --video to apply color splash"

    print("Weights: ", args.weights)
    print("Dataset: ", args.dataset)
    print("Logs: ", args.logs)

    # Configurations
    if args.command == "train":
        config = CustomConfig()
    else:
        class InferenceConfig(CustomConfig):
            # Set batch size to 1 since we'll be running inference on
            # one image at a time. Batch size = GPU_COUNT * IMAGES_PER_GPU
            GPU_COUNT = 1
            IMAGES_PER_GPU = 1
        config = InferenceConfig()
    config.display()

    # Create model
    if args.command == "train":
        model = modellib.MaskRCNN(mode="training", config=config,
                                  model_dir=args.logs)
    else:
        model = modellib.MaskRCNN(mode="inference", config=config,
                                  model_dir=args.logs)

    # Select weights file to load
    if args.weights.lower() == "coco":
        weights_path = COCO_WEIGHTS_PATH
        # Download weights file
        if not os.path.exists(weights_path):
            utils.download_trained_weights(weights_path)
    elif args.weights.lower() == "last":
        # Find last trained weights
        weights_path = model.find_last()
    elif args.weights.lower() == "imagenet":
        # Start from ImageNet trained weights
        weights_path = model.get_imagenet_weights()
    else:
        weights_path = args.weights

    # Load weights
    print("Loading weights ", weights_path)
    if args.weights.lower() == "coco":
        # Exclude the last layers because they require a matching
        # number of classes
        model.load_weights(weights_path, by_name=True, exclude=[
            "mrcnn_class_logits", "mrcnn_bbox_fc",
            "mrcnn_bbox", "mrcnn_mask"])
    else:
        model.load_weights(weights_path, by_name=True)

    # # Train or evaluate
    # if args.command == "train":
    #     train(model)
    # elif args.command == "splash":
    #     detect_and_color_splash(model, image_path=args.image,
    #                             video_path=args.video)
    # else:
    #     print("'{}' is not recognized. "
    #           "Use 'train' or 'splash'".format(args.command))
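
One thing stands out in the script as posted: the final "Train or evaluate" block is commented out. After the weights are loaded the script simply ends, so train(model) is never called, and as far as I know the .h5 checkpoints are only written during model.train(). That would explain a clean run that leaves nothing in the logs directory, and it matches the output above, which stops right after the weights are loaded. Below is a minimal sketch of that dispatch restored. `run_command` and the stub callables in the usage lines are hypothetical names used only so the snippet runs on its own; in the script the calls would be train(model) and detect_and_color_splash(model, ...).

```python
def run_command(command, model, train_fn, splash_fn, image=None, video=None):
    """Call the function that matches the CLI command.

    train_fn and splash_fn stand in for train() and
    detect_and_color_splash() from the script (hypothetical wiring).
    Returns True if a known command was run, False otherwise.
    """
    if command == "train":
        train_fn(model)
        return True
    if command == "splash":
        splash_fn(model, image_path=image, video_path=video)
        return True
    print("'{}' is not recognized. Use 'train' or 'splash'".format(command))
    return False


# Usage with stubs that only record that they were called.
calls = []
run_command(
    "train",
    "fake-model",
    train_fn=lambda m: calls.append(("train", m)),
    splash_fn=lambda m, image_path=None, video_path=None: calls.append(("splash", m)),
)
print(calls)  # [('train', 'fake-model')]
```

In the script itself this amounts to uncommenting the final "Train or evaluate" block so that train(model) runs when the command is train.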
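
A second thing to check once training actually starts: the question says the annotations were exported from COCO Annotator, but load_custom() opens via_region_data.json and expects the VGG Image Annotator layout. The sketch below shows one way to turn COCO-style JSON into the per-image polygon dicts (`all_points_x` / `all_points_y`) that load_mask() consumes. It assumes the usual COCO layout (`images` entries with `id`, `file_name`, `width`, `height`; `annotations` entries with `image_id` and a polygon `segmentation` given as flat x, y lists). `coco_to_polygons` is a hypothetical helper, not part of Mask_RCNN, and RLE-encoded masks are skipped rather than handled.

```python
from collections import defaultdict


def coco_to_polygons(coco):
    """Group COCO polygon annotations by image.

    Returns a list of dicts with keys filename, width, height, polygons,
    where polygons is a list of
    {"all_points_x": [...], "all_points_y": [...]} dicts.
    """
    polygons_by_image = defaultdict(list)
    for ann in coco.get("annotations", []):
        seg = ann.get("segmentation")
        # Polygon segmentations are lists of flat [x1, y1, x2, y2, ...] lists.
        # RLE segmentations are dicts; this sketch skips them.
        if not isinstance(seg, list):
            continue
        for flat in seg:
            if len(flat) < 6:  # fewer than 3 points is not a polygon
                continue
            polygons_by_image[ann["image_id"]].append({
                "all_points_x": flat[0::2],
                "all_points_y": flat[1::2],
            })

    records = []
    for img in coco.get("images", []):
        polys = polygons_by_image.get(img["id"])
        if not polys:
            continue  # skip unannotated images, as the original loader does
        records.append({
            "filename": img["file_name"],
            "width": img["width"],
            "height": img["height"],
            "polygons": polys,
        })
    return records


# Tiny hand-made example (not real data).
example = {
    "images": [{"id": 1, "file_name": "a.jpg", "width": 10, "height": 8}],
    "annotations": [{"image_id": 1, "segmentation": [[1, 2, 5, 2, 5, 6]]}],
}
print(coco_to_polygons(example))
```

Each record lines up with the arguments of the self.add_image(...) call in load_custom(), and because COCO JSON stores width and height, the image would not need to be read just to get its size.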