How do I save a model with DefaultTrainer in Detectron2?

Question

How do I save a checkpoint in Detectron2 when training with DefaultTrainer? Here is my setup, which raises an error:

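import os

from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()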
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))

cfg.DATASETS.TRAIN = (DatasetLabels.TRAIN,)
cfg.DATASETS.TEST = ()
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 273  # Number of output classes

cfg.OUTPUT_DIR = "outputs"
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.SOLVER.IMS_PER_BATCH = 2  # Images per training batch
cfg.SOLVER.BASE_LR = 0.00025  # Learning rate
cfg.SOLVER.MAX_ITER = 10000  # Maximum number of training iterations
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128  # RoIs sampled per image for the ROI heads

trainer = DefaultTrainer(cfg) 
trainer.resume_or_load(resume=False)
trainer.train()


# Save the model
from detectron2.checkpoint import DetectionCheckpointer, Checkpointer
checkpointer = DetectionCheckpointer(trainer, save_dir=cfg.OUTPUT_DIR)
checkpointer.save("mymodel_0")

Documentation: https://detectron2.readthedocs.io/en/latest/modules/checkpoint.html

Answer

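Pass the model to the checkpointer, not the trainer. DefaultTrainer is not an nn.Module itself, but it exposes the network as trainer.model, so this is the way to go:

checkpointer = DetectionCheckpointer(trainer.model, save_dir=cfg.OUTPUT_DIR)
checkpointer.save("mymodel_0")

This writes mymodel_0.pth into cfg.OUTPUT_DIR.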
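To see why handing over the trainer fails, it may help to look at a stripped-down imitation. On save, the checkpointer calls state_dict() on whatever object it was constructed with, and the trainer has no such method. The classes below (TinyModel, TinyTrainer, TinyCheckpointer) are hypothetical stand-ins written for this sketch, not Detectron2 classes:

```python
class TinyModel:
    """Hypothetical stand-in for an nn.Module: it exposes state_dict()."""

    def __init__(self):
        self.weights = {"layer.weight": [0.1, 0.2]}

    def state_dict(self):
        return dict(self.weights)


class TinyTrainer:
    """Hypothetical stand-in for a trainer: holds a model, has no state_dict()."""

    def __init__(self):
        self.model = TinyModel()


class TinyCheckpointer:
    """Hypothetical stand-in for a checkpointer that reads state_dict() on save."""

    def __init__(self, model):
        self.model = model

    def save(self):
        return {"model": self.model.state_dict()}


trainer = TinyTrainer()

# Passing the trainer fails, because it has no state_dict() method.
try:
    TinyCheckpointer(trainer).save()
    trainer_error = None
except AttributeError as exc:
    trainer_error = exc

# Passing trainer.model works.
saved = TinyCheckpointer(trainer.model).save()
```

The real classes do much more, but the failure mode is the same: the object given to the checkpointer has to be the module that owns the weights.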

Alternatively, save the state dict directly with PyTorch (this needs import torch):

torch.save(trainer.model.state_dict(), os.path.join(cfg.OUTPUT_DIR, "mymodel.pth"))
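For completeness, here is a minimal sketch of the round trip with plain PyTorch: saving a state dict and loading it back into a freshly built model. It uses a small nn.Linear as a stand-in for trainer.model and a temporary directory in place of cfg.OUTPUT_DIR, both assumptions made so the snippet runs without Detectron2:

```python
import os
import tempfile

import torch
from torch import nn

# Stand-in for trainer.model; any nn.Module is saved the same way.
model = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as output_dir:  # stand-in for cfg.OUTPUT_DIR
    path = os.path.join(output_dir, "mymodel.pth")

    # Save only the weights, not the whole module object.
    torch.save(model.state_dict(), path)

    # Rebuild the same architecture, then load the saved weights into it.
    restored = nn.Linear(4, 2)
    state = torch.load(path, map_location="cpu")
    restored.load_state_dict(state)

# Check that every tensor survived the round trip unchanged.
weights_match = all(
    torch.equal(a, b)
    for a, b in zip(model.state_dict().values(), restored.state_dict().values())
)
```

Note that a file written this way contains only the weights, so the model has to be rebuilt from the same config before load_state_dict is called.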