DeepLab is a series of image semantic segmentation models, whose latest version, i.e. v3+, proves to be the state-of-art. Its major contribution is the use of atrous spatial pyramid pooling (ASPP) operation at the end of the encoder. While the model works extremely well, its open sourced code is hard to read. Here we re-implemented DeepLab v3, the earlier version of v3+, which only additionally employs the decoder architecture, in a much simpler and understandable way. This is a collaborative project developed by me and Shengjie Lin from Toyota Technological Institute at Chicago.
optional arguments: -h, --help show this help message and exit --network_backbone NETWORK_BACKBONE Network backbones: resnet_50, resnet_101, mobilenet_1.0_224. Default: resnet_101 --pre_trained_model PRE_TRAINED_MODEL Pretrained model directory --trainset_filename TRAINSET_FILENAME Train dataset filename --valset_filename VALSET_FILENAME Validation dataset filename --images_dir IMAGES_DIR Images directory --labels_dir LABELS_DIR Labels directory --trainset_augmented_filename TRAINSET_AUGMENTED_FILENAME Train augmented dataset filename --images_augmented_dir IMAGES_AUGMENTED_DIR Images augmented directory --labels_augmented_dir LABELS_AUGMENTED_DIR Labels augmented directory --model_dir MODEL_DIR Trained model saving directory --log_dir LOG_DIR TensorBoard log directory --random_seed RANDOM_SEED Random seed for model training.
For simplicity, please run the following command in terminal:
1
$ python train.py
With learning rate of 1e-5, the mIOU could be greater 0.7 after 20 epochs, which is comparable to the test statistics of DeepLab v3 in the publication.
Demos
To show some demos, please run the following command in terminal: