Instance Segmentation
#####################

Instance segmentation predicts a pixel-wise mask for each object in an image.
This tutorial walks through generating source code for instance segmentation
with the **cvtk** package,
training a model to segment objects, and performing inference with the trained model.

Note that **cvtk** uses the **MMDetection** package internally for instance segmentation.
Ensure that **MMDetection** is installed and imports without errors before using the **cvtk** package.
Additionally, the source code for instance segmentation
is the same as the source code for object detection,
except for the following points.

- Network architecture: Mask R-CNN (``mask-rcnn_r101_fpn_1x_coco``) is used
  for instance segmentation, and Faster R-CNN (``faster-rcnn_r101_fpn_1x_coco``) is used for object detection.
- Training annotations: instance segmentation uses segmentation coordinates,
  while object detection uses bounding box coordinates.
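
The difference between the two annotation types can be illustrated with a small sketch.
It assumes COCO-style annotations, where a segmentation polygon is a flat list
``[x1, y1, x2, y2, ...]`` and a bounding box is ``[x, y, width, height]``.
The helper ``polygon_to_bbox`` and the annotation values are hypothetical and are not part of **cvtk**.

.. code-block:: python

    def polygon_to_bbox(polygon):
        """Return the [x, y, width, height] box enclosing a flat polygon."""
        xs = polygon[0::2]  # even positions hold x coordinates
        ys = polygon[1::2]  # odd positions hold y coordinates
        x_min, y_min = min(xs), min(ys)
        return [x_min, y_min, max(xs) - x_min, max(ys) - y_min]


    # Hypothetical annotation entry; the values are made up for illustration.
    annotation = {
        "image_id": 1,
        "category_id": 1,
        "segmentation": [[10.0, 20.0, 60.0, 20.0, 60.0, 80.0, 10.0, 80.0]],
        "iscrowd": 0,
    }

    # A bounding box can always be derived from a polygon, but not the reverse,
    # which is why segmentation annotations carry more information.
    annotation["bbox"] = polygon_to_bbox(annotation["segmentation"][0])
    print(annotation["bbox"])  # [10.0, 20.0, 50.0, 60.0]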


Source Code Preparation
***********************

To generate Python source code,
use the ``cvtk create`` command.
For those new to programming or deep learning,
the following command is recommended because it generates simple source code.
The generated code contains only the essential steps,
with the more complex processing imported from the **cvtk** package.
This keeps the source code easy to read and helps
beginners understand the deep learning workflow.


.. code-block:: sh
    
    cvtk create --script segm.py --task segm


By default, Mask R-CNN (``mask-rcnn_r101_fpn_1x_coco``) is used.
To use a different network architecture,
replace the ``'mask-rcnn_r101_fpn_1x_coco'`` string in the generated source code with another one.
Available network architectures are listed in the MMDetection GitHub repository
(e.g., `mmdetection.configs <https://github.com/open-mmlab/mmdetection/tree/main/configs>`_)
or can be searched with the ``mim search`` command (e.g., ``mim search mmdet --model "r-cnn"``).


As with object detection, running ``cvtk create`` with the ``--vanilla`` argument
generates source code that uses only **MMDetection** package functions.


.. code-block:: sh
    
    cvtk create --script segm.py --task segm --vanilla


Model Training and Validation
*****************************

To train the model, open the source code generated above and call the ``train`` function
with the training, validation, and test data.

Alternatively, the source code can be executed directly from the command line as follows:

.. code-block:: sh

    python segm.py train \
        --label ./data/strawberry/label.txt \
        --train ./data/strawberry/train/segm.json \
        --valid ./data/strawberry/valid/segm.json \
        --test ./data/strawberry/test/segm.json \
        --output_weights ./outputs/strawberry.pth


The weights of the trained model are saved in :file:`strawberry.pth`.
The loss and accuracy recorded during training are saved in
:file:`strawberry.train_stats.train.txt` and :file:`strawberry.train_stats.valid.txt`,
along with figures plotted from these two files.
Both are tab-separated files, as shown below:


:file:`strawberry.train_stats.train.txt`

::

    epoch	lr	data_time	loss	loss_rpn_cls	loss_rpn_bbox	loss_cls	acc	loss_bbox	loss_mask	time	memory
    1	0.00118	0.03453	1.84487	0.03162	0.01490	0.59783	88.37890625	0.3763710225621859	0.8241304568946362	0.4096731980641683	5721.0
    2	0.00238	0.01690	1.01754	0.01787	0.01174	0.31824	83.984375	0.4193130740523338	0.250374620705843	0.36697773933410643	5686.0
    3	0.00353	0.01563	0.72365	0.00546	0.01157	0.21473	87.59765625	0.3275810395181179	0.1643025816977024	0.35318960666656496	5757.0
    4	0.00478	0.01308	0.51533	0.00525	0.01162	0.15927	98.33984375	0.194721964225173	0.14444936953485013	0.37441123962402345	5804.0
    5	0.00598	0.01276	0.44866	0.00665	0.01034	0.12237	95.3125	0.17035995483398436	0.13892668940126895	0.36310056209564207	5728.0


:file:`strawberry.train_stats.train.png`

.. image:: ../_static/strawberry.train_stats.train.segm.png
    :width: 70%
    :align: center


:file:`strawberry.train_stats.valid.txt`

::

    coco/bbox_mAP	coco/bbox_mAP_50	coco/bbox_mAP_75	coco/bbox_mAP_s	coco/bbox_mAP_m	coco/bbox_mAP_l	coco/segm_mAP	coco/segm_mAP_50	coco/segm_mAP_75	coco/segm_mAP_s	coco/segm_mAP_m	coco/segm_mAP_l	data_time	time	step
    0.345	0.507	0.399	-1.0	-1.0	0.345	0.412	0.507	0.466	-1.0	-1.0	0.413	0.1388627529144287	0.86514892578125	1
    0.352	0.614	0.345	-1.0	-1.0	0.352	0.5	0.614	0.581	-1.0	-1.0	0.501	0.012227217356363932	0.40663444995880127	2
    0.579	0.748	0.693	-1.0	-1.0	0.582	0.643	0.748	0.748	-1.0	-1.0	0.659	0.01785115400950114	0.23407896359761557	3
    0.642	0.785	0.785	-1.0	-1.0	0.642	0.72	0.785	0.785	-1.0	-1.0	0.75	0.018213987350463867	0.2121752897898356	4
    0.643	0.829	0.829	-1.0	-1.0	0.643	0.718	0.829	0.829	-1.0	-1.0	0.725	0.01665182908376058	0.18859827518463135	5


:file:`strawberry.train_stats.valid.png`

.. image:: ../_static/strawberry.train_stats.valid.segm.png
    :width: 70%
    :align: center
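
Because both statistics files are plain tab-separated text, they can be inspected with the standard library.
The following is a minimal sketch, not part of **cvtk**:
the helpers ``read_stats`` and ``best_row`` are hypothetical,
and the code assumes every cell is numeric, as in the samples above.

.. code-block:: python

    import csv


    def read_stats(path):
        """Read a tab-separated stats file into a list of dicts of floats."""
        with open(path, newline="") as handle:
            reader = csv.DictReader(handle, delimiter="\t")
            # Assumes every cell is numeric, as in the sample files.
            return [
                {key: float(value) for key, value in row.items()}
                for row in reader
            ]


    def best_row(rows, metric):
        """Return the row with the highest value in the given column."""
        return max(rows, key=lambda row: row[metric])

For example, ``best_row(read_stats(path), "coco/segm_mAP")``,
where ``path`` points to :file:`strawberry.train_stats.valid.txt`,
would select step 4 (``coco/segm_mAP`` of 0.72) for the validation table shown above.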


Additionally, if test data is provided,
the model is evaluated on it.
The inference results for the test data are stored in the workspace (the :file:`strawberry` directory)
as :file:`test_outputs.coco.json`, a COCO format file.
The test performance metrics (e.g., mAP) are saved in :file:`strawberry.test_stats.json`
in JSON format as follows.
The ``stats`` element holds the metrics averaged over all classes,
while the metrics for each class are stored in the ``class_stats`` element.

::

    {
        "stats": {
            "AP@[0.50:0.95|all|100]": 0.8671538582429673,
            "AP@[0.50|all|1000]": 0.9365079365079365,
            "AP@[0.75|all|1000]": 0.9365079365079365,
            ...
            "AP@[0.50:0.95|large|1000]": 0.8671538582429673,
            "AR@[0.50:0.95|all|100]": 0.4738095238095238,
            "AR@[0.50:0.95|all|300]": 0.9029761904761905,
        },
        "class_stats": {
            "flower": {
                "AP@[0.50:0.95|all|100]": 0.9252475247524753,
                "AP@[0.50|all|1000]": 1.0,
                "AP@[0.75|all|1000]": 1.0,
                ...
            },
            "green_fruit": {
                "AP@[0.50:0.95|all|100]": 0.9665016501650165,
                "AP@[0.50|all|1000]": 1.0,
                "AP@[0.75|all|1000]": 1.0,
                ...
            },
            "red_fruit": {
                "AP@[0.50:0.95|all|100]": 0.7097123998114098,
                "AP@[0.50|all|1000]": 0.8095238095238095,
                "AP@[0.75|all|1000]": 0.8095238095238095,
                ...
            }
        }
    }
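
To compare classes, the ``class_stats`` element can be ranked by a single metric.
The sketch below relies only on the JSON layout shown above;
the helpers ``load_test_stats`` and ``rank_classes`` are hypothetical and are not provided by **cvtk**.

.. code-block:: python

    import json


    def load_test_stats(path):
        """Load the test statistics JSON file into a dict."""
        with open(path) as handle:
            return json.load(handle)


    def rank_classes(test_stats, metric):
        """Return (class_name, value) pairs sorted from lowest to highest."""
        per_class = {
            name: metrics[metric]
            for name, metrics in test_stats["class_stats"].items()
            if metric in metrics  # skip classes that lack this metric
        }
        return sorted(per_class.items(), key=lambda item: item[1])

With the example file above, ranking by ``"AP@[0.50:0.95|all|100]"`` would place ``red_fruit`` first,
which marks it as the class with the most room for improvement.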


Inference
*********

To perform inference using the constructed model,
refer to the ``inference`` function in the source code.

Alternatively, it can also be executed directly from the command line as follows:

.. code-block:: sh

    python segm.py inference \
        --label ./data/fruits/label.txt \
        --data ./data/fruits/test.txt \
        --model_weights ./outputs/strawberry.pth \
        --output ./outputs/inference_results


The inference result for each image
(i.e., the image with the predicted masks and bounding boxes drawn on it)
is saved in the :file:`inference_results` directory.
Additionally, a COCO format file containing all predicted annotations
is saved as :file:`instances.json`.

Examples of output images:


.. image:: ../_static/0de80884.segm.jpg
    :width: 70%
    :align: center


.. image:: ../_static/7f7737de.segm.jpg
    :width: 70%
    :align: center
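
The predicted annotations in :file:`instances.json` can also be summarized programmatically.
The following sketch counts predictions per class.
It assumes the usual COCO layout: a top-level ``categories`` list with ``id`` and ``name``,
and an ``annotations`` list whose entries have a ``category_id`` and, optionally, a ``score``.
The exact contents of the file written by **cvtk** may differ,
and ``count_predictions`` is a hypothetical helper.

.. code-block:: python

    from collections import Counter


    def count_predictions(coco, min_score=0.0):
        """Count annotations per category name, skipping low-score ones."""
        names = {cat["id"]: cat["name"] for cat in coco["categories"]}
        counts = Counter()
        for ann in coco["annotations"]:
            # Annotations without a "score" field are always counted.
            if ann.get("score", 1.0) >= min_score:
                counts[names[ann["category_id"]]] += 1
        return dict(counts)

The input is the dictionary obtained by reading the file with ``json.load``.
Raising ``min_score`` is a simple way to discard low-confidence predictions before counting.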