Dataset Annotation

Image annotations, in the context of object detection and segmentation, are metadata describing the location and category of objects of interest. Locations are represented by bounding boxes for detection and by object polygons (or sometimes pixel masks) for segmentation. Annotated images are required for model training and evaluation, since both rely on a so-called ground truth, i.e., known bounding boxes or segmentations. GinJinn2 supports the PASCAL VOC and COCO image annotation formats. Note that PASCAL VOC is only supported for bounding-box detection, and some GinJinn2 utilities work only with COCO datasets. Hence, we recommend preparing annotations in COCO format if possible.

There are several freely available annotation tools, most of which support at least one of those formats. The following is a non-exhaustive list of software for this purpose:

We recommend CVAT because it exports COCO annotations that are consistent with the COCO dataset specification and compatible with many other tools that work with COCO-formatted datasets. In addition, COCO datasets generated by GinJinn2’s predict command can be imported into CVAT, for example, to correct or refine predictions manually or to support active learning. Since CVAT also supports several other popular annotation formats, it offers good interoperability with other tools as well.

Annotation with CVAT

To install CVAT, please follow the official installation guide.

The following video shows how to get from a folder with images to an annotated dataset ready to use with GinJinn2.

Orientation issues

A common problem is that different programs do not handle existing EXIF metadata of images consistently. In particular, CVAT ignores orientation information stored in the EXIF data whereas GinJinn2 rotates the images accordingly. When using CVAT-annotated images as input to GinJinn2, this can lead to misplaced bounding boxes or segmentations.

There are several simple ways to prevent this problem. Please remember to back up your data before applying one of the commands below, because they will overwrite the original images! On Debian-based systems, both ImageMagick and ExifTool are usually available from the distribution repositories and can, e.g., be installed with sudo apt update && sudo apt install exiftool imagemagick.
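Because the commands in this section modify images in place, it can help to copy the image directory first. The following is a minimal sketch, not part of GinJinn2: the function name `backup_images` and the `_backup` suffix are hypothetical choices, and the default directory name "images" simply mirrors the examples in this section.

```shell
# Copy an image directory to a sibling backup directory before editing in place.
# Refuses to overwrite an existing backup so that an earlier copy is never lost.
backup_images() {
    src="${1:-images}"            # assumed default, as in the examples of this section
    src="${src%/}"                # drop a trailing slash so the backup is a sibling directory
    dest="${2:-${src}_backup}"    # hypothetical naming scheme for the backup
    if [ ! -d "$src" ]; then
        echo "no such directory: $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "backup already exists: $dest" >&2
        return 1
    fi
    # -a preserves timestamps and permissions and copies recursively
    cp -a -- "$src" "$dest"
}
```

Calling `backup_images images` would then create "images_backup" next to the original directory.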

  1. If you want to keep your images in the orientation specified by their EXIF metadata (which typically depends on how the camera was held), you could transform the images in a way such that the orientation tag becomes irrelevant (i.e., either “undefined” or “1”). This can, for example, be done with ImageMagick:

    for f in images/*jpg; do
        convert "$f" -auto-orient "$f"
    done
    

    The above shell command assumes that your images are in the “images” directory and have the filename extension “jpg”.

    Note: This solution is only useful before annotating with CVAT. If you have already annotated images and encounter orientation problems in GinJinn2, these can be fixed with the approach below.

  2. If you want to keep your images as displayed by CVAT, you could simply delete the orientation information stored in the EXIF metadata. This can, for instance, be done with ExifTool:

    exiftool -orientation#= -overwrite_original images/*jpg
    

    As a more radical approach, you could also remove the complete EXIF metadata:

    exiftool -EXIF= -overwrite_original images/*jpg
    

    Again, the above shell commands assume that your images are in the “images” directory and have the filename extension “jpg”.
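To see which images carry an orientation tag at all, or to check the result after applying one of the fixes above, the tag can be listed with ExifTool. This is a sketch under the same assumptions as the commands above (images in the "images" directory with the extension "jpg"); the function name `list_orientations` is hypothetical.

```shell
# Print one line per JPEG: "<file name><TAB><numeric EXIF orientation>".
# "-" means the tag is absent and 1 means no rotation is requested.
# Any other value (2-8) asks EXIF-aware software to rotate or mirror the image.
list_orientations() {
    dir="${1:-images}"   # assumed default directory, as in the examples above
    if ! command -v exiftool >/dev/null 2>&1; then
        echo "exiftool not found" >&2
        return 127
    fi
    # -T: tab-separated table output, -n: numeric values instead of descriptions
    exiftool -T -n -FileName -Orientation "$dir"/*jpg
}
```

After either fix, running `list_orientations images` should show only "1" or "-" in the second column.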