Project Configuration
GinJinn2 projects are configured using files in YAML format.
Each GinJinn2 project (folder) must contain such a file named ginjinn_config.yaml.
For a project generated by ginjinn new, it might look like this:
project_dir: "/home/ginjinn_user/my_project"
task: "bbox-detection"
# Input data options
input:
    type: "COCO" # or "PVOC"
    training:
        annotation_path: "/home/ginjinn_user/my_dataset/train/annotations.json"
        image_path: "/home/ginjinn_user/my_dataset/train/images"
    validation:
        annotation_path: "/home/ginjinn_user/my_dataset/val/annotations.json"
        image_path: "/home/ginjinn_user/my_dataset/val/images"
    test:
        annotation_path: "/home/ginjinn_user/my_dataset/val/annotations.json"
        image_path: "/home/ginjinn_user/my_dataset/val/images"
# Model options
model:
    name: "faster_rcnn_R_50_FPN_1x"
    weights: "pretrained"
    # additional model options
    model_parameters:
        # anchor generator options
        anchor_generator:
            aspect_ratios:
                - - 0.5
                  - 1.0
                  - 2.0
# Options for model training
training:
    learning_rate: 0.00125
    batch_size: 1
    max_iter: 5000
    eval_period: 250
    checkpoint_period: 2500
# Options for image augmentation
augmentation:
    - horizontal_flip:
        probability: 0.25
# Additional options
options:
    n_threads: 1
The following sections will describe each part of the config file in detail.
General Configuration
The general configuration comprises the project directory and the model task (i.e. bounding-box detection or instance segmentation).
Both entries will automatically be set when the project is generated using ginjinn new.
project_dir
Absolute path to the project directory. If the project is moved to another disk location, this configuration must be updated to the new location.
Example:
project_dir: "/home/ginjinn_user/my_project"
task
Project task. Either “bbox-detection” or “instance-segmentation” for bounding-box detection and instance segmentation, respectively.
Example:
task: "instance-segmentation"
input Configuration
The input configuration specifies the input type (i.e. annotation format) and the locations of the training dataset and, optionally (but highly recommended), the validation and test datasets.
These entries will be automatically set if a new project is initialized using ginjinn new with the -d option.
Example:
input:
    type: "COCO"
    training:
        ...
    ...
type
Dataset type. Either “COCO” or “PVOC” for COCO and PASCAL VOC datasets, respectively. See Overview for a brief description of the dataset types. We recommend working with COCO datasets whenever possible.
Example:
type: "COCO"
training, validation, test
Paths to training, validation, and test datasets.
Each entry comprises an annotation_path and an image_path field, specifying the location of the annotations and images on disk, respectively.
Only the training entry is required, but we strongly recommend supplying at least one of the other datasets (validation/test).
A training-validation-test split can be generated using ginjinn split.
Example:
training:
    annotation_path: "/home/ginjinn_user/my_dataset/train/annotations.json"
    image_path: "/home/ginjinn_user/my_dataset/train/images"
validation:
    annotation_path: "/home/ginjinn_user/my_dataset/val/annotations.json"
    image_path: "/home/ginjinn_user/my_dataset/val/images"
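Because a wrong path is an easy mistake to make here, it can help to check the entries before training. The following Python sketch does this for an input section that has already been loaded into a dictionary. The function name missing_input_paths and the dictionary layout are hypothetical and are not part of GinJinn2; the sketch only mirrors the fields described above.

```python
from pathlib import Path


def missing_input_paths(input_cfg):
    """Return the input entries whose paths are absent or do not exist.

    `input_cfg` is assumed to be the `input` section of the config,
    already parsed into nested dictionaries (hypothetical layout).
    """
    problems = []
    for split in ("training", "validation", "test"):
        entry = input_cfg.get(split)
        if entry is None:
            # Only the training entry is required.
            if split == "training":
                problems.append("training (required entry)")
            continue
        for field in ("annotation_path", "image_path"):
            value = entry.get(field)
            if not value or not Path(value).exists():
                problems.append(f"{split}.{field}")
    return problems


if __name__ == "__main__":
    example = {
        "type": "COCO",
        "training": {
            "annotation_path": "/home/ginjinn_user/my_dataset/train/annotations.json",
            "image_path": "/home/ginjinn_user/my_dataset/train/images",
        },
    }
    for problem in missing_input_paths(example):
        print("missing:", problem)
```

An empty result means that every configured path exists; missing validation or test entries are not reported, since they are optional.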
training Configuration
This entry comprises several settings for training the model, such as the number of training iterations and the evaluation period.
Example:
training:
    max_iter: 1000
    eval_period: 250
    ...
max_iter
Number of training iterations.
Example:
max_iter: 2500
eval_period
Number of training iterations between evaluations on the validation dataset.
A value of 250, for example, means that every 250 training iterations the whole validation dataset will be evaluated and the results will be written to metrics.json, metrics.pdf, and events.out.*.
Setting a low value (high frequency of evaluations) may be computationally expensive, depending on the size of the validation dataset.
Example:
eval_period: 500
checkpoint_period
Number of training iterations between saving model checkpoints.
A value of 500, for example, means that every 500 training iterations the model weights will be saved to “model_*.pth” files.
Checkpoints are useful if a model has been trained for too long and has started to overfit, because an earlier checkpoint can be used instead (see Early Stopping).
Note: A model checkpoint is typically several hundred megabytes in size, hence setting this value too low might lead to storage issues.
Example:
checkpoint_period: 1000
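To get a feeling for how max_iter, eval_period, and checkpoint_period interact, the following Python sketch lists the iterations at which evaluations and checkpoints would occur. It assumes that an event fires whenever the iteration count is a multiple of the period; the exact off-by-one behaviour and whether a final checkpoint is always written are not covered here. The helper periodic_iterations and the checkpoint size of 300 MB are hypothetical.

```python
def periodic_iterations(max_iter, period):
    """Iterations at which a periodic event fires.

    Assumption: the event fires whenever the (1-based) iteration count
    is a multiple of `period`.
    """
    if max_iter < 0 or period <= 0:
        raise ValueError("max_iter must be >= 0 and period must be > 0")
    return list(range(period, max_iter + 1, period))


if __name__ == "__main__":
    max_iter = 5000
    evaluations = periodic_iterations(max_iter, 250)
    checkpoints = periodic_iterations(max_iter, 2500)
    print(len(evaluations), "evaluations")
    print("checkpoints at iterations", checkpoints)

    # Hypothetical checkpoint size, used only for a rough storage estimate.
    approx_checkpoint_mb = 300
    print("approx. storage (MB):", len(checkpoints) * approx_checkpoint_mb)
```

Halving checkpoint_period doubles the number of checkpoint files, which is why small values can fill up a disk quickly.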
batch_size
The number of images to be processed per training iteration.
Depending on the memory size of your GPU, you may want to specify a value larger than one in order to speed up the training.
When increasing the batch size, it may also be advisable to increase the learning_rate accordingly.
Example:
batch_size: 2
learning_rate
A factor determining how strongly the model weights are adjusted per training iteration.
A high value might cause the model to diverge while a very low value might lead to slow learning.
The default values should already be sensible for the provided models.
If the batch_size is changed, however, the learning_rate should be adjusted proportionally.
For example, if batch_size is increased from 1 to 4, the learning_rate should be multiplied by 4.
Example:
learning_rate: 0.00125
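The proportional adjustment described above can be written as a one-line calculation. The following Python sketch is only an illustration; the function name scale_learning_rate is hypothetical, and the base values are taken from the example configuration (learning_rate 0.00125 at batch_size 1).

```python
def scale_learning_rate(base_lr, base_batch_size, new_batch_size):
    """Scale the learning rate proportionally to the batch size."""
    if base_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * new_batch_size / base_batch_size


if __name__ == "__main__":
    # Going from batch_size 1 to batch_size 4 multiplies the rate by 4.
    print(scale_learning_rate(0.00125, 1, 4))
```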
warmup_iter
Number of iterations until learning_rate is reached.
The model training starts with a learning rate value of learning_rate/warmup_iter, which increases up to learning_rate after warmup_iter iterations.
This can counter an early divergence of the model when using random weight initialization.
Typically, this value does not need to be changed.
Example:
warmup_iter: 1000
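The warmup behaviour can be sketched in a few lines of Python. The sketch assumes a linear increase from learning_rate/warmup_iter to learning_rate, which matches the description above; the exact interpolation used during training may differ. The function name warmup_learning_rate is hypothetical.

```python
def warmup_learning_rate(iteration, learning_rate, warmup_iter):
    """Learning rate at a given iteration, assuming linear warmup.

    Starts at learning_rate / warmup_iter (iteration 0) and reaches
    learning_rate at iteration warmup_iter.
    """
    if warmup_iter <= 0 or iteration >= warmup_iter:
        return learning_rate
    start = learning_rate / warmup_iter
    alpha = iteration / warmup_iter  # fraction of the warmup completed
    return start + alpha * (learning_rate - start)


if __name__ == "__main__":
    for it in (0, 250, 500, 1000, 2000):
        print(it, warmup_learning_rate(it, 0.00125, 1000))
```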
momentum
Momentum parameter for the Stochastic Gradient Descent optimizer. Typically, this value does not need to be changed.
model Configuration
This entry comprises all settings related to the model.
Example:
model:
    name: "faster_rcnn_R_50_FPN_1x"
    ...
name
Name of the model to be used.
For a list of available models see ginjinn new -h.
If the project is initialized using ginjinn new with the -t option, the name will already be set.
Example:
name: "faster_rcnn_R_50_FPN_1x"
weights
Weights to use for initialization.
- One of:
    - "", meaning random initialization
    - "pretrained", meaning pretrained weights (“Transfer Learning”)
    - path to a weights file (“.pth”) to be used for initialization
Example:
weights: "pretrained"
model_parameters
Additional model-specific parameters.
anchor_generator
Anchor generator options. Modifying the anchor sizes and aspect ratios to match the expected objects might increase model performance.
Relevant for: faster_rcnn_*, mask_rcnn_*
- Entries:
    - sizes: anchor sizes
    - aspect_ratios: anchor aspect ratios
    - angles: anchor rotation angles
Example:
anchor_generator:
    sizes:
        - - 32
        - - 64
    aspect_ratios:
        - - 0.5
          - 1.0
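To judge whether the anchors match the expected objects, it can help to compute the anchor shapes that a given combination of sizes and aspect ratios produces. The following Python sketch assumes the Detectron2 convention, in which a size is the square root of the anchor area and an aspect ratio is height divided by width. The function name anchor_shapes is hypothetical.

```python
import math


def anchor_shapes(sizes, aspect_ratios):
    """Return (width, height) for every size / aspect-ratio combination.

    Assumed convention: area = size ** 2 and aspect_ratio = height / width.
    """
    shapes = []
    for size in sizes:
        area = size ** 2
        for ratio in aspect_ratios:
            width = math.sqrt(area / ratio)
            height = ratio * width
            shapes.append((width, height))
    return shapes


if __name__ == "__main__":
    for width, height in anchor_shapes([32, 64], [0.5, 1.0]):
        print(f"{width:.1f} x {height:.1f}")
```

Under this convention, an aspect ratio below 1 yields anchors that are wider than tall, while the anchor area stays the same for a given size.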
rpn
Region Proposal Network options.
Relevant for: faster_rcnn_*, mask_rcnn_*
- Entries:
    - iou_thresholds: Intersection over Union thresholds
    - batch_size_per_image: number of region proposals per image
Example:
rpn:
    iou_thresholds:
        - 0.3
        - 0.7
    batch_size_per_image: 256
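The role of the two thresholds can be illustrated with a short Python sketch. It assumes the usual Detectron2 interpretation: an anchor whose best IoU with any ground-truth box is below the lower threshold counts as background, one at or above the upper threshold counts as foreground, and anchors in between are ignored during training. The function names iou and label_anchor are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes (x_min, y_min, x_max, y_max)."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


def label_anchor(best_iou, iou_thresholds=(0.3, 0.7)):
    """Classify an anchor by its best IoU (assumed threshold semantics)."""
    low, high = iou_thresholds
    if best_iou < low:
        return "background"
    if best_iou >= high:
        return "foreground"
    return "ignored"


if __name__ == "__main__":
    anchor = (0, 0, 10, 10)
    ground_truth = (5, 0, 15, 10)
    overlap = iou(anchor, ground_truth)
    print(round(overlap, 3), label_anchor(overlap))
```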
roi_heads
Region of Interest Heads options.
Relevant for: faster_rcnn_*, mask_rcnn_*
- Entries:
    - iou_thresholds: Intersection over Union thresholds
    - batch_size_per_image: number of RoIs per image
Example:
roi_heads:
    iou_thresholds:
        - 0.5
    batch_size_per_image: 512
augmentation Configuration
This entry comprises an arbitrary number of data augmentation configurations. Adding sensible data augmentation can artificially increase the available training data and thus improve model performance on new data (“Generalization”).
Example:
augmentation:
    - horizontal_flip:
        probability: 0.25
    - vertical_flip:
        probability: 0.25
    ...
horizontal_flip
Randomly apply a horizontal flip to images before training.
- Entries:
    - probability: probability of applying the augmentation
Example:
horizontal_flip:
    probability: 0.25
vertical_flip
Randomly apply a vertical flip to images before training.
- Entries:
    - probability: probability of applying the augmentation
Example:
vertical_flip:
    probability: 0.25
brightness
Randomly apply a brightness augmentation to images before training.
- Entries:
    - probability: probability of applying the augmentation
    - brightness_min: minimum relative brightness
    - brightness_max: maximum relative brightness
Example:
brightness:
    probability: 0.25
    brightness_min: 0.8
    brightness_max: 1.2
contrast
Randomly apply a contrast augmentation to images before training.
- Entries:
    - probability: probability of applying the augmentation
    - contrast_min: minimum relative contrast
    - contrast_max: maximum relative contrast
Example:
contrast:
    probability: 0.25
    contrast_min: 0.8
    contrast_max: 1.2
saturation
Randomly apply a saturation augmentation to images before training.
- Entries:
    - probability: probability of applying the augmentation
    - saturation_min: minimum relative saturation
    - saturation_max: maximum relative saturation
Example:
saturation:
    probability: 0.25
    saturation_min: 0.8
    saturation_max: 1.2
rotation_range
Randomly apply a rotation augmentation in the specified range to images before training.
- Entries:
    - probability: probability of applying the augmentation
    - expand: whether the image should be resized to fit the rotated image. If false, the image will be cropped.
    - angle_min: minimum rotation angle
    - angle_max: maximum rotation angle
Example:
rotation_range:
    probability: 0.25
    expand: true
    angle_min: -30
    angle_max: 30
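The effect of expand can be made concrete by computing how large the canvas must be to contain the whole rotated image. The following Python sketch uses the standard bounding-box formula for a rotated rectangle; how the actual augmentation rounds or pads the result is not specified here. The function name expanded_size is hypothetical.

```python
import math


def expanded_size(width, height, angle_degrees):
    """Size of the smallest axis-aligned canvas that contains the
    image after rotating it by `angle_degrees` around its centre."""
    theta = math.radians(angle_degrees)
    cos_t = abs(math.cos(theta))
    sin_t = abs(math.sin(theta))
    new_width = width * cos_t + height * sin_t
    new_height = height * cos_t + width * sin_t
    return new_width, new_height


if __name__ == "__main__":
    for angle in (-30, 0, 30):
        w, h = expanded_size(640, 480, angle)
        print(angle, f"{w:.1f} x {h:.1f}")
```

With expand set to false, the output keeps the original size, so the corners that fall outside the canvas are lost.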
rotation_choice
Randomly apply a rotation augmentation with one of the specified angles to images before training.
- Entries:
    - probability: probability of applying the augmentation
    - expand: whether the image should be resized to fit the rotated image. If false, the image will be cropped.
    - angles: rotation angles
Example:
rotation_choice:
    probability: 0.25
    expand: true
    angles:
        - -45
        - -30
        - 30
        - 45
crop_relative
Randomly use a relative crop of the original image for training.
- Entries:
    - probability: probability of applying the augmentation
    - width: relative width of the crop
    - height: relative height of the crop
Example:
crop_relative:
    probability: 0.25
    width: 0.8
    height: 0.7
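As an illustration of what a relative crop means in pixels, the following Python sketch picks a random crop window for an image of a given size. The rounding and the uniform placement of the window are assumptions made for this sketch, and the function name relative_crop_window is hypothetical.

```python
import random


def relative_crop_window(image_width, image_height, rel_width, rel_height, rng):
    """Return (x0, y0, crop_width, crop_height) for a random relative crop.

    Assumptions: the crop size is rounded to whole pixels and the
    window position is drawn uniformly from all valid positions.
    """
    if not (0 < rel_width <= 1 and 0 < rel_height <= 1):
        raise ValueError("relative width and height must be in (0, 1]")
    crop_w = max(1, round(image_width * rel_width))
    crop_h = max(1, round(image_height * rel_height))
    x0 = rng.randint(0, image_width - crop_w)
    y0 = rng.randint(0, image_height - crop_h)
    return x0, y0, crop_w, crop_h


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for a reproducible demonstration
    print(relative_crop_window(1000, 1000, 0.8, 0.7, rng))
```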
crop_absolute
Randomly use a crop of the original image for training.
- Entries:
    - probability: probability of applying the augmentation
    - width: width of the crop in pixels
    - height: height of the crop in pixels
Example:
crop_absolute:
    probability: 0.25
    width: 512
    height: 512
options Configuration
Additional options.
Example:
options:
    n_threads: 4
    resume: false
    device: "cuda:0"
n_threads
Number of threads (“cores”) to use for data loading and augmentation.
Example:
n_threads: 2
resume
Whether to resume training or start fresh when calling ginjinn train for a GinJinn2 project that was already trained.
Example:
resume: false
device
Computation device to use for model training. It is only sensible to change this if you are working on a multi-GPU system.
Example:
device: "cuda:0"
detectron Configuration
Additional options that are directly converted to Detectron2 configurations. This entry opens up advanced model configurations that are not directly supported by GinJinn2.
For example,
detectron:
    SOLVER:
        WARMUP_ITERS: 1000
is equivalent to the Detectron2 configuration
_C.SOLVER.WARMUP_ITERS = 1000
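The correspondence between the nested YAML keys and the dotted Detectron2 names can be sketched in Python as follows. This is not GinJinn2's actual conversion code; the function name flatten_config is hypothetical, and the detectron entry is assumed to have been parsed into nested dictionaries.

```python
def flatten_config(nested, prefix=""):
    """Turn nested dictionaries into a flat {dotted.key: value} mapping."""
    flat = {}
    for key, value in nested.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into sub-sections such as SOLVER.
            flat.update(flatten_config(value, dotted))
        else:
            flat[dotted] = value
    return flat


if __name__ == "__main__":
    detectron_entry = {"SOLVER": {"WARMUP_ITERS": 1000}}
    for key, value in flatten_config(detectron_entry).items():
        print(f"_C.{key} = {value!r}")
```

Each leaf value in the detectron entry corresponds to exactly one dotted Detectron2 option, so deeper nesting simply adds more name components.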