Usage

Image Annotating

Before start, prepare annotation data to train a classifier. You should annotate at least two images; one for training and another for validating the model. The annotation shape has to be rectangular. Since vodet uses the sliding-window method to detect the objects, very small sizes of rectangular relative to the size of the image will cause a very slow speed of detection. You should also make margins a little bit larger than the silhouettes of objects for better classification accuracy. You can use either labelme or VoTT. Because labelme supports the zooming of images, I recommend you to use it, especially for high-resolution images. _images/labelme.png A screen shot of labelme’s user interface

Setting up data directories

The data directory should have three subdirectories: train, validation, and unlabelled. The train and validation directory should have source and labels subdirectory, that contain source images and label data respectively. The unlabelled directory should only have source directory. The label data should be VoTT’s CSV-export (single CSV file) or labelme’s JSON files.

Notice: All of the images you want to use in detection should be used in model training. Just push them all in the unlabelled directory, except for labeled images.

Directory structure example

.
├── train
│   ├── labels
│   │   └── IMAG0827.json
│   └── source
│       └── IMAG0827.JPG
├── unlabelled
│   └── source
│       ├── IMAG0417.JPG
│       ├── IMAG0441.JPG
│       ├── IMAG0463.JPG
│       ├── IMAG0485.JPG
.               .
.               .
.               .
│       └── IMAG1735.JPG
└── validation
    ├── labels
    │   └── IMAG0986.json
    └── source
        └── IMAG0986.JPG

Creating an GMVAE instance

from vodet.gmvae import GMVAE
data_dirs = {
    "train" : path_for_your_train_directory,
    "validation" : path_for_your_validation_directory,
    "unlabelled" : path_for_your_unlabelled_directory
}
gmvae = GMVAE(data_dirs)

for example,

from vodet.gmvae import GMVAE
data_dirs = {
    "train" : "./train/",
    "validation" : "./validation/,
    "unlabelled" : "./unlabelled/"
}
gmvae = GMVAE(data_dirs)

Generating patch images

To train the GMVAE classifier model, first, we separate source images into patches with labels based on label data. With train and validation images, vodet read the label data created by annotation tools. A patch that intersects with rectangular annotations will be labeled as the label name (e.g. flower), otherwise others. With unlabelled images, sliding-windows crop images into patches. The patch sizes are randomly selected from that of train label data.

gmvae.set_patches("labelme") # for labelme
gmvae.set_patches("VoTT") # for VoTT

Then patches directory will be created inside the train, validation, unlabelled directory.

.
├── train
│   ├── labels
│   ├── patches
│   │   ├── flower
│   │   └── other
│   └── source
├── unlabelled
│   ├── patches
│   │   └── unlabelled
│   └── source
└── validation
    ├── labels
    ├── patches
    │   ├── flower
    │   └── other
    └── source

Preparing Dataloaders

Next, prepare torch.utils.dataloader with batch size and torchvision.transforms.

Important Notice

  • At least, transforms.Resize((24,24)) and transforms.ToTensor() is required.

  • The size of transforms.Resize() must be (24,24).

from torchvision import transforms
transform = \
    {"labelled":transforms.Compose(
        [transforms.Resize((24,24)),
        transforms.RandomHorizontalFlip(),
        #transforms.ColorJitter(brightness=0.5,contrast=0.3),
        transforms.ToTensor()
        ]),
    "unlabelled":transforms.Compose(
        [transforms.Resize((24,24)),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor()
        ]),
    "validation":transforms.Compose(
        [transforms.Resize((24,24)),
        transforms.ToTensor()
        ])
    }
    
gmvae.set_dataloaders(batch_size=128, transforms=transform)

Model setting

gmvae.set_model(z_dim=8, device="cuda:0")

Then the model structure will be printed. For more details, please refer to the document of pixyz

Model:
Distributions (for training):
  p(x|z), q(z|x,y), p(y|x), p_{prior}(z|y)
Loss function:
  mean \left(- 32.0 \log p(y|x) \right) - mean \left(\mathbb{E}_{q(z,y_{u}|x_{u})} \left[\log p(z,x_{u}|y_{u}) - \log q(z,y_{u}|x_{u}) \right] \right) - mean \left(\mathbb{E}_{q(z|x,y)} \left[\log p(x,z|y) - \log q(z|x,y) \right] \right)
Optimizer:
  Adam (
  Parameter Group 0
      amsgrad: False
      betas: (0.9, 0.999)
      eps: 1e-08
      lr: 0.001
      weight_decay: 0
  )

Training

gmvae.train(epochs=50, precision_th=95.0)

While training, the GMVAE instance automatically saves the model parameters, as its attribute, with the lowest test loss. You can set the threshold of test precision to do this with precision_th.

Metrics of each epoch will be printed.

100%|███████████████████████████████████████████████████████████████████████████████████████████████████| 31/31 [00:01<00:00, 15.90it/s]
Epoch: 1 Train loss: 666.0989
Test Loss: tensor(0.3726, device='cuda:0') Test Recall: {'flower': 100.0, 'other': 90.0} Test Precision: {'flower': 89.74358974358974, 'other': 100.0}

You can also inspect the training run with tensorboardX The runs directory that contains the data of past runs as subdirectories will be created.

├── runs
│   └── vodet_gmvae_20210201_14:08:50
├── source
├── train
│   ├── labels
│   ├── patches
│   │   ├── flower
│   │   └── other
│   └── source
├── unlabelled
│   ├── patches
│   │   └── unlabelled
│   └── source
└── validation
    ├── labels
    ├── patches
    │   ├── flower
    │   └── other
    └── source

In this case, you can start tensorboard by running below in your terminal.

tensorboard --logdir runs/vodet_gmvae_20210201_14:08:50/

_images/tensorboard1.png _images/tensorboard2.png

Detection

First, creat a classifier instance.

d = gmvae.detector(label_type="labelme", conf_th=0.99, step_ratio = 0.5, iou_th=0.05)
  • label_type: “labelme” or “VoTT”

  • conf_th: Confidence threshold

  • step_ratio: The step size of sliding-windows relative to the size of them

  • iou_th: The threshold of Intersection over Union

The detector object can perform detection for either single image or multiple images in a directory.

result_dict = d.detect_img("image_path", "out_path") # Single image
result_df = d.detect_dir("in_directory", "out_directory") # Multiple images in a directory

for example,

result_dict = d.detect_img("./test_data/test.jpg", "test_out.png") # Single image
result_df = d.detect_dir("./test_data/", "./test_out/") # Multiple images in a directory

Both of the function returns the object names and detected numbers, and also draw result image(s) in given path/directory. _images/2013082612.jpg

Utility functions

After running detection, you can plot detected results along time. First, you should prepare a data frame with the image and date column that contains the file name and shooting date of each image. You can generate this automatically by using vodet.utils.exif_date() like below.

# Assume that the "source/" directory contains the original images with EXIF meta data.
from vodet.utils import date_df = exif_date("source/")

Then pass it to the detector instance.

resutl_df = d.draw_barplot(date_df)

_images/detected.png

Acknowledgements

All the example photographs of yellow flowers (Solidago gigantea) are taken at Tomakomai Flux Research Site in Tomakomai National Forest, Hokkaido, Japan, 2013 summer.