cvtk.ml.torchutils

class cvtk.ml.torchutils.DataTransform(shape: int | tuple[int, int], is_train=False)

Pipeline for preprocessing images

The class composes several fundamental image-preprocessing transforms into a torchvision.transforms.Compose instance. It is intended for beginners. Users who want to customize their own preprocessing pipeline are advised to use torchvision.transforms.Compose directly.

Parameters:

shape – The resolution of preprocessed images.
is_train – Generate a pipeline for training if True, otherwise for inference. The training pipeline includes random cropping, flipping, and rotation; the inference pipeline includes only resizing and normalization.

Examples

>>> from cvtk.ml.torchutils import DataTransform
>>>
>>> transform_train = DataTransform(224, is_train=True)
>>> print(transform_train.pipeline)
>>>
>>> transform_inference = DataTransform(224)
>>> print(transform_inference.pipeline)

class cvtk.ml.torchutils.Dataset(datalabel, dataset: str | list | tuple, transform: Compose | DataTransform | None = None, upsampling: bool = False)

A class to manipulate image data for training and inference

Dataset is a class that generates a dataset for training or testing with PyTorch. It loads images from a directory (subdirectories are searched recursively), a list, a tuple, or a tab-separated (TSV) file. In a TSV file, the first column is read as the path to the image and the second column, if present, as the correct label. For training, validation, and testing, the data should be given as TSV files containing both columns.
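The two-column TSV layout described above can be illustrated with a minimal sketch that uses only the Python standard library. The file name, the image paths, and the helper read_image_table are hypothetical; the sketch shows the expected file format, not how Dataset parses the file internally.

```python
import csv
import os
import tempfile


def read_image_table(path):
    """Read a TSV file into (image_path, label) pairs; label is None if absent."""
    records = []
    with open(path, newline="") as fh:
        for row in csv.reader(fh, delimiter="\t"):
            if not row:
                continue  # skip blank lines
            label = row[1] if len(row) > 1 else None
            records.append((row[0], label))
    return records


# Hypothetical training table: image path <TAB> correct label
rows = [
    ("images/leaf_001.jpg", "leaf"),
    ("images/flower_001.jpg", "flower"),
    ("images/root_001.jpg", "root"),
]
tsv_path = os.path.join(tempfile.mkdtemp(), "train.txt")
with open(tsv_path, "w", newline="") as fh:
    csv.writer(fh, delimiter="\t").writerows(rows)

records = read_image_table(tsv_path)
print(records[0])
```

A file written this way could then be passed as the dataset argument, as in the Examples section for this class.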

A model trained on imbalanced data tends to be less sensitive to minority classes, which have few samples, than a model trained on balanced data. Leaving the imbalance unaddressed can therefore cause problems such as reduced accuracy for those classes. It is best to collect balanced data in the first place. When that is difficult, upsampling can be used so that each minority class has as many samples as the majority class. In this class, upsampling is enabled by specifying upsampling=True.

Parameters:

datalabel – A DataLabel instance. This datalabel is used to convert class labels to integers.
dataset – A path to a directory, a list, a tuple, or a TSV file.
transform – A transform pipeline of image processing.
upsampling – If True, minority classes are upsampled so that every class has the same number of images.
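The idea behind upsampling can be sketched as follows. This is a generic random-oversampling illustration under the assumption that minority-class samples are duplicated at random until they match the largest class; the helper upsample and the file names are hypothetical, and the actual strategy used by Dataset may differ.

```python
import random
from collections import Counter


def upsample(samples, seed=0):
    """Duplicate minority-class samples at random until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    by_label = {}
    for path, label in samples:
        by_label.setdefault(label, []).append((path, label))
    target = max(len(items) for items in by_label.values())
    balanced = []
    for items in by_label.values():
        balanced.extend(items)
        # draw the missing samples with replacement from the same class
        balanced.extend(rng.choices(items, k=target - len(items)))
    return balanced


# Hypothetical imbalanced data: 6 leaf, 2 flower, and 3 root images
samples = (
    [(f"leaf_{i}.jpg", "leaf") for i in range(6)]
    + [(f"flower_{i}.jpg", "flower") for i in range(2)]
    + [(f"root_{i}.jpg", "root") for i in range(3)]
)
balanced = upsample(samples)
counts = Counter(label for _, label in balanced)
print(counts)
```

After upsampling, each of the three classes contributes six samples, so no class dominates a training epoch.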

Examples

>>> from cvtk.ml import DataLabel
>>> from cvtk.ml.torchutils import Dataset, DataTransform
>>>
>>> datalabel = DataLabel(['leaf', 'flower', 'root'])
>>>
>>> transform = DataTransform(224, is_train=True)
>>>
>>> dataset = Dataset(datalabel, 'train.txt', transform)
>>> print(len(dataset))
100
>>> img, label = dataset[0]
>>> print(img.shape)
>>> print(label)

class cvtk.ml.torchutils.DataLoader(*args, **kwargs)

Create a dataloader to manage data for training and inference

This class simply creates a torch.utils.data.DataLoader instance to manage data for training and inference.

Parameters:

dataset (cvtk.ml.torchutils.Dataset) – A dataset for training and inference.
batch_size (int) – A batch size for training and inference.
num_workers (int) – The number of workers for data loading.
shuffle (bool) – If True, the data is shuffled at every epoch.

Returns:

A torch.utils.data.DataLoader instance.
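How batch_size and shuffle interact can be shown with a small pure-Python sketch. It mimics the batching behaviour of a dataloader that keeps the last, smaller batch (the PyTorch default, drop_last=False); the helper batch_indices is hypothetical and is not part of cvtk or PyTorch.

```python
import math
import random


def batch_indices(n_samples, batch_size, shuffle=False, seed=None):
    """Yield lists of sample indices, one list per batch."""
    order = list(range(n_samples))
    if shuffle:
        # a real dataloader draws a new order at every epoch
        random.Random(seed).shuffle(order)
    for start in range(0, n_samples, batch_size):
        yield order[start:start + batch_size]


# 100 images with batch_size=32, as in the examples of this module
batches = list(batch_indices(100, batch_size=32, shuffle=True, seed=0))
n_batches = len(batches)
last_batch_size = len(batches[-1])
assert n_batches == math.ceil(100 / 32)
print(n_batches, last_batch_size)  # prints "4 4"
```

With 100 images and a batch size of 32, one epoch consists of four batches, and the last batch holds the remaining 100 - 3 * 32 = 4 images.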

Examples

>>> from cvtk.ml import DataLabel
>>> from cvtk.ml.torchutils import DataTransform, Dataset, DataLoader
>>>
>>> datalabel = DataLabel(['leaf', 'flower', 'root'])
>>> transform = DataTransform(224, is_train=True)
>>> dataset = Dataset(datalabel, 'train.txt', transform)
>>> dataloader = DataLoader(dataset, batch_size=32, num_workers=4)
>>>

class cvtk.ml.torchutils.ModuleCore(datalabel, model, weights=None, workspace=None)

A class that provides training and inference functions for a classification model using PyTorch

ModuleCore is a class that provides training and inference functions for a classification model.

Parameters:

datalabel (str|list|tuple|DataLabel) – A DataLabel instance containing class labels. If a string (file path), list, or tuple is given, it is converted to a DataLabel instance.
model (str|torch.nn.Module) – A string to specify a model or a torch.nn.Module instance.
weights (str) – A file path to model weights.
workspace (str) – A temporary directory path to save intermediate checkpoints and training logs. If not given, the intermediate results are not saved.

device

A device to run the model. Default is ‘cuda’ if available, otherwise ‘cpu’.

Type: str

datalabel

A DataLabel instance containing class labels.

Type: DataLabel

model

A model of torch.nn.Module instance.

Type: torch.nn.Module

workspace

A temporary directory path.

Type: str

train_stats

A dictionary to save training statistics

Type: dict

test_stats

A dictionary to save test statistics

Type: dict

Examples

>>> import torch
>>> import torchvision
>>> from cvtk.ml.torchutils import ModuleCore
>>>
>>> datalabel = ['leaf', 'flower', 'root']
>>> m = ModuleCore(datalabel, 'efficientnet_b7', 'EfficientNet_B7_Weights.DEFAULT')
>>>
>>> datalabel = 'class_label.txt'
>>> m = ModuleCore(datalabel, 'efficientnet_b7', 'EfficientNet_B7_Weights.DEFAULT')

train(train, valid=None, test=None, epoch=20, optimizer=None, criterion=None, resume=False)

Train the model with the provided dataloaders

Train the model with the provided dataloaders. If a workspace directory was given, the intermediate checkpoints and training statistics are saved there.

Parameters:

train (torch.utils.data.DataLoader) – A dataloader for training.
valid (torch.utils.data.DataLoader) – A dataloader for validation.
test (torch.utils.data.DataLoader) – A dataloader for testing.
epoch (int) – The number of epochs to train the model.
optimizer (torch.optim.Optimizer|None) – An optimizer for training. If None (default), torch.optim.SGD is used.
criterion (torch.nn.Module|None) – A loss function for training. If None (default), torch.nn.CrossEntropyLoss is used.
resume (bool) – If True, training resumes from the last checkpoint saved in the temporary directory specified with workspace.

Examples

>>> import torch
>>> from cvtk.ml import DataLabel
>>> from cvtk.ml.torchutils import DataTransform, Dataset, DataLoader, ModuleCore
>>>
>>> datalabel = DataLabel(['leaf', 'flower', 'root'])
>>>
>>> model = ModuleCore(datalabel, 'efficientnet_b7', 'EfficientNet_B7_Weights.DEFAULT')
>>>
>>> # train dataset
>>> transforms_train = DataTransform(600, is_train=True)
>>> dataset_train = Dataset(datalabel, 'train.txt', transforms_train)
>>> dataloader_train = DataLoader(dataset_train, batch_size=32, num_workers=4)
>>> # valid dataset
>>> transforms_valid = DataTransform(600, is_train=False)
>>> dataset_valid = Dataset(datalabel, 'valid.txt', transforms_valid)
>>> dataloader_valid = DataLoader(dataset_valid, batch_size=32, num_workers=4)
>>>
>>> model.train(dataloader_train, dataloader_valid, epoch=20)

save(output)

Save model weights and training logs

Save model weights in a file specified with the output argument. The extension of the output file should be ‘.pth’; if not, ‘.pth’ is appended to the output file path. Additionally, if training logs and test outputs are present, they are saved in text files with the same name as weights but with ‘.train_stats.txt’ and ‘.test_outputs.txt’ extensions, respectively.

Parameters:

output (str) – A file path to save the model weights.
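The naming rule described above can be sketched as follows. The helper derive_output_paths is hypothetical, and the sketch assumes that the log files reuse the weights file name with ‘.pth’ replaced by the new extension; the description does not state this explicitly, so the actual file names may differ.

```python
import os


def derive_output_paths(output):
    """Return the weights path and the companion log paths for `output`."""
    stem, ext = os.path.splitext(output)
    if ext != ".pth":
        stem = output  # keep the full name and append '.pth'
    # Assumption: the log files share the weights file name without '.pth'.
    return {
        "weights": stem + ".pth",
        "train_stats": stem + ".train_stats.txt",
        "test_outputs": stem + ".test_outputs.txt",
    }


paths = derive_output_paths("output/plant_organ_classification.pth")
paths_no_ext = derive_output_paths("output/plant_organ_classification")
print(paths["weights"])
print(paths["train_stats"])
```

Both calls resolve to the same weights path, because ‘.pth’ is appended only when it is missing.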

Examples

>>> import torch
>>> from cvtk.ml import DataLabel
>>> from cvtk.ml.torchutils import DataTransform, Dataset, DataLoader, ModuleCore
>>>
>>> datalabel = DataLabel(['leaf', 'flower', 'root'])
>>> model = ModuleCore(datalabel, 'efficientnet_b7', 'EfficientNet_B7_Weights.DEFAULT')
>>>
>>> # training
>>> # ...
>>> model.save('output/plant_organ_classification.pth')

test(dataloader, criterion=None)

Test the model with the provided dataloader

Parameters:

dataloader (torch.utils.data.DataLoader) – A dataloader for testing.
criterion (torch.nn.Module|None) – A loss function for computing the test loss. If None (default), torch.nn.CrossEntropyLoss is used.

inference(data, value='prob+label', format='pandas', batch_size=32, num_workers=8)

Perform inference with the input images

Perform inference on the input images with the trained model. The content and format of the output can be specified with the value and format arguments.

Parameters:

data (torch.utils.data.DataLoader) – A dataloader for inference.
value (str) – A string to specify which inference results to output: probabilities (‘prob’), labels (‘label’), or both (‘prob+label’).
format (str) – A string to specify the output format: pandas DataFrame (‘pandas’), NumPy array (‘numpy’), list (‘list’), or tuple (‘tuple’).
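The relation between the ‘prob’ and ‘label’ outputs can be sketched in plain Python. The sketch assumes that the reported label is the class with the highest predicted probability, which is the usual convention for classifiers; the helper probs_to_labels is hypothetical and not part of cvtk.

```python
def probs_to_labels(probs, class_labels):
    """Pick the class label with the highest probability for each image."""
    labels = []
    for row in probs:
        best = max(range(len(row)), key=row.__getitem__)
        labels.append(class_labels[best])
    return labels


class_labels = ["leaf", "flower", "root"]
# One row of class probabilities per image, in the order of class_labels
probs = [
    [0.54791, 0.20376, 0.24833],
    [0.06158, 0.02184, 0.91658],
    [0.04723, 0.90061, 0.05216],
]
labels = probs_to_labels(probs, class_labels)
print(labels)  # ['leaf', 'root', 'flower']
```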

Examples

>>> import torch
>>> from cvtk.ml import DataLabel
>>> from cvtk.ml.torchutils import DataTransform, Dataset, DataLoader, ModuleCore
>>>
>>> datalabel = DataLabel(['leaf', 'flower', 'root'])
>>>
>>> model = ModuleCore(datalabel, 'efficientnet_b7', 'plant_organs.pth')
>>>
>>> transform = DataTransform(600)
>>> dataset = Dataset(datalabel, 'sample.jpg', transform)
>>> dataloader = DataLoader(dataset, batch_size=32, num_workers=4)
>>>
>>> probs = model.inference(dataloader)
>>> probs.to_csv('inference_results.txt', sep='\t', header=True, index=True, index_label='image')

cvtk.ml.torchutils.plot_trainlog(train_log, output=None, title='Training Statistics', mode='lines', width=600, height=800, scale=1.0)

Plot training log

Plot the loss and accuracy at each epoch from the training log, which is expected to be a tab-separated file in the following format:

epoch  train_loss  train_acc  valid_loss  valid_acc
1      1.40679     0.22368    1.24780     0.41667
2      1.21213     0.48684    1.09401     0.83334
3      1.00425     0.81578    0.88967     0.83334
4      0.78659     0.82894    0.64055     0.91666
5      0.46396     0.96052    0.39010     0.91666

Parameters:

train_log (str) – A path to a tab-separated file containing training logs.
output (str) – A file path to save the output images. If not provided, the plot is shown on display.
width (int) – The width of the output image.
height (int) – The height of the output image.
scale (float) – The scale of the output image, which is used to adjust the resolution.
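Before plotting, the training log can be inspected with the standard library alone. The sketch below parses a log in the tab-separated format described above and reports the epoch with the lowest validation loss. The helper read_trainlog is hypothetical, and the epoch numbers 1 to 5 are assumed for the sample rows.

```python
import csv
import io

# Sample log in the documented format; epoch numbers are assumed to be 1..5
LOG_TEXT = (
    "epoch\ttrain_loss\ttrain_acc\tvalid_loss\tvalid_acc\n"
    "1\t1.40679\t0.22368\t1.24780\t0.41667\n"
    "2\t1.21213\t0.48684\t1.09401\t0.83334\n"
    "3\t1.00425\t0.81578\t0.88967\t0.83334\n"
    "4\t0.78659\t0.82894\t0.64055\t0.91666\n"
    "5\t0.46396\t0.96052\t0.39010\t0.91666\n"
)

METRICS = ("train_loss", "train_acc", "valid_loss", "valid_acc")


def read_trainlog(fh):
    """Parse a tab-separated training log into a list of dicts."""
    rows = []
    for rec in csv.DictReader(fh, delimiter="\t"):
        row = {"epoch": int(rec["epoch"])}
        row.update({name: float(rec[name]) for name in METRICS})
        rows.append(row)
    return rows


log = read_trainlog(io.StringIO(LOG_TEXT))
best = min(log, key=lambda row: row["valid_loss"])
print(best["epoch"], best["valid_loss"])
```

For a log saved on disk, the same function can be given an open file handle instead of the in-memory sample.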

cvtk.ml.torchutils.plot_cm(test_outputs, output=None, title='Confusion Matrix', xlab='Predicted Label', ylab='True Label', colorscale='YlOrRd', width=600, height=600, scale=1.0)

Plot a confusion matrix from test outputs

Plot a confusion matrix from test outputs. The test outputs are saved in a tab-separated file, where the first column is the path to the image, the second column is the true label, and the remaining columns are the predicted probabilities for each class. An example of the test outputs is shown below:

image  label   leaf     flower   root
JPG  leaf    0.54791  0.20376  0.24833
JPG  root    0.06158  0.02184  0.91658
JPG  leaf    0.70320  0.04808  0.24872
JPG  flower  0.04723  0.90061  0.05216
JPG  flower  0.30027  0.63067  0.06906
JPG  leaf    0.52753  0.43249  0.03998
JPG  root    0.21375  0.14829  0.63796

Parameters:

test_outputs (str) – A path to a tab-separated file containing test outputs.
output (str) – A file path to save the output images. If not provided, the plot is shown on display.
width (int) – The width of the output image.
height (int) – The height of the output image.
scale (float) – The scale of the output image, which is used to adjust the resolution.
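How a confusion matrix follows from a test-output table can be sketched with the standard library. The sketch assumes that the predicted class is the one with the highest probability. The helper confusion_matrix and the image file names are hypothetical, and this is not the plotting code used by plot_cm.

```python
import csv
import io

# Sample test outputs in the documented format; image names are hypothetical
TEST_OUTPUTS = (
    "image\tlabel\tleaf\tflower\troot\n"
    "img1.jpg\tleaf\t0.54791\t0.20376\t0.24833\n"
    "img2.jpg\troot\t0.06158\t0.02184\t0.91658\n"
    "img3.jpg\tleaf\t0.70320\t0.04808\t0.24872\n"
    "img4.jpg\tflower\t0.04723\t0.90061\t0.05216\n"
    "img5.jpg\tflower\t0.30027\t0.63067\t0.06906\n"
    "img6.jpg\tleaf\t0.52753\t0.43249\t0.03998\n"
    "img7.jpg\troot\t0.21375\t0.14829\t0.63796\n"
)


def confusion_matrix(fh):
    """Count (true label, predicted label) pairs; rows are true labels."""
    reader = csv.reader(fh, delimiter="\t")
    classes = next(reader)[2:]  # class names follow the image and label columns
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for row in reader:
        probs = [float(p) for p in row[2:]]
        predicted = probs.index(max(probs))  # class with the highest probability
        matrix[index[row[1]]][predicted] += 1
    return classes, matrix


classes, matrix = confusion_matrix(io.StringIO(TEST_OUTPUTS))
for name, counts in zip(classes, matrix):
    print(name, counts)
```

Rows correspond to true labels and columns to predicted labels, matching the default ylab and xlab arguments. In this sample every image is classified correctly, so only the diagonal is non-zero.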