cvtk.ml.torchutils
- class cvtk.ml.torchutils.DataTransform(shape: int | tuple[int, int], is_train=False)[source]
Pipeline for preprocessing images
The class composes several fundamental image preprocessing transforms and converts them to a torchvision.transforms.Compose instance. It is intended for beginners. Users who want to customize their own image preprocessing pipeline should use torchvision.transforms.Compose directly.
- Parameters:
shape – The resolution of preprocessed images.
is_train – Generate a pipeline for training if True, otherwise for inference. The training pipeline includes random cropping, flipping, and rotation; the inference pipeline includes only resizing and normalization.
Examples
>>> from cvtk.ml.torchutils import DataTransform
>>>
>>> transform_train = DataTransform(224, is_train=True)
>>> print(transform_train.pipeline)
>>>
>>> transform_inference = DataTransform(224)
>>> print(transform_inference.pipeline)
- class cvtk.ml.torchutils.Dataset(datalabel, dataset: str | list | tuple, transform: Compose | DataTransform | None = None, upsampling: bool = False)[source]
A class to manipulate image data for training and inference
Dataset is a class that generates a dataset for training or testing with PyTorch. It loads images from a directory (subdirectories are loaded recursively), a list, a tuple, or a tab-separated (TSV) file. In a TSV file, the first column is read as the path to the image and the second column, if present, as the correct label. For training, validation, and testing, the data should be provided as TSV files containing both columns.
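The TSV layout described above can be read with the Python standard library alone. The following is a minimal sketch, not cvtk's own loader; the function name read_image_tsv and the sample file content are hypothetical and serve only as illustration.

```python
import csv
import io


def read_image_tsv(handle):
    """Return (path, label) pairs; label is None when the second column is absent."""
    records = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row or not row[0].strip():
            continue  # skip blank lines
        path = row[0].strip()
        label = row[1].strip() if len(row) > 1 and row[1].strip() else None
        records.append((path, label))
    return records


# Hypothetical file content: two labeled images and one unlabeled image.
sample = io.StringIO("img/001.jpg\tleaf\nimg/002.jpg\tflower\nimg/003.jpg\n")
records = read_image_tsv(sample)
print(records)
```

Rows without a second column are kept with a label of None, which matches the case of unlabeled images supplied for inference.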
A model trained on imbalanced data tends to be less sensitive to minority classes with small sample sizes than a model trained on balanced data. Models built without properly addressing the imbalance can therefore run into problems in terms of accuracy, computational complexity, and so on. It is best to collect balanced data in the first place. When balanced data are difficult to obtain, upsampling can be used so that each minority class has as many samples as the majority class. In this class, upsampling is enabled by specifying upsampling=True.
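The idea of upsampling can be illustrated with a few lines of standard-library Python. This is a sketch of one common approach (random duplication with replacement), assuming samples are (path, label) pairs; the function name upsample is hypothetical, and the strategy used inside cvtk may differ.

```python
import random
from collections import Counter, defaultdict


def upsample(samples, seed=0):
    """Duplicate randomly chosen minority-class samples until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append((path, label))
    target = max(len(items) for items in by_label.values())
    balanced = []
    for label, items in by_label.items():
        balanced.extend(items)
        # Draw with replacement to fill the gap to the majority class size.
        balanced.extend(rng.choices(items, k=target - len(items)))
    return balanced


# Hypothetical imbalanced data: 5 leaf, 2 flower, and 1 root image.
samples = [("a.jpg", "leaf")] * 5 + [("b.jpg", "flower")] * 2 + [("c.jpg", "root")]
balanced = upsample(samples)
counts = Counter(label for _, label in balanced)
print(counts)
```

After upsampling, each class contributes the same number of samples per epoch, at the cost of showing duplicated minority-class images to the model.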
- Parameters:
datalabel – A DataLabel instance. This datalabel is used to convert class labels to integers.
dataset – A path to a directory, a list, a tuple, or a TSV file.
transform – A transform pipeline of image processing.
upsampling – If True, minority classes are upsampled so that the number of images in each class is balanced.
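The conversion from class labels to integers mentioned for datalabel can be pictured as a simple lookup table. The sketch below is a hypothetical stand-in (LabelIndex is not part of cvtk) and assumes that indices follow the order in which the labels are given; the real DataLabel class may behave differently.

```python
class LabelIndex:
    """Hypothetical stand-in for a label-to-integer mapping (not cvtk's DataLabel)."""

    def __init__(self, labels):
        self.labels = list(labels)
        # Assumption: the index of a label is its position in the input list.
        self._index = {name: i for i, name in enumerate(self.labels)}

    def to_index(self, name):
        return self._index[name]

    def to_label(self, index):
        return self.labels[index]


mapping = LabelIndex(['leaf', 'flower', 'root'])
print(mapping.to_index('flower'), mapping.to_label(2))
```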
Examples
>>> from cvtk.ml import DataLabel
>>> from cvtk.ml.torchutils import Dataset, DataTransform
>>>
>>> datalabel = DataLabel(['leaf', 'flower', 'root'])
>>>
>>> transform = DataTransform(224, is_train=True)
>>>
>>> dataset = Dataset(datalabel, 'train.txt', transform)
>>> print(len(dataset))
100
>>> img, label = dataset[0]
>>> print(img.shape)
>>> print(label)
- class cvtk.ml.torchutils.DataLoader(*args, **kwargs)[source]
Create dataloader to manage data for training and inference
This class simply creates a torch.utils.data.DataLoader instance to manage data for training and inference.
- Parameters:
dataset (cvtk.ml.torchutils.Dataset) – A dataset for training and inference.
batch_size (int) – A batch size for training and inference.
num_workers (int) – The number of workers for data loading.
shuffle (bool) – If True, the data is shuffled at every epoch.
- Returns:
A torch.utils.data.DataLoader instance.
Examples
>>> from cvtk.ml import DataLabel
>>> from cvtk.ml.torchutils import DataTransform, Dataset, DataLoader
>>>
>>> datalabel = DataLabel(['leaf', 'flower', 'root'])
>>> transform = DataTransform(224, is_train=True)
>>> dataset = Dataset(datalabel, 'train.txt', transform)
>>> dataloader = DataLoader(dataset, batch_size=32, num_workers=4)
- class cvtk.ml.torchutils.ModuleCore(datalabel, model, weights=None, workspace=None)[source]
A class that provides training and inference functions for a classification model using PyTorch
ModuleCore is a class that provides training and inference functions for a classification model.
- Parameters:
datalabel (str|list|tuple|DataLabel) – A DataLabel instance containing class labels. If a string (a file path), a list, or a tuple is given, it is converted to a DataLabel instance.
model (str|torch.nn.Module) – A string to specify a model or a torch.nn.Module instance.
weights (str) – A file path to model weights.
workspace (str) – A temporary directory path to save intermediate checkpoints and training logs. If not given, the intermediate results are not saved.
- device
A device to run the model. Default is ‘cuda’ if available, otherwise ‘cpu’.
- Type:
str
- datalabel
A DataLabel instance containing class labels.
- Type:
DataLabel
- model
The model, as a torch.nn.Module instance.
- Type:
torch.nn.Module
- workspace
A temporary directory path.
- Type:
str
- train_stats
A dictionary to save training statistics
- Type:
dict
- test_stats
A dictionary to save test statistics
- Type:
dict
Examples
>>> import torch
>>> import torchvision
>>> from cvtk.ml.torchutils import ModuleCore
>>>
>>> datalabel = ['leaf', 'flower', 'root']
>>> m = ModuleCore(datalabel, 'efficientnet_b7', 'EfficientNet_B7_Weights.DEFAULT')
>>>
>>> datalabel = 'class_label.txt'
>>> m = ModuleCore(datalabel, 'efficientnet_b7', 'EfficientNet_B7_Weights.DEFAULT')
- train(train, valid=None, test=None, epoch=20, optimizer=None, criterion=None, resume=False)[source]
Train the model with the provided dataloaders
Train the model with the provided dataloaders. The training statistics are saved in the temporary directory.
- Parameters:
train (torch.utils.data.DataLoader) – A dataloader for training.
valid (torch.utils.data.DataLoader) – A dataloader for validation.
test (torch.utils.data.DataLoader) – A dataloader for testing.
epoch (int) – The number of epochs to train the model.
optimizer (torch.optim.Optimizer|None) – An optimizer for training. If None (the default), torch.optim.SGD is used.
criterion (torch.nn.Module|None) – A loss function for training. If None (the default), torch.nn.CrossEntropyLoss is used.
resume (bool) – If True, training resumes from the last checkpoint saved in the temporary directory specified with workspace.
Examples
>>> import torch
>>> from cvtk.ml import DataLabel
>>> from cvtk.ml.torchutils import DataTransform, Dataset, DataLoader, ModuleCore
>>>
>>> datalabel = DataLabel(['leaf', 'flower', 'root'])
>>>
>>> model = ModuleCore(datalabel, 'efficientnet_b7', 'EfficientNet_B7_Weights.DEFAULT')
>>>
>>> # train dataset
>>> transforms_train = DataTransform(600, is_train=True)
>>> dataset_train = Dataset(datalabel, 'train.txt', transforms_train)
>>> dataloader_train = DataLoader(dataset_train, batch_size=32, num_workers=4)
>>> # valid dataset
>>> transforms_valid = DataTransform(600, is_train=False)
>>> dataset_valid = Dataset(datalabel, 'valid.txt', transforms_valid)
>>> dataloader_valid = DataLoader(dataset_valid, batch_size=32, num_workers=4)
>>>
>>> model.train(dataloader_train, dataloader_valid, epoch=20)
- save(output)[source]
Save model weights and training logs
Save model weights in a file specified with the output argument. The extension of the output file should be ‘.pth’; if not, ‘.pth’ is appended to the output file path. Additionally, if training logs and test outputs are present, they are saved in text files with the same name as weights but with ‘.train_stats.txt’ and ‘.test_outputs.txt’ extensions, respectively.
- Parameters:
output (str) – A file path to save the model weights.
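The file naming rule described above can be sketched as follows. This is an illustration only: the helper output_paths is hypothetical, and it assumes that the companion files replace the '.pth' extension rather than being appended after it, which the description does not state explicitly.

```python
def output_paths(output):
    """Return the weights path and the assumed companion log paths."""
    # Append '.pth' when the given path does not already end with it.
    weights = output if output.endswith(".pth") else output + ".pth"
    # Assumption: companion files share the weights file name without '.pth'.
    stem = weights[: -len(".pth")]
    return {
        "weights": weights,
        "train_stats": stem + ".train_stats.txt",
        "test_outputs": stem + ".test_outputs.txt",
    }


paths = output_paths("output/plant_organ_classification")
print(paths["weights"])
```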
Examples
>>> import torch
>>> from cvtk.ml import DataLabel
>>> from cvtk.ml.torchutils import DataTransform, Dataset, DataLoader, ModuleCore
>>>
>>> datalabel = DataLabel(['leaf', 'flower', 'root'])
>>> model = ModuleCore(datalabel, 'efficientnet_b7', 'EfficientNet_B7_Weights.DEFAULT')
>>>
>>> # training
>>> # ...
>>> model.save('output/plant_organ_classification.pth')
- test(dataloader, criterion=None)[source]
Test the model with the provided dataloader
Test the model with the provided dataloader.
- Parameters:
dataloader (torch.utils.data.DataLoader) – A dataloader for testing.
criterion (torch.nn.Module|None) – A loss function for testing. If None (the default), torch.nn.CrossEntropyLoss is used.
- inference(data, value='prob+label', format='pandas', batch_size=32, num_workers=8)[source]
Perform inference with the input images
Perform inference on the input images with the trained model. The content and format of the output can be specified with the value and format arguments.
- Parameters:
data (torch.utils.data.DataLoader) – A dataloader for inference.
value (str) – A string to specify which inference results to output. Probabilities (‘prob’), labels (‘label’), or both (‘prob+label’) can be specified.
format (str) – A string to specify the output format: pandas DataFrame (‘pandas’), NumPy array (‘numpy’), list (‘list’), or tuple (‘tuple’).
Examples
>>> import torch
>>> from cvtk.ml import DataLabel
>>> from cvtk.ml.torchutils import DataTransform, Dataset, DataLoader, ModuleCore
>>>
>>> datalabel = DataLabel(['leaf', 'flower', 'root'])
>>>
>>> model = ModuleCore(datalabel, 'efficientnet_b7', 'plant_organs.pth')
>>>
>>> transform = DataTransform(600)
>>> dataset = Dataset(datalabel, 'sample.jpg', transform)
>>> dataloader = DataLoader(dataset, batch_size=32, num_workers=4)
>>>
>>> probs = model.inference(dataloader)
>>> probs.to_csv('inference_results.txt', sep = ' ', header=True, index=True, index_label='image')
- cvtk.ml.torchutils.plot_trainlog(train_log, output=None, title='Training Statistics', mode='lines', width=600, height=800, scale=1.0)[source]
Plot training log
Plot loss and accuracy at each epoch from the training log which is expected to be saved in a tab-separated file with the following format:
epoch  train_loss  train_acc  valid_loss  valid_acc
1      1.40679     0.22368    1.24780     0.41667
2      1.21213     0.48684    1.09401     0.83334
3      1.00425     0.81578    0.88967     0.83334
4      0.78659     0.82894    0.64055     0.91666
5      0.46396     0.96052    0.39010     0.91666
- Parameters:
train_log (str) – A path to a tab-separated file containing training logs.
output (str) – A file path to save the output image. If not provided, the plot is displayed on screen.
width (int) – The width of the output image.
height (int) – The height of the output image.
scale (float) – The scale of the output image, which is used to adjust the resolution.
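Before plotting, it can be useful to inspect the training log directly. The sketch below parses the tab-separated format shown above with the standard library and picks the epoch with the lowest validation loss; the helper read_trainlog is hypothetical and independent of plot_trainlog.

```python
import csv
import io

# The example log from above, written out with explicit tab separators.
log_text = (
    "epoch\ttrain_loss\ttrain_acc\tvalid_loss\tvalid_acc\n"
    "1\t1.40679\t0.22368\t1.24780\t0.41667\n"
    "2\t1.21213\t0.48684\t1.09401\t0.83334\n"
    "3\t1.00425\t0.81578\t0.88967\t0.83334\n"
    "4\t0.78659\t0.82894\t0.64055\t0.91666\n"
    "5\t0.46396\t0.96052\t0.39010\t0.91666\n"
)


def read_trainlog(handle):
    """Parse a tab-separated training log into a list of dicts with numeric values."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        rows.append({
            key: (int(value) if key == "epoch" else float(value))
            for key, value in row.items()
        })
    return rows


rows = read_trainlog(io.StringIO(log_text))
best = min(rows, key=lambda r: r["valid_loss"])
print(best["epoch"], best["valid_loss"])
```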
- cvtk.ml.torchutils.plot_cm(test_outputs, output=None, title='Confusion Matrix', xlab='Predicted Label', ylab='True Label', colorscale='YlOrRd', width=600, height=600, scale=1.0)[source]
Plot a confusion matrix from test outputs
Plot a confusion matrix from test outputs. The test outputs are saved in a tab-separated file, where the first column is the path to the image, the second column is the true label, and the following columns are the predicted probabilities for each class. The example of the test outputs is as follows:
image  label   leaf     flower   root
1.JPG  leaf    0.54791  0.20376  0.24833
2.JPG  root    0.06158  0.02184  0.91658
3.JPG  leaf    0.70320  0.04808  0.24872
4.JPG  flower  0.04723  0.90061  0.05216
5.JPG  flower  0.30027  0.63067  0.06906
6.JPG  leaf    0.52753  0.43249  0.03998
7.JPG  root    0.21375  0.14829  0.63796
- Parameters:
test_outputs (str) – A path to a tab-separated file containing test outputs.
output (str) – A file path to save the output image. If not provided, the plot is displayed on screen.
width (int) – The width of the output image.
height (int) – The height of the output image.
scale (float) – The scale of the output image, which is used to adjust the resolution.
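The counts behind a confusion matrix can be derived from the test outputs by taking the class with the highest probability as the prediction. The following is a minimal standard-library sketch of that step, assuming the file layout shown above; the helper confusion_matrix is hypothetical and is not the code used by plot_cm.

```python
import csv
import io

# The example test outputs from above, written out with explicit tab separators.
test_text = (
    "image\tlabel\tleaf\tflower\troot\n"
    "1.JPG\tleaf\t0.54791\t0.20376\t0.24833\n"
    "2.JPG\troot\t0.06158\t0.02184\t0.91658\n"
    "3.JPG\tleaf\t0.70320\t0.04808\t0.24872\n"
    "4.JPG\tflower\t0.04723\t0.90061\t0.05216\n"
    "5.JPG\tflower\t0.30027\t0.63067\t0.06906\n"
    "6.JPG\tleaf\t0.52753\t0.43249\t0.03998\n"
    "7.JPG\troot\t0.21375\t0.14829\t0.63796\n"
)


def confusion_matrix(handle):
    """Return (classes, matrix) where matrix[true][predicted] holds counts."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    classes = header[2:]  # class names follow the image and label columns
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for row in reader:
        if not row:
            continue  # skip blank lines
        probs = [float(p) for p in row[2:]]
        predicted = probs.index(max(probs))  # class with the highest probability
        matrix[index[row[1]]][predicted] += 1
    return classes, matrix


classes, matrix = confusion_matrix(io.StringIO(test_text))
print(classes)
print(matrix)
```

Rows of the matrix correspond to true labels and columns to predicted labels, matching the default axis titles of plot_cm.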