MuarAugment: Easiest Way to SOTA Data Augmentation
I wanted an easy way to get a state-of-the-art image augmentation pipeline with no manual iteration, no separate models to train, and no thinking. To provide that, I created MuarAugment (Model Uncertainty- And Randomness-based Augmentation), a GPU-supported Python package built on PyTorch, Albumentations, and Kornia.
This article guides you through incorporating MuarAugment into your computer vision pipelines.
There are a few resources you can use to master MuarAugment, including Colab tutorials that demonstrate it; most of the material in this article comes from those.
- Here is an end-to-end classification task using MuarAugment’s algorithm MuAugment on Colab. (1)
- Here is a template for using MuarAugment’s implementation of RandAugment on Colab. (2)
- Also see my previous Medium post for a survey of automatic data augmentation methods and the motivation for MuarAugment, as well as a walk through the RandAugment and MuAugment algorithms.
- The MuarAugment GitHub contains the code and additional resources (give a star if and only if you find MuarAugment useful).
High-level Walkthrough
MuarAugment adapts the leading pipeline search algorithms, RandAugment[1] and the model uncertainty-based augmentation scheme[2] (called MuAugment here), to work batch-wise on the GPU. Kornia[3] and Albumentations handle the batch-wise and item-wise transformations, respectively.
Let’s get into how to use MuarAugment.
You can install MuarAugment via pip: `!pip install MuarAugment`
MuAugment
MuAugment is the heavy-duty automatic data augmentation policy in MuarAugment. It is simpler than GAN-based automatic data augmentation pipelines, but more complex than RandAugment or manual augmentation.
MuAugment’s training is puzzlingly unstable on the validation set (while the training loss decreases monotonically), but it ends up performing significantly better than competing methods.
For MuAugment, simply modify the training logic and train like normal.
In PyTorch Lightning:
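Below is a minimal sketch of that modification, following the usage shown in the package’s Colab tutorial; the argument names (`N_TFMS`, `MAGN`, `N_COMPS`, `N_SELECTED`), the batch format, and the toy classifier are assumptions on my part, so defer to tutorial (1) for the authoritative version.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import pytorch_lightning as pl
from muar.augmentations import BatchRandAugment, MuAugment

class LitClassifier(pl.LightningModule):
    def __init__(self, n_classes=10, n_tfms=3, magn=4, n_compositions=4, n_selected=2):
        super().__init__()
        # Toy classifier; swap in your own backbone.
        self.model = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, n_classes),
        )
        # N_TFMS transforms of magnitude MAGN per composition; MuAugment applies
        # N_COMPS compositions to each image and keeps the N_SELECTED with the
        # highest model loss (argument names assumed from the tutorial).
        rand_augment = BatchRandAugment(N_TFMS=n_tfms, MAGN=magn)
        self.mu_transform = MuAugment(rand_augment, N_COMPS=n_compositions, N_SELECTED=n_selected)

    def training_step(self, batch, batch_idx):
        # Give MuAugment access to the current model so it can score compositions.
        self.mu_transform.setup(self)
        images, labels = self.mu_transform(batch)  # batch assumed to be (images, labels)
        logits = self.model(images)
        return F.cross_entropy(logits, labels)

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)
```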
In pure PyTorch:
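And a corresponding sketch of a plain training loop; again, the exact argument names and the `(images, labels)` batch format are assumptions, and the random-tensor data is only there to make the snippet self-contained.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset
from muar.augmentations import BatchRandAugment, MuAugment

device = "cuda" if torch.cuda.is_available() else "cpu"

# Placeholder model and data; substitute your own classifier and DataLoader.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10),
).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
train_loader = DataLoader(
    TensorDataset(torch.rand(64, 3, 32, 32), torch.randint(0, 10, (64,))),
    batch_size=16,
)

rand_augment = BatchRandAugment(N_TFMS=3, MAGN=4)
mu_transform = MuAugment(rand_augment, N_COMPS=4, N_SELECTED=2)

for epoch in range(5):
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        mu_transform.setup(model)             # score compositions with the current model
        images, labels = mu_transform((images, labels))
        optimizer.zero_grad()
        loss = F.cross_entropy(model(images), labels)
        loss.backward()
        optimizer.step()
```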
See the Colab notebook tutorial on MuAugment (1) for more detail.
RandAugment
MuarAugment also provides a straightforward implementation of RandAugment, a simple but effective automatic data augmentation policy, using Albumentations:
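As a rough sketch of the item-wise usage, here is a `Dataset` that applies `AlbumentationsRandAugment` in `__getitem__`; the argument names and the Albumentations-style `transform(image=...)["image"]` call pattern are assumptions on my part, so check tutorial (#2) for the exact API.

```python
import numpy as np
from torch.utils.data import Dataset
from muar.augmentations import AlbumentationsRandAugment

class RandAugmentDataset(Dataset):
    """Applies RandAugment item-wise on the CPU, before batching."""

    def __init__(self, images, labels, n_tfms=2, magn=4):
        self.images, self.labels = images, labels
        # N_TFMS transforms of magnitude MAGN, sampled anew for each item
        # (argument names assumed from the tutorial).
        self.rand_augment = AlbumentationsRandAugment(N_TFMS=n_tfms, MAGN=magn)

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        image = np.asarray(self.images[idx])
        # Assumed Albumentations-style call; adapt if the actual API differs.
        image = self.rand_augment(image=image)["image"]
        return image, self.labels[idx]
```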
See the Colab notebook tutorial (#2) for more detail on AlbumentationsRandAugment.
Enjoy
I will use MuarAugment when training neural nets that input images, using MuAugment for image classification and RandAugment for other tasks. (Currently, I am using AlbumentationsRandAugment in a generative model composed of transformers.) I hope you’ll find this package to be as useful as I do.
Papers Referenced
1. Cubuk, Ekin et al. “RandAugment: Practical data augmentation with no separate search,” 2019, [arXiv](http://arxiv.org/abs/1909.13719).
2. Wu, Sen et al. “On the Generalization Effects of Linear Transformations in Data Augmentation,” 2020, [arXiv](https://arxiv.org/abs/2005.00695).
3. Riba, Edgar et al. “Kornia: an Open Source Differentiable Computer Vision Library for PyTorch,” 2019, [arXiv](https://arxiv.org/abs/1910.02190).