MuarAugment: Easiest Way to SOTA Data Augmentation

Adam Mehdi
Jul 27, 2021
All images are created by the author.

I wanted an easy way to get a state-of-the-art image augmentation pipeline with no manual iteration, no separate models to train and no thinking. To provide that, I created MuarAugment (Model Uncertainty- And Randomness-based Augmentation), a GPU-supported Python package built on PyTorch, Albumentations and Kornia.

This article guides you through incorporating MuarAugment into your computer vision pipelines.

A few resources can help you master MuarAugment. Most of the material in this article comes from the Colab tutorials listed first.

  • Here is an end-to-end classification task using MuarAugment’s algorithm MuAugment on Colab. (1)
  • Here is a template to use MuarAugment’s implementation of RandAugment on Colab. (2)
  • Also see my previous Medium post for a survey of automatic data augmentation methods and the motivation for MuarAugment, as well as a walkthrough of the RandAugment and MuarAugment algorithms.
  • The MuarAugment GitHub contains the code and additional resources (give a star if and only if you find MuarAugment useful).

High-level Walkthrough

MuarAugment adapts the leading pipeline search algorithms, RandAugment[1] and the model uncertainty-based augmentation scheme[2] (called MuAugment here), and modifies them to work batch-wise, on the GPU. Kornia[3] and Albumentations are used for batch-wise and item-wise transformations respectively.

Let’s see how to use MuarAugment.

# You can install MuarAugment via pip:
!pip install MuarAugment

MuAugment

MuAugment is the heavy-duty automatic data augmentation policy in MuarAugment. It is simpler than GAN-based automatic data augmentation pipelines, but more complex than RandAugment or manual augmentation.

MuAugment’s training is puzzlingly unstable on the validation set (even though the training loss decreases monotonically), but it ends up performing significantly better than competing methods.

Record of training vanilla ResNet50s on the Kaggle Plant Dataset.

For MuAugment, simply modify the training logic and train like normal.

In PyTorch Lightning:

In pure PyTorch:

See the Colab notebook tutorial on MuAugment (1) for more detail.
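To make the idea of modified training logic more concrete, here is a minimal, framework-free sketch of the selection step behind uncertainty-based augmentation [2]: generate several augmented candidates for an item, score each one with the current model's loss, and keep only the hardest ones for the actual training step. This is not MuarAugment's API. The names `mu_augment_select`, `n_compositions` and `n_selected`, along with the toy `augment` and `loss_fn`, are hypothetical and exist only for illustration.

```python
import heapq
import random


def mu_augment_select(item, augment, loss_fn, n_compositions, n_selected):
    """Return the n_selected augmented copies of `item` with the highest loss.

    Hypothetical helper: `augment` maps an item to a randomly augmented copy,
    and `loss_fn` maps an augmented copy to the current model's loss on it.
    """
    if n_selected > n_compositions:
        raise ValueError("n_selected cannot exceed n_compositions")
    # Draw several independent augmentations of the same item.
    candidates = [augment(item) for _ in range(n_compositions)]
    # Keep the candidates the model currently finds hardest (highest loss first).
    return heapq.nlargest(n_selected, candidates, key=loss_fn)


# Toy stand-ins: the "image" is a single number, the "augmentation" adds noise,
# and the "loss" is the squared distance from a fixed target.
rng = random.Random(0)


def augment(x):
    return x + rng.uniform(-1.0, 1.0)


def loss_fn(x):
    return (x - 3.0) ** 2


hardest = mu_augment_select(3.0, augment, loss_fn, n_compositions=8, n_selected=2)
print(hardest)
```

In a real PyTorch loop, the scoring pass would typically run without gradient tracking, and only the selected candidates would go through the usual forward and backward pass.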

RandAugment

MuarAugment also provides a straightforward implementation of RandAugment, a simple but effective automatic data augmentation policy, using Albumentations:

See the Colab notebook tutorial (2) for more detail on AlbumentationsRandAugment.
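For intuition, the core of RandAugment [1] is small: pick N transforms uniformly at random and apply all of them at one shared magnitude M. The sketch below samples such a policy in plain Python. The operation names, the function `sample_randaugment_policy` and the `max_magn` scale are assumptions for illustration, not the transform list or signature used by MuarAugment or Albumentations.

```python
import random

# Hypothetical operation names, for illustration only.
OPS = ["rotate", "shear_x", "translate_y", "brightness", "contrast", "solarize"]


def sample_randaugment_policy(n_tfms, magn, ops=OPS, max_magn=10, rng=random):
    """Sample n_tfms operations uniformly (with replacement) at a shared strength.

    Returns a list of (operation_name, strength) pairs, where strength is the
    magnitude rescaled to the range [0, 1].
    """
    if not 0 <= magn <= max_magn:
        raise ValueError("magn must lie between 0 and max_magn")
    strength = magn / max_magn
    return [(rng.choice(ops), strength) for _ in range(n_tfms)]


# A fresh policy would be sampled for every image (or every batch).
policy = sample_randaugment_policy(n_tfms=2, magn=4, rng=random.Random(0))
print(policy)
```

Because the whole search space collapses to the two numbers N and M, tuning RandAugment amounts to a small grid search rather than training a separate policy model.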

Enjoy

I will use MuarAugment when training neural nets that take images as input, using MuAugment for image classification and RandAugment for other tasks. (Currently, I am using AlbumentationsRandAugment in a generative model composed of transformers.) I hope you’ll find this package to be as useful as I do.

Papers Referenced

1. Cubuk, Ekin et al. “RandAugment: Practical data augmentation with no separate search,” 2019, [arXiv](http://arxiv.org/abs/1909.13719).
2. Wu, Sen et al. “On the Generalization Effects of Linear Transformations in Data Augmentation,” 2020, [arXiv](https://arxiv.org/abs/2005.00695).
3. Riba, Edgar et al. “Kornia: an Open Source Differentiable Computer Vision Library for PyTorch,” 2019, [arXiv](https://arxiv.org/abs/1910.02190).

Adam Mehdi

Thinking about AI & epistemology. Researching CV & ML as published Assistant Researcher. Studying CS @ Columbia Engineering.