[D] Thinking about augmentation as invariance assumptions
Data augmentation is still used much more heuristically than it should be.
A training pipeline can easily turn into a stack of intuition, older project defaults, and transforms borrowed from papers or blog posts. The hard part is not adding augmentations. The hard part is reasoning about them: what invariance is each transform trying to impose, when is that invariance valid, how strong should the transform be, and when does it start corrupting the training signal instead of improving generalization?
The examples I have in mind come mostly from computer vision, but the underlying issue is broader. A useful framing is: every augmentation is an invariance assumption.
That framing sounds clean, but in practice it gets messy quickly. A transform may be valid for one task and destructive for another. It may help at one strength and hurt at another. Even when the label stays technically unchanged, the transform can still wash out the signal the model needs.
I wrote a longer version of this argument with concrete examples and practical details; the link is in the first comment because weekday posts here need to be text-only.
I’d be very interested to learn from your experience: - where this framing works well - where it breaks down - how you validate that an augmentation is really label-preserving instead of just plausible
[link] [comments]
Want to read more?
Check out the full article on the original site