# CSE5519 Advances in Computer Vision (Topic F: 2021 and before: Representation Learning)

## A Simple Framework for Contrastive Learning of Visual Representations

[link to the paper](https://arxiv.org/pdf/2002.05709)

~~Laughing my ass off when I see 75% accuracy on ImageNet. Can't believe what the authors must think a few years later, now that Deep Learning has become the dominant paradigm in Computer Vision.~~

> In this work, we introduce a simple framework for contrastive learning of visual representations, which we call SimCLR.

Wait, that IS a NEURAL NETWORK?

## General Framework

- A stochastic data augmentation module
- A neural network base encoder $f(\cdot)$
- A small neural network projection head $g(\cdot)$
- A contrastive loss function

## Novelty in SimCLR

Self-supervised contrastive learning, driven by strong, composed data augmentations.

> [!TIP]
>
> In the section "Training with Large Batch Size", the authors mention:
>
> > To keep it simple, we do not train the model with a memory bank (Wu et al., 2018; He et al., 2019). Instead, we vary the training batch size $N$ from 256 to 8192. A batch size of 8192 gives us 16382 negative examples per positive pair from both augmentation views.
>
> They use the LARS optimizer to stabilize training.
>
> What does "memory bank" mean here? And what is the LARS optimizer, and how does it benefit the training?
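The 16382 figure in the quote above comes from the in-batch negative scheme: a batch of $N$ images yields $2N$ augmented views, and for each positive pair the remaining views in the batch act as negatives, so

$$
2(N - 1) = 2 \times (8192 - 1) = 16382.
$$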
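For reference, the contrastive loss listed under General Framework is the paper's NT-Xent (normalized temperature-scaled cross-entropy). For a positive pair $(i, j)$ among the $2N$ views,

$$
\ell_{i,j} = -\log \frac{\exp(\mathrm{sim}(z_i, z_j)/\tau)}{\sum_{k=1}^{2N} \mathbb{1}_{[k \neq i]} \exp(\mathrm{sim}(z_i, z_k)/\tau)},
$$

where $\mathrm{sim}(u, v) = u^\top v / (\lVert u \rVert\,\lVert v \rVert)$ is cosine similarity and $\tau$ is a temperature; the total loss averages $\ell$ over all positive pairs in both orders.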
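To make the four framework components concrete, here is a minimal PyTorch sketch of how $f(\cdot)$, $g(\cdot)$, and NT-Xent could be wired together. It assumes a ResNet-50 encoder with its classifier removed; the dimensions and temperature are illustrative defaults, not the paper's exact training recipe.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

# Base encoder f(.): ResNet-50 with the classifier replaced by an identity,
# so it outputs 2048-d representations h.
encoder = torchvision.models.resnet50()
encoder.fc = nn.Identity()

# Projection head g(.): small MLP mapping h into the space where the loss is applied.
projection_head = nn.Sequential(
    nn.Linear(2048, 2048),
    nn.ReLU(inplace=True),
    nn.Linear(2048, 128),
)

def nt_xent_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
    """NT-Xent over a batch of N positive pairs (2N augmented views)."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # (2N, d), unit norm
    sim = z @ z.t() / temperature                       # cosine similarities
    sim.fill_diagonal_(float("-inf"))                   # exclude self-similarity
    # The positive for view i is its other augmentation at index (i + N) mod 2N;
    # the remaining 2(N - 1) views in the batch act as negatives.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

# One training step: two augmentations of the same images -> f -> g -> loss.
x1 = torch.randn(8, 3, 224, 224)  # stand-in for the first augmented view
x2 = torch.randn(8, 3, 224, 224)  # stand-in for the second augmented view
z1, z2 = projection_head(encoder(x1)), projection_head(encoder(x2))
loss = nt_xent_loss(z1, z2)
loss.backward()
```

Note that implementing the loss as a cross-entropy over the similarity matrix is just one convenient formulation; masking the diagonal with $-\infty$ zeroes out the self-similarity terms inside the softmax.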