# CSE5519 Advances in Computer Vision (Topic F: 2021 and before: Representation Learning)
## A Simple Framework for Contrastive Learning of Visual Representations
[link to the paper](https://arxiv.org/pdf/2002.05709)

~~Laughing my ass off when I see 75% accuracy on ImageNet. Can't believe what the authors think a few years later, now that Deep Learning has become the dominant paradigm in Computer Vision.~~

> In this work, we introduce a simple framework for contrastive learning of visual representations, which we call SimCLR.

Wait, that IS a NEURAL NETWORK?
## General Framework
The framework comprises four major components:

- A stochastic data augmentation module
- A neural network base encoder $f(\cdot)$
- A small neural network projection head $g(\cdot)$
- A contrastive loss function

Maximizing agreement between the two augmented views of the same image, via a contrastive loss applied to the projections, is what ties these pieces together.
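To make this concrete, here is a minimal PyTorch-style sketch of one training step. It is an illustration under my own assumptions (ResNet-18 backbone, 128-d projections, a hand-rolled `nt_xent_loss` helper, toy random inputs), not the authors' implementation.

```python
# Minimal sketch of the four SimCLR components (illustrative choices throughout).
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models, transforms

# 1) Stochastic data augmentation: each image is transformed twice into two
#    correlated "views" (roughly a crop + color-distortion recipe).
augment = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(0.4, 0.4, 0.4, 0.1),
    transforms.RandomGrayscale(p=0.2),
    transforms.ToTensor(),
])

# 2) Base encoder f(.): any backbone; its pooled output h is the representation.
encoder = models.resnet18(weights=None)
encoder.fc = nn.Identity()  # expose the 512-d features instead of class logits

# 3) Projection head g(.): small MLP mapping h to the space where the loss is applied.
projection_head = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 128))

# 4) Contrastive loss (NT-Xent): for each view, the positive is the other view of
#    the same image; the remaining 2(N-1) batch examples act as negatives.
def nt_xent_loss(z1, z2, temperature=0.5):
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, d), unit norm
    sim = z @ z.t() / temperature                        # scaled cosine similarities
    mask = torch.eye(2 * n, dtype=torch.bool)
    sim = sim.masked_fill(mask, float("-inf"))           # exclude self-similarity
    # The positive for row i is row i + N (and vice versa).
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

# One toy step: random tensors stand in for augment(img) batches of two views.
x1, x2 = torch.randn(8, 3, 224, 224), torch.randn(8, 3, 224, 224)
h1, h2 = encoder(x1), encoder(x2)                  # representations h
z1, z2 = projection_head(h1), projection_head(h2)  # projections z for the loss
loss = nt_xent_loss(z1, z2)
print(loss.item())
```

Only the encoder output $h$ is kept for downstream evaluation; the projection head and the contrastive loss are used during pretraining only.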
## Novelty in SimCLR
Self-supervised contrastive learning in which the composition of data augmentations, a learnable nonlinear projection head, and large batch sizes are the key ingredients (the learned representations also do well in semi-supervised fine-tuning with few labels).
> [!TIP]
>
> In the section "Training with Large Batch Size", the authors mention:
>
> > To keep it simple, we do not train the model with a memory bank (Wu et al., 2018; He et al., 2019). Instead, we vary the training batch size N from 256 to 8192. A batch size of 8192 gives us 16382 negative examples per positive pair from both augmentation views.
>
> To stabilize training at these batch sizes, they use the LARS optimizer.
>
> What does "memory bank" mean here? And what is the LARS optimizer, and how does it benefit the training?
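>
> Side note on the arithmetic in the quote: with batch size $N$, the two augmentation views give $2N$ examples per batch, and each positive pair treats the other $2(N - 1)$ examples as negatives, so $N = 8192$ gives $2 \times (8192 - 1) = 16382$ negatives.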