Paper Review Series— Domain Generalization Research

Kevin Li
5 min read · Apr 13, 2021

Hey guys, this post is intended to record the papers that I’ve studied throughout my term as an undergraduate researcher at National Taiwan University, focusing on Domain Generalization in Computer Vision. I will list the papers I believe are relevant, and continuously update this post with new blog post reviews as I write them. The goal isn’t just to recite the proposed models, but to promote discussion of each model’s pros and cons so as to explore potential directions for improvement. Without further ado, let’s start with a gentle introduction to domain generalization.

Topics of Papers

  • Domain Generalization: Meta-learning methods, data augmentation methods, adversarial training methods, etc.
  • Self-supervised Learning
  • Transformers in Computer Vision (to be updated)

What is Domain Shift and why it matters

Domain shift, a.k.a. distribution shift, is a critical problem in almost every field of machine learning. An intuitive example: an autopilot system trained on a dataset collected in New York may perform poorly in New Delhi, since the cars may not have the same shapes, traffic congestion is worse, and so on. Neural networks are good at providing deterministic answers, but however precise they are, models can only be as reliable as the distribution they were trained on, which is why out-of-distribution domain shift has been so challenging to solve.

To address out-of-distribution domain shift, the conventional training-validation split paradigm is no longer sufficient, since both sets come from the same distribution. One field that promotes the robustness of models is Domain Generalization, involving methods such as multi-domain learning (providing diverse cross-domain cues for the neural network to learn from), meta-learning (simulating domain shift within the training procedure), data hallucination, and adversarial training (challenging the model’s dependence on irrelevant cues by adding human-imperceptible noise), just to name a few. The end goal of DG (short for Domain Generalization) is a model, trained on a finite set of source domains, that is ready to use out of the box on other domains.

However, most of the methods mentioned above still overfit to some extent to the source domains they are trained on. The recent advent of self-supervised representation learning methods may shed some light on possible alternative approaches.
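
To make the meta-learning idea above concrete, here is a toy first-order sketch in NumPy, loosely in the spirit of MLDG (Li et al., AAAI 2018): each episode holds one source domain out as a simulated target, and the parameter update is asked to reduce loss on that held-out domain as well. Everything here (the linear task, the domain shifts, the learning rates) is synthetic and chosen for illustration only, not taken from any particular paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for multiple source domains: the same linear task,
# but each domain's inputs are shifted differently.
def make_domain(shift, n=64, d=5):
    X = rng.normal(loc=shift, size=(n, d))
    w_true = np.arange(1.0, d + 1.0)
    y = X @ w_true + 0.1 * rng.normal(size=n)
    return X, y

domains = [make_domain(s) for s in (-1.0, 0.0, 1.0)]

def grad(w, X, y):
    # Gradient of mean squared error.
    return 2.0 * X.T @ (X @ w - y) / len(y)

w = np.zeros(5)
inner_lr = outer_lr = 0.01
for step in range(500):
    # Simulate domain shift inside training: hold one source domain
    # out as "meta-test" for this episode.
    held_out = step % len(domains)
    meta_train = [d for i, d in enumerate(domains) if i != held_out]
    g_train = sum(grad(w, X, y) for X, y in meta_train)
    # The virtual inner step adapts on the meta-train domains...
    w_inner = w - inner_lr * g_train
    # ...and the outer update also asks that adapted step to work
    # on the held-out domain (first-order, MLDG-style objective).
    g_test = grad(w_inner, *domains[held_out])
    w -= outer_lr * (g_train + g_test)
```

The key difference from ordinary multi-domain training is the second gradient term: the update is rewarded not just for fitting the meta-train domains, but for producing parameters whose adapted version also works on a domain it did not train on.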

Related Topics to Domain Generalization

The field of domain generalization was originally inspired by few-shot learning methods. Two closely related problem settings also address domain shift:

  • Domain Adaptation: DA can be considered a subfield of Transfer Learning. DA models are also trained on a set of source domains, aiming to perform well on novel target domains. The difference is that a small amount of labeled target-domain data is available for the model to adapt to.
  • Unsupervised Domain Adaptation: UDA models are also trained on labeled data from a source domain to achieve better performance on data from a target domain, with access to only unlabeled data in the target domain.

Types of Domain Generalization Problems

There are two main types of DG problem settings: homogeneous and heterogeneous domain generalization. Homogeneous DG is when the label space of the target domain(s) is the same as that of the source domains. In this setting, the model could be trained on cartoons and paintings of horses, and then asked to identify horses in photos.

Homogeneous DG example (source: PACS dataset paper)
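
The homogeneous setting also fixes the standard evaluation protocol on benchmarks like PACS: leave one domain out, train on the rest, and test on the held-out domain. A minimal sketch of that loop (the `train` and `evaluate` callables below are placeholders, not any particular library’s API):

```python
# PACS's four domains; `train` and `evaluate` are placeholder callables
# standing in for a real training pipeline.
DOMAINS = ["art_painting", "cartoon", "photo", "sketch"]

def leave_one_domain_out(domains, train, evaluate):
    """Train on all-but-one domain; test on the held-out one."""
    results = {}
    for held_out in domains:
        sources = [d for d in domains if d != held_out]
        model = train(sources)              # fit on source domains only
        results[held_out] = evaluate(model, held_out)
    return results
```

Reported DG numbers on PACS are typically the per-held-out-domain accuracies from this loop, plus their average.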

Heterogeneous DG is a more challenging setting where the target domains’ label space is different from, and potentially disjoint with, that of the source domains. A heterogeneous DG model may be trained on images of traffic signs, but then asked to identify aircraft. While the distribution mismatch is severe, this is arguably a more realistic scenario, considering the ubiquitous ImageNet pretraining paradigm. For many practical applications, when fine-tuning data is not available, a standard practice is to use an ImageNet CNN off the shelf as a fixed feature extractor, and then train a shallow classifier on top for the new problem.

Heterogeneous DG example: Visual Decathlon Benchmark
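
As a sketch of that off-the-shelf practice, suppose the ImageNet features have already been extracted; all that remains is fitting a shallow classifier, e.g. softmax regression. The random vectors below stand in for activations from a frozen CNN (the data, dimensions, and hyperparameters are synthetic, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for features from a frozen ImageNet CNN: 2 classes, 512-dim.
n, d, k = 200, 512, 2
centers = rng.normal(size=(k, d))
labels = rng.integers(0, k, size=n)
feats = centers[labels] + 0.5 * rng.normal(size=(n, d))

# Shallow classifier on top of the frozen features: softmax regression
# trained by plain gradient descent on cross-entropy.
W = np.zeros((d, k))
for _ in range(200):
    logits = feats @ W
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    onehot = np.eye(k)[labels]
    grad = feats.T @ (probs - onehot) / n         # cross-entropy gradient
    W -= 0.1 * grad

acc = (np.argmax(feats @ W, axis=1) == labels).mean()
```

The CNN’s weights are never touched; only the small matrix `W` is learned, which is exactly why this recipe works even when the new task has very little data.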

What to expect in this series

The field of DG has gained a lot of momentum in the past five years, though performance is still far from ideal for practical applications. In this series, we will read papers in topical order, starting from meta-learning-based methods, moving through data augmentation, and ending with self-supervised learning-based methods. There may also be some posts about Transformers in vision problems, since I believe the attention mechanism is also a promising direction for DG research. Deep Learning is such a young field that groundbreaking findings emerge almost on a monthly basis, and domain generalization is no exception. I personally find it a pleasure to read about these creative solutions, so let’s start our journey!

Catalogue of Papers

(Updated on 2021.04.13)

Good-to-know Prerequisites

  • Meta-learning ::
    (MAML) Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, ICML 2017 (Source: https://arxiv.org/abs/1703.03400)
  • Generative models :: AutoEncoder, VAE, and GAN
  • Neural networks can “memorize” answers :: Understanding deep learning requires rethinking generalization, ICML 2017 (Source: https://arxiv.org/abs/1611.03530)

Survey Paper

Meta-Learning on DG

  • MetaReg: Towards Domain Generalization using Meta-Regularization, NeurIPS 2018. (Source: link to NeurIPS)

Data Augmentation on DG

Self-Supervised Learning on DG

Other Self-Supervised Learning methods (not specific to DG)

  • SimCLR: A Simple Framework for Contrastive Learning of Visual Representations
  • Contrastive Predictive Coding (CPC v1 and v2)
  • Contrastive Multiview Coding (CMC)
  • Momentum Contrastive Learning (MoCo v1 and v2)
  • Bootstrap Your Own Latent (BYOL)
  • Simple Siamese (SimSiam)

Kevin Li

Student Researcher @ Berkeley AI Research | Incoming ML Engineer @ Adobe (Firefly)