Domain Discrepancy Measure
A Domain Discrepancy Measure is a statistical metric that quantifies the difference between the data distributions of a source domain and a target domain, enabling effective domain adaptation in machine learning tasks.
- AKA: Domain Divergence Metric, Distribution Discrepancy Measure, Domain Shift Metric, Domain Distance Function.
- Context:
- It can be utilized to assess the degree of difference between source and target domain distributions, guiding the adaptation process in transfer learning.
- It can be implemented using various statistical methods such as:
- Maximum Mean Discrepancy (MMD), for measuring the distance between mean embeddings of distributions in a reproducing kernel Hilbert space.
- Kullback-Leibler (KL) Divergence, to quantify how one probability distribution diverges from a second, expected probability distribution.
- Wasserstein Distance, for computing the optimal transport cost between distributions, capturing differences in their support.
- Paired Hypotheses Discrepancy (PHD), designed for complex models and multi-class classification tasks, providing computational efficiency and theoretical guarantees.
- It can be applied in unsupervised domain adaptation scenarios where labeled data in the target domain is unavailable.
- It can inform the design of domain-invariant feature representations by minimizing the measured discrepancy during model training.
- It can be integrated into adversarial learning frameworks to align feature distributions across domains.
- It can be extended to multi-source domain adaptation by aggregating discrepancies from multiple source domains.
- It can be critical in applications such as cross-domain image classification, sentiment analysis across different languages, and medical diagnosis using data from diverse populations.
- ...
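As an illustrative sketch of the first measure listed above, a biased squared-MMD estimator with an RBF kernel can be written in a few lines of NumPy. This is not a library API; the bandwidth `sigma` is a free parameter here (the median of pairwise distances is a common heuristic in practice), and the sample arrays are toy data:

```python
# Minimal sketch (assumption: biased estimator, RBF kernel, toy data).
import numpy as np

def rbf_kernel(a, b, sigma=1.0):
    # Pairwise squared Euclidean distances between rows of a and b.
    d2 = (np.sum(a**2, axis=1)[:, None]
          + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-d2 / (2.0 * sigma**2))

def mmd2(x, y, sigma=1.0):
    # Biased estimator: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)].
    return float(rbf_kernel(x, x, sigma).mean()
                 + rbf_kernel(y, y, sigma).mean()
                 - 2.0 * rbf_kernel(x, y, sigma).mean())

rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, size=(500, 2))  # source samples (toy)
tgt = rng.normal(2.0, 1.0, size=(500, 2))  # mean-shifted target samples
print(mmd2(src, src))  # 0 for identical samples
print(mmd2(src, tgt))  # positive when the distributions differ
```

In discrepancy-minimizing training, an estimator like this is typically added to the task loss so that feature embeddings of source and target batches are pulled together.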
- Example(s):
- Applying MMD to align feature distributions between synthetic and real-world images in object recognition tasks.
- Utilizing Wasserstein Distance to measure and minimize the discrepancy between source and target domains in unsupervised domain adaptation for digit classification.
- Implementing paired hypotheses discrepancy to evaluate domain differences in complex models handling multi-class classification without labeled target data.
- ...
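The Wasserstein example above has a convenient closed form in one dimension: for equal-size 1-D samples, the optimal transport plan matches sorted values, so the W1 distance is the mean absolute difference of order statistics. A minimal sketch on toy data (the helper name `wasserstein_1d` is ours):

```python
# Minimal sketch (assumption: equal-size 1-D samples, W1 distance).
import numpy as np

def wasserstein_1d(x, y):
    # The optimal 1-D coupling matches order statistics, so W1 is
    # the mean absolute difference between sorted samples.
    return float(np.abs(np.sort(x) - np.sort(y)).mean())

rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, 1000)   # source features (toy)
tgt = rng.normal(1.5, 1.0, 1000)   # target features, shifted mean
print(wasserstein_1d(src, tgt))    # close to the mean shift of 1.5
```

For samples of unequal size, `scipy.stats.wasserstein_distance` generalizes this via empirical CDFs; higher-dimensional settings require solving (or approximating, as in sliced variants) the optimal transport problem.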
- Counter-Example(s):
- Euclidean Distance, which measures point-wise distance and does not account for distributional differences between domains.
- Cosine Similarity, which compares the angle between individual vectors rather than the statistical properties of entire distributions.
- Standard Accuracy Metrics, which assess model performance but do not provide insights into domain discrepancies.
- ...
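Unlike the point-wise counter-examples above, a discrepancy measure compares whole distributions. A minimal discrete KL divergence sketch makes the contrast concrete and also shows its asymmetry, one reason symmetric or smoothed variants are often preferred (the helper name `kl_divergence` and the toy distributions are ours):

```python
# Minimal sketch of discrete KL(P || Q); assumes q > 0 wherever
# p > 0 (real implementations add smoothing to avoid log(p/0)).
import numpy as np

def kl_divergence(p, q):
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0  # terms with p = 0 contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

p = [0.5, 0.5]   # toy source label distribution
q = [0.9, 0.1]   # toy target label distribution
print(kl_divergence(p, q))  # ≈ 0.511
print(kl_divergence(q, p))  # ≈ 0.368 — KL is asymmetric
```

Because KL is asymmetric and unbounded, symmetric alternatives such as Jensen-Shannon divergence, or metrics such as the Wasserstein distance above, are often substituted when a true distance between domains is needed.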
- See: Domain Adaptation, Transfer Learning, Maximum Mean Discrepancy, Wasserstein Distance, Kullback-Leibler Divergence, Paired Hypotheses Discrepancy.
References
2024
- (Huang et al., 2024) ⇒ Jiawei Huang, Yongxin Wang, Xiaowei Xu, & Xiang Li. (2024). "Domain Discrepancy Minimization for Unsupervised Domain Adaptation: A Survey".
- QUOTE: Domain discrepancy minimization is a central theme in unsupervised domain adaptation (UDA), where the goal is to reduce the distribution gap between source domain and target domain. This survey reviews representative discrepancy-based UDA methods, including those based on maximum mean discrepancy, central moment discrepancy, adversarial learning, and optimal transport-based approaches. The paper highlights advances in feature alignment, theoretical guarantees, and challenges such as negative transfer and scalability.
2019
- (Lee et al., 2019) ⇒ Jongyeong Lee, Akihiro Matsuo, & Masashi Sugiyama. (2019). "Domain Discrepancy Measure for Complex Models in Unsupervised Domain Adaptation". arXiv Preprint.
- QUOTE: We propose a novel domain discrepancy measure, called the paired hypotheses discrepancy (PHD), to overcome shortcomings of existing measures in unsupervised domain adaptation. PHD is computationally efficient, applicable to multi-class classification, and effective even for complex models such as deep neural networks. Theoretical analysis shows PHD provides tight generalization error bounds, and experiments demonstrate its practical usefulness.
- (Lee et al., 2019b) ⇒ Chen-Yu Lee, Tanmay Batra, Mohammad Haris Baig, & Daniel Ulbricht. (2019). "Sliced Wasserstein Discrepancy for Unsupervised Domain Adaptation". In: Proceedings of CVPR 2019.
- QUOTE: We propose the sliced Wasserstein discrepancy (SWD) as a geometrically meaningful measure of dissimilarity between the outputs of task-specific classifiers in unsupervised domain adaptation. SWD enables efficient, end-to-end distribution alignment using optimal transport theory, and advances the state-of-the-art in tasks such as image classification, semantic segmentation, and object detection.
- (Zhang et al., 2019) ⇒ Y. Zhang, Y. Liu, & X. Wang. (2019). "Domain Adaptation via Wasserstein Distance in Deep Learning". In: Neural Processing Letters.
- QUOTE: This paper investigates domain adaptation using the Wasserstein distance as a discrepancy measure between source and target domain distributions. The method integrates optimal transport with deep learning to achieve improved feature alignment and classification accuracy in unsupervised settings.