Domain Discrepancy Measure
A Domain Discrepancy Measure is a statistical metric that quantifies the difference between the data distributions of a source domain and a target domain, enabling effective domain adaptation in machine learning tasks.
- AKA: Domain Divergence Metric, Distribution Discrepancy Measure, Domain Shift Metric, Domain Distance Function.
- Context:
- It can be utilized to assess the degree of difference between source and target domain distributions, guiding the adaptation process in transfer learning.
- It can be implemented using various statistical methods such as:
- Maximum Mean Discrepancy (MMD), for measuring the distance between mean embeddings of distributions in a reproducing kernel Hilbert space.
- Kullback-Leibler (KL) Divergence, to quantify how one probability distribution diverges from a second, expected probability distribution.
- Wasserstein Distance, for computing the optimal transport cost between distributions, capturing differences in their support.
- Paired Hypotheses Discrepancy (PHD), designed for complex models and multi-class classification tasks, providing computational efficiency and theoretical guarantees.
- It can be applied in unsupervised domain adaptation scenarios where labeled data in the target domain is unavailable.
- It can inform the design of domain-invariant feature representations by minimizing the measured discrepancy during model training.
- It can be integrated into adversarial learning frameworks to align feature distributions across domains.
- It can be extended to multi-source domain adaptation by aggregating discrepancies from multiple source domains.
- It can be critical in applications such as cross-domain image classification, sentiment analysis across different languages, and medical diagnosis using data from diverse populations.
- ...
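As an illustrative sketch of the first measure listed above, a biased squared-MMD estimator with an RBF kernel can be written in a few lines of NumPy. This is not a library API; the bandwidth `sigma` is a free parameter here (the median of pairwise distances is a common heuristic in practice), and the sample arrays are toy data:

```python
# Minimal sketch (assumption: biased estimator, RBF kernel, toy data).
import numpy as np

def rbf_kernel(a, b, sigma=1.0):
    # Pairwise squared Euclidean distances between rows of a and b.
    d2 = (np.sum(a**2, axis=1)[:, None]
          + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-d2 / (2.0 * sigma**2))

def mmd2(x, y, sigma=1.0):
    # Biased estimator: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)].
    return float(rbf_kernel(x, x, sigma).mean()
                 + rbf_kernel(y, y, sigma).mean()
                 - 2.0 * rbf_kernel(x, y, sigma).mean())

rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, size=(500, 2))  # source samples (toy)
tgt = rng.normal(2.0, 1.0, size=(500, 2))  # mean-shifted target samples
print(mmd2(src, src))  # 0 for identical samples
print(mmd2(src, tgt))  # positive when the distributions differ
```

In discrepancy-minimizing training, an estimator like this is typically added to the task loss so that feature embeddings of source and target batches are pulled together.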
- Example(s):
- Applying MMD to align feature distributions between synthetic and real-world images in object recognition tasks.
- Utilizing Wasserstein Distance to measure and minimize the discrepancy between source and target domains in unsupervised domain adaptation for digit classification.
- Implementing paired hypotheses discrepancy to evaluate domain differences in complex models handling multi-class classification without labeled target data.
- ...
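The Wasserstein example above has a convenient closed form in one dimension: for equal-size 1-D samples, the optimal transport plan matches sorted values, so the W1 distance is the mean absolute difference of order statistics. A minimal sketch on toy data (the helper name `wasserstein_1d` is ours):

```python
# Minimal sketch (assumption: equal-size 1-D samples, W1 distance).
import numpy as np

def wasserstein_1d(x, y):
    # The optimal 1-D coupling matches order statistics, so W1 is
    # the mean absolute difference between sorted samples.
    return float(np.abs(np.sort(x) - np.sort(y)).mean())

rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, 1000)   # source features (toy)
tgt = rng.normal(1.5, 1.0, 1000)   # target features, shifted mean
print(wasserstein_1d(src, tgt))    # close to the mean shift of 1.5
```

For samples of unequal size, `scipy.stats.wasserstein_distance` generalizes this via empirical CDFs; higher-dimensional settings require solving (or approximating, as in sliced variants) the optimal transport problem.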
- Counter-Example(s):
- Euclidean Distance, which measures point-wise distance and does not account for distributional differences between domains.
- Cosine Similarity, which compares the angle between individual vectors rather than the statistical properties of entire distributions.
- Standard Accuracy Metrics, which assess model performance but do not provide insights into domain discrepancies.
- ...
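Unlike the point-wise counter-examples above, a discrepancy measure compares whole distributions. A minimal discrete KL divergence sketch makes the contrast concrete and also shows its asymmetry, one reason symmetric or smoothed variants are often preferred (the helper name `kl_divergence` and the toy distributions are ours):

```python
# Minimal sketch of discrete KL(P || Q); assumes q > 0 wherever
# p > 0 (real implementations add smoothing to avoid log(p/0)).
import numpy as np

def kl_divergence(p, q):
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0  # terms with p = 0 contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

p = [0.5, 0.5]   # toy source label distribution
q = [0.9, 0.1]   # toy target label distribution
print(kl_divergence(p, q))  # ≈ 0.511
print(kl_divergence(q, p))  # ≈ 0.368 — KL is asymmetric
```

Because KL is asymmetric and unbounded, symmetric alternatives such as Jensen-Shannon divergence, or metrics such as the Wasserstein distance above, are often substituted when a true distance between domains is needed.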
- See: Domain Adaptation, Transfer Learning, Maximum Mean Discrepancy, Wasserstein Distance, Kullback-Leibler Divergence, Paired Hypotheses Discrepancy.
References
2024
- (Huang et al., 2024) ⇒ Jiawei Huang, Yongxin Wang, Xiaowei Xu, & Xiang Li. (2024). "Domain Discrepancy Minimization for Unsupervised Domain Adaptation: A Survey".
- QUOTE: Domain discrepancy minimization is a central theme in unsupervised domain adaptation (UDA), where the goal is to reduce the distribution gap between source domain and target domain. This survey reviews representative discrepancy-based UDA methods, including those based on maximum mean discrepancy, central moment discrepancy, adversarial learning, and optimal transport-based approaches. The paper highlights advances in feature alignment, theoretical guarantees, and challenges such as negative transfer and scalability.
2019
- (Lee et al., 2019) ⇒ Jongyeong Lee, Akihiro Matsuo, & Masashi Sugiyama. (2019). "Domain Discrepancy Measure for Complex Models in Unsupervised Domain Adaptation". arXiv Preprint.
- QUOTE: We propose a novel domain discrepancy measure, called the paired hypotheses discrepancy (PHD), to overcome shortcomings of existing measures in unsupervised domain adaptation. PHD is computationally efficient, applicable to multi-class classification, and effective even for complex models such as deep neural networks. Theoretical analysis shows PHD provides tight generalization error bounds, and experiments demonstrate its practical usefulness.
- (Lee et al., 2019b) ⇒ Chen-Yu Lee, Tanmay Batra, Mohammad Haris Baig, & Daniel Ulbricht. (2019). "Sliced Wasserstein Discrepancy for Unsupervised Domain Adaptation". In: Proceedings of CVPR 2019.
- QUOTE: We propose the sliced Wasserstein discrepancy (SWD) as a geometrically meaningful measure of dissimilarity between the outputs of task-specific classifiers in unsupervised domain adaptation. SWD enables efficient, end-to-end distribution alignment using optimal transport theory, and advances the state-of-the-art in tasks such as image classification, semantic segmentation, and object detection.
- (Zhang et al., 2019) ⇒ Y. Zhang, Y. Liu, & X. Wang. (2019). "Domain Adaptation via Wasserstein Distance in Deep Learning". In: Neural Processing Letters.
- QUOTE: This paper investigates domain adaptation using the Wasserstein distance as a discrepancy measure between source and target domain distributions. The method integrates optimal transport with deep learning to achieve improved feature alignment and classification accuracy in unsupervised settings.