Contrastive Learning Technique for Legal Text
A Contrastive Learning Technique for Legal Text is a self-supervised learning technique that trains legal embedding models to produce similar representations for related legal texts and dissimilar representations for unrelated legal texts.
- AKA: Legal Contrastive Learning Technique, Legal Text Similarity Learning Technique, Legal Representation Learning Technique.
- Context:
- It can typically employ Positive Pairs with semantically similar legal documents.
- It can typically utilize Negative Pairs with semantically different legal documents.
- It can typically optimize an InfoNCE Loss as its contrastive objective.
- It can often apply Hard Negative Mining with challenging negative examples.
- It can often use In-Batch Negatives for computational efficiency.
- It can often implement Triplet Loss for margin-based optimization.
- It can often integrate Data Augmentation for positive pair generation.
- It can range from being a Supervised Contrastive Learning for Legal Text to being a Self-Supervised Contrastive Learning for Legal Text, depending on its label requirement.
- It can range from being a Pairwise Contrastive Learning for Legal Text to being a Multiple-Instance Contrastive Learning for Legal Text, depending on its comparison structure.
- It can range from being a Single-Modal Contrastive Learning for Legal Text to being a Multi-Modal Contrastive Learning for Legal Text, depending on its data modality.
- It can range from being an Instance-Level Contrastive Learning for Legal Text to being a Sentence-Level Contrastive Learning for Legal Text, depending on its granularity level.
- ...
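The in-batch InfoNCE objective mentioned above can be sketched in a few lines of NumPy. This is a minimal illustration, not code from any particular legal NLP library; all names (`info_nce_loss`, the toy embeddings) are hypothetical. Each query's paired passage is the positive, and the other passages in the same batch serve as the negatives:

```python
import numpy as np

def info_nce_loss(queries, passages, temperature=0.05):
    """In-batch InfoNCE: queries[i] and passages[i] form the positive
    pair; every other passage in the batch acts as a negative."""
    # L2-normalize so dot products become cosine similarities.
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    p = passages / np.linalg.norm(passages, axis=1, keepdims=True)
    logits = q @ p.T / temperature               # (batch, batch) similarity matrix
    # Softmax cross-entropy with the diagonal (i, i) as the correct class.
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

rng = np.random.default_rng(0)
queries = rng.normal(size=(4, 8))
# Positives: noisy copies of the queries (stand-ins for related legal texts).
positives = queries + 0.01 * rng.normal(size=(4, 8))
random_passages = rng.normal(size=(4, 8))
# Aligned pairs should give a lower loss than unrelated passages.
print(info_nce_loss(queries, positives) < info_nce_loss(queries, random_passages))
```

Because every other passage in the batch is reused as a negative, no extra negative sampling pass is needed, which is the computational-efficiency argument for in-batch negatives.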
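The margin-based Triplet Loss noted above can likewise be sketched directly. Again a toy illustration with hypothetical names: the anchor could be a statute embedding, the positive a paraphrase, and the negative an unrelated clause:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Margin-based triplet objective: require the anchor to sit at least
    `margin` closer (Euclidean distance) to the positive than to the negative."""
    d_pos = np.linalg.norm(anchor - positive, axis=1)
    d_neg = np.linalg.norm(anchor - negative, axis=1)
    return float(np.mean(np.maximum(0.0, d_pos - d_neg + margin)))

# Toy embeddings: a clause, its paraphrase, and an unrelated clause.
anchor   = np.array([[0.0, 0.0]])
positive = np.array([[0.1, 0.0]])   # close to the anchor
negative = np.array([[3.0, 0.0]])   # far from the anchor
print(triplet_loss(anchor, positive, negative))  # → 0.0 (margin satisfied)
```

Unlike InfoNCE, the triplet objective compares one negative at a time, which is why hard negative mining matters: randomly chosen negatives quickly satisfy the margin and contribute zero gradient.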
- Examples:
- Legal Dense Retrieval Trainings, such as:
- Legal DPR Training, using question-passage pairs for legal retrieval.
- Legal SimCSE Training, employing dropout augmentation for legal sentence embedding.
- Legal Similarity Models, such as:
- ...
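The SimCSE-style dropout augmentation used in the training example above can be illustrated with a toy NumPy sketch. In the real method the dropout noise occurs inside the encoder during two forward passes; here it is approximated (an assumption for illustration only) by applying two random dropout masks to a fixed sentence embedding:

```python
import numpy as np

rng = np.random.default_rng(1)

def dropout_view(embedding, rate=0.1):
    """Toy SimCSE-style augmentation: a random dropout mask over the
    embedding yields a slightly perturbed 'view' of the same text."""
    mask = rng.random(embedding.shape) >= rate
    return embedding * mask / (1.0 - rate)   # inverted-dropout scaling

sentence_vec = rng.normal(size=16)
view_a = dropout_view(sentence_vec)   # two stochastic passes over the
view_b = dropout_view(sentence_vec)   # SAME text form a positive pair
other_vec = rng.normal(size=16)       # any other text acts as a negative

cos = lambda u, v: u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
# The two views stay far more similar to each other than to an unrelated text.
print(cos(view_a, view_b) > cos(view_a, other_vec))
```

This shows why dropout works as a positive-pair generator: the two views differ only by noise, so pulling them together teaches the encoder invariance to that noise while in-batch negatives keep unrelated texts apart.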
- Counter-Examples:
- Supervised Legal Classification, which uses labeled categories rather than similarity.
- Legal Language Modeling, which predicts next tokens rather than learning similarity.
- Legal Clustering, which groups documents rather than learning representations.
- See: Bi-Encoder Model, Dense Retrieval Method, Hard Negative Mining, InfoNCE Loss, Triplet Loss, Legal Text Embedding, Self-Supervised Learning, Legal Language Embedding Model, Legal Machine Learning Method.