Labeled Contract Clause Dataset

From GM-RKB

A Labeled Contract Clause Dataset is an annotated legal document training dataset that contains contract clauses marked with contract smell labels for model development.

AKA: Annotated Contract Clause Collection, Contract Quality Training Dataset, Labeled Legal Clause Corpus.
Context:
- It can typically include Labeled Contract Clause Annotations for multiple quality issue types.
- It can typically contain Labeled Contract Clause Metadata such as contract type and jurisdiction.
- It can typically follow Labeled Contract Clause Annotation Guidelines for consistency.
- It can typically support Labeled Contract Clause Stratification for balanced training.
- It can typically provide Labeled Contract Clause Splits for train/validation/test sets.
- ...
- It can often derive from Labeled Contract Clause Sources such as CUAD or proprietary collections.
- It can often undergo Labeled Contract Clause Quality Control through inter-annotator agreement.
- It can often enable Labeled Contract Clause Benchmarking for model comparison.
- It can often incorporate Labeled Contract Clause Version Control for reproducibility.
- ...
- It can range from being a Small Labeled Contract Clause Dataset to being a Large-Scale Labeled Contract Clause Dataset, depending on its labeled contract clause dataset size.
- It can range from being a Single-Label Contract Clause Dataset to being a Multi-Label Contract Clause Dataset, depending on its labeled contract clause dataset annotation schema.
- It can range from being a Coarse-Grained Labeled Contract Clause Dataset to being a Fine-Grained Labeled Contract Clause Dataset, depending on its labeled contract clause dataset granularity.
- It can range from being a Domain-Specific Labeled Contract Clause Dataset to being a Cross-Domain Labeled Contract Clause Dataset, depending on its labeled contract clause dataset coverage.
- ...
- It can facilitate Labeled Contract Clause Model Training for detection systems.
- It can enable Labeled Contract Clause Evaluation through test splits.
- It can support Labeled Contract Clause Research in legal NLP.
- It can provide Labeled Contract Clause Baselines for performance comparison.
- ...
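
The metadata, stratification, and train/validation/test split items above can be sketched as follows. This is a minimal illustration rather than a prescribed procedure: the `LabeledClause` record, its field names, the label names, and the `stratified_split` helper are all hypothetical, and stratifying on only the first label is a simplifying assumption for multi-label data.

```python
import random
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledClause:
    """One dataset record (all field names are illustrative assumptions)."""
    clause_id: str
    text: str
    labels: tuple        # one or more quality-issue labels (multi-label)
    contract_type: str   # metadata, e.g. "employment"
    jurisdiction: str    # metadata, e.g. "US-NY"


def stratified_split(records, ratios=(0.8, 0.1, 0.1), seed=0):
    """Split records into train/validation/test, stratified on the first label.

    Using only the first label is a simplification; full multi-label
    stratification needs a more involved procedure.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    by_label = defaultdict(list)
    for rec in records:
        by_label[rec.labels[0]].append(rec)

    train, val, test = [], [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_train = int(len(group) * ratios[0])
        n_val = int(len(group) * ratios[1])
        train.extend(group[:n_train])
        val.extend(group[n_train:n_train + n_val])
        test.extend(group[n_train + n_val:])  # remainder goes to test
    return train, val, test


# Toy data: 10 clauses per (hypothetical) label.
records = [
    LabeledClause(f"c{i}", f"clause text {i}",
                  ("ambiguity",) if i < 10 else ("missing_definition",),
                  "employment", "US-NY")
    for i in range(20)
]
train, val, test = stratified_split(records)
print(len(train), len(val), len(test))
```

Splitting within each label group keeps the label proportions roughly equal across the three sets, which is the point of stratification.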
Example(s):
- Public Labeled Contract Clause Datasets, such as:
  - CUAD-Based Contract Smell Dataset derived from CUAD annotations.
  - LexGLUE Contract Clause Dataset for legal language understanding.
  - ContractNLI Labeled Dataset for contract inference tasks.
  - MAUD Contract Clause Dataset for merger agreement understanding.
- Annotation Method Contract Clause Datasets, such as:
  - Expert-Labeled Contract Clause Dataset with lawyer annotations.
  - Crowdsourced Contract Clause Dataset using multiple annotators.
  - LLM-Generated Contract Clause Dataset via automated labeling.
  - Hybrid-Annotated Contract Clause Dataset combining methods.
- Domain-Specific Contract Clause Datasets, such as:
  - Employment Contract Clause Dataset for HR agreements.
  - Real Estate Contract Clause Dataset for property transactions.
  - Technology Contract Clause Dataset for software licenses.
  - Financial Contract Clause Dataset for banking agreements.
- ...
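
For datasets labeled by multiple annotators, quality control through inter-annotator agreement can be illustrated with Cohen's kappa for two annotators. The sketch below is a plain-Python illustration under stated assumptions: the `cohens_kappa` helper and the label names are hypothetical, and each clause is assumed to carry exactly one label per annotator.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same clauses."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of clauses with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[lab] * freq_b[lab] for lab in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels from two annotators on four clauses.
annotator_a = ["ambiguous", "ambiguous", "clean", "clean"]
annotator_b = ["ambiguous", "clean", "clean", "clean"]
kappa = cohens_kappa(annotator_a, annotator_b)
print(kappa)  # 0.5
```

Here the annotators agree on 3 of 4 clauses (0.75 observed) against 0.5 expected by chance, giving a kappa of 0.5.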
Counter-Example(s):
- Unlabeled Contract Corpus, which lacks labeled contract clause annotations.
- General Legal Dataset, which does not focus on contract clauses.
- Code Quality Dataset, which targets software rather than contracts.
See: Training Dataset, Legal Document Dataset, Annotated Text Dataset, Contract Clause, Machine Learning Dataset, NLP Benchmark Dataset, Document Annotation Task.
