LEDGAR Dataset: Difference between revisions

Revision as of 17:19, 25 April 2024

Context:
- It can be created by extracting contract provisions from documents filed with the U.S. Securities and Exchange Commission (SEC) and made available through the EDGAR database.
- It can categorize contract provisions into various legal themes or topics.
- It can be used primarily for Natural Language Processing tasks, especially in the domain of legal technology and contract analysis.
- It can aid in the automated understanding and classification of legal documents.
- It can serve as a valuable resource for training and evaluating machine learning models in legal tech applications.
- It can simplify the process of contract analysis, which traditionally requires substantial manual effort and legal expertise.
- It can be integrated into legal tech software for purposes like contract review, risk assessment, and compliance checks.
- ...
Example(s):
- LEDGAR, v202x.
- ...
Counter-Example(s):
- A general-purpose language dataset not specific to legal documents.
- Raw financial datasets from EDGAR without specific labeling for NLP tasks.
- Datasets focused on other forms of legal documents, such as court opinions or legislation, rather than contracts.
See: Contract Analysis, Natural Language Processing, Legal Technology, Machine Learning in Law, EDGAR Database, LexGLUE.
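
The provision-classification use described in the Context list can be illustrated with a minimal sketch. It assumes a tiny hand-written training set (the provision texts below are hypothetical and not taken from LEDGAR, and the label names are illustrative) and uses a simple Naive Bayes classifier with add-one smoothing built from the Python standard library. It is not the method of Tuggener et al. (2020), only an example of mapping a provision text to a topic label.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy provisions written for illustration; NOT taken from LEDGAR.
TRAIN = [
    ("This Agreement shall be governed by the laws of the State of New York.", "Governing Laws"),
    ("The laws of the State of Delaware govern this Agreement.", "Governing Laws"),
    ("All notices shall be in writing and delivered to the address below.", "Notices"),
    ("Any notice under this Agreement must be delivered in writing.", "Notices"),
    ("This Agreement may be amended only by a written instrument signed by both parties.", "Amendments"),
    ("No amendment of this Agreement is effective unless signed by the parties.", "Amendments"),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Count labels and per-label word frequencies over (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest smoothed log-probability."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # tokens unseen in training are skipped
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
print(predict(model, "Notices must be sent in writing to each party."))
print(predict(model, "The laws of California govern this contract."))
```

A model trained on the actual corpus would replace TRAIN with labeled LEDGAR provisions and would typically use a stronger classifier; the structure (text in, provision label out) stays the same.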

(Jayakumar et al., 2023) ⇒ Thanmay Jayakumar, Fauzan Farooqui, and Luqman Farooqui. (2023). "Large Language Models Are Legal But They Are Not: Making the Case for a Powerful LegalLLM." In: arXiv preprint arXiv:2311.08890. DOI:10.48550/arXiv.2311.08890

(Tuggener et al., 2020) ⇒ Don Tuggener, Pius von Däniken, Thomas Peetz, and Mark Cieliebak. (2020). "LEDGAR: A Large-Scale Multi-label Corpus for Text Classification of Legal Provisions in Contracts." In: Proceedings of the Twelfth Language Resources and Evaluation Conference.

@@ Line 16: / Line 16: @@
 ** Raw financial datasets from EDGAR without specific labeling for NLP tasks.
 ** Datasets focused on other forms of legal documents like court opinions or legislations, rather than contracts.
-* <B>See:</B> [[Contract Analysis]], [[Natural Language Processing]], [[Legal Technology]], [[Machine Learning in Law]], [[EDGAR Database]].
+* <B>See:</B> [[Contract Analysis]], [[Natural Language Processing]], [[Legal Technology]], [[Machine Learning in Law]], [[EDGAR Database]], [[LexGLUE]].
 ----