Synthetic Reference Dataset
A Synthetic Reference Dataset is a reference dataset that contains artificially generated references created through automated methods or rule-based systems for evaluation benchmarking.
- AKA: Artificial Reference Dataset, Generated Reference Dataset, Automated Reference Collection, Synthetic Benchmark Dataset.
- Context:
- It can typically enable Large-Scale Generation, overcoming annotation bottlenecks.
- It can typically provide Controlled Variation for systematic testing.
- It can often reduce Dataset Creation Costs compared to human annotation.
- It can often support Edge Case Testing through targeted generation.
- It can facilitate Privacy Preservation by avoiding real data exposure.
- It can enable Multilingual Coverage through translation systems.
- It can incorporate Quality Filters to ensure reference validity.
- It can integrate with Data Augmentation to expand training sets.
- It can range from being a Template-Based Synthetic Dataset to being a Model-Generated Synthetic Dataset, depending on its generation method.
- It can range from being a High-Fidelity Synthetic Dataset to being a Low-Fidelity Synthetic Dataset, depending on its realism level.
- It can range from being a Domain-Specific Synthetic Dataset to being a General Synthetic Dataset, depending on its content scope.
- It can range from being a Static Synthetic Dataset to being a Dynamic Synthetic Dataset, depending on its generation timing.
- ...
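The context properties above can be illustrated with a minimal sketch of a Template-Based Synthetic Dataset generator that applies a Quality Filter. The templates, slot values, and length-based filter below are illustrative assumptions, not a standard API or any particular benchmark's method.

```python
import random

# Illustrative templates and slot values (assumptions for this sketch).
TEMPLATES = [
    "The {product} costs {price} dollars.",
    "{product} is available for {price} dollars.",
]
SLOT_VALUES = {
    "product": ["laptop", "phone", "camera"],
    "price": ["499", "999", "1299"],
}

def generate_reference(rng: random.Random) -> str:
    """Fill a randomly chosen template with random slot values."""
    template = rng.choice(TEMPLATES)
    return template.format(
        product=rng.choice(SLOT_VALUES["product"]),
        price=rng.choice(SLOT_VALUES["price"]),
    )

def passes_quality_filter(reference: str, min_words: int = 4) -> bool:
    """A toy quality filter: reject degenerate (too-short) references."""
    return len(reference.split()) >= min_words

def build_dataset(n: int, seed: int = 0) -> list:
    """Generate n references, keeping only those that pass the filter.

    Seeding the generator gives the Controlled Variation and
    reproducibility that make synthetic references useful for
    systematic testing.
    """
    rng = random.Random(seed)
    dataset = []
    while len(dataset) < n:
        ref = generate_reference(rng)
        if passes_quality_filter(ref):
            dataset.append(ref)
    return dataset

if __name__ == "__main__":
    for ref in build_dataset(3):
        print(ref)
```

Because generation is rule-based and seeded, the same dataset can be regenerated at arbitrary scale without human annotation, which is the cost and scalability advantage described above.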
- Examples:
- Generation Methods, such as: Template-Based Synthetic Datasets and Model-Generated Synthetic Datasets.
- Task-Specific Synthetic Datasets, such as: Domain-Specific Synthetic Datasets and Multilingual Synthetic Datasets.
- Augmentation Datasets, such as: synthetic datasets produced through Data Augmentation to expand training sets.
- ...
- Counter-Examples:
- NLG Gold Reference Dataset, which uses human curation.
- Crowd-Sourced Dataset, which uses human annotation.
- Natural Corpus, which contains authentic text.
- See: Reference Dataset, NLG Gold Reference Dataset, Evaluation Reference Dataset, Data Generation Method, Synthetic Data, Automated Annotation, Data Augmentation.