Synthetic Dataset
A Synthetic Dataset is a machine-generated dataset composed of synthetic data records.
- AKA: Artificial Data.
- Context:
- It can be produced by a Synthetic Dataset Generator (solving a synthetic data generation task).
- It can represent some Random Experiment (one that was designed to have certain characteristics).
- ...
- Example(s):
- Counter-Example(s):
- any Real-World Dataset.
- See: Synthetic, Modified Dataset.
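As an illustration of the context items above, the following is a minimal sketch of a synthetic dataset generator whose underlying random experiment is designed to have known characteristics (here, per-class Gaussian means). The function name `generate_synthetic_dataset`, its parameters, and the class labels are hypothetical and chosen only for this example; they are not taken from any of the referenced tools.

```python
import random
import statistics


def generate_synthetic_dataset(n_records, class_means, stddev=1.0, seed=0):
    """Return a list of (feature, label) synthetic data records.

    Hypothetical helper: each record's label is picked uniformly at random,
    and its feature is drawn from a Gaussian centered on that label's mean.
    """
    rng = random.Random(seed)          # seeded so the dataset is reproducible
    labels = sorted(class_means)       # fixed label order for determinism
    records = []
    for _ in range(n_records):
        label = rng.choice(labels)
        feature = rng.gauss(class_means[label], stddev)
        records.append((feature, label))
    return records


# Design the experiment: class "a" is centered at 0.0, class "b" at 5.0.
dataset = generate_synthetic_dataset(1000, {"a": 0.0, "b": 5.0})

# Because the characteristics were designed in, they can be checked afterwards.
mean_a = statistics.mean([x for x, y in dataset if y == "a"])
mean_b = statistics.mean([x for x, y in dataset if y == "b"])
print(len(dataset), round(mean_a, 2), round(mean_b, 2))
```

Since the generating process is known, the sample means should fall close to the designed values, which is what makes such a dataset useful for checking how a method behaves.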
References
2010
- (Adä et al., 2010) ⇒ Iris Adä, and Michael R. Berthold. (2010). “The New Iris Data: Modular Data Generators.” In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2010). doi:10.1145/1835804.1835858
2009
- (Gentle, 2009) ⇒ James E. Gentle. (2009). “Computational Statistics.” Springer. ISBN:978-0-387-98143-7
- QUOTE: Many exercises require the student to generate artificial data. While such datasets may lack any apparent intrinsic interest, I believe that they are often the best for learning how a statistical method works. One of my firm beliefs is
If I understand something, I can simulate it.
1999
- (Melli, 1999b) ⇒ Gabor Melli. (1999). “The datgen Dataset Generator.” Version 3.1. http://www.datasetgenerator.com