A Data Cleaning Task requires the Detection and Removal of Erroneous Data Values and Data Records.
References
- The process of manipulating or cleaning data into a standard format.
- The process of compiling multiple data records, retaining the desired data, and removing unwanted data.
- The process of ensuring that all values in a dataset are consistent and correctly recorded by removing redundancies and inconsistencies in data.
- Data Cleansing: Data is cleaned up to make sure it is correct, accurate, and complete. Cleanup can involve detecting and correcting errors, supplying missing elements and values, enforcing data standards, validating entries, and purging duplicate entries.
- Omissions remedied, errors corrected, and quality examined and assured.
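The definitions above name a few recurring operations: putting values into a standard format, detecting erroneous values, supplying missing values, and purging duplicates. A minimal sketch of those steps follows, using only the Python standard library. The records, the field names, the 0 to 120 age range, and the median fill are all illustrative assumptions, not something the cited glossaries prescribe.

```python
from statistics import median

# Hypothetical records; None marks a missing value.
records = [
    {"name": " Alice ", "city": "edmonton", "age": 34},
    {"name": "Bob", "city": "Calgary", "age": None},
    {"name": "alice", "city": "Edmonton", "age": 34},
    {"name": "Carol", "city": "CALGARY", "age": 290},
]


def standardize(rec):
    """Put text fields into one standard format (trimmed, title case)."""
    return {
        "name": rec["name"].strip().title(),
        "city": rec["city"].strip().title(),
        "age": rec["age"],
    }


def is_valid_age(age):
    """Validation rule (an assumption): ages outside 0..120 are errors."""
    return age is not None and 0 <= age <= 120


def clean(rows):
    rows = [standardize(r) for r in rows]

    # Detect erroneous values and treat them as missing.
    for r in rows:
        if not is_valid_age(r["age"]):
            r["age"] = None

    # Supply missing values with the median of the valid ages, if any exist.
    known = [r["age"] for r in rows if r["age"] is not None]
    if known:
        fill = median(known)
        for r in rows:
            if r["age"] is None:
                r["age"] = fill

    # Purge duplicate entries, keeping the first occurrence.
    seen = set()
    unique = []
    for r in rows:
        key = (r["name"], r["city"], r["age"])
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


cleaned = clean(records)
for row in cleaned:
    print(row)
```

Standardizing before deduplicating matters here: " Alice " and "alice" only collapse into one record once both have been normalized. Whether an out-of-range value should be imputed, as in this sketch, or the whole record dropped depends on the application.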
1999
- (Zaiane, 1999) => Osmar Zaiane. (1999). "Glossary of Data Mining Terms." University of Alberta, Computing Science CMPUT-690: Principles of Knowledge Discovery in Databases.
- Quote: Data Cleansing: Also Data Cleaning. The process of ensuring that all values in a dataset are consistent and correctly recorded by removing redundancies and inconsistencies in data.
1998
- (Kohavi & Provost, 1998) => Ron Kohavi, and Foster Provost. (1998). "Glossary of Terms." In: Machine Learning 30(2-3).
- Quote: Data cleaning/cleansing: The process of improving the quality of the data by modifying its form or content, for example by removing or correcting data values that are incorrect. This step usually precedes the machine learning step, although the knowledge discovery process may indicate that further cleaning is desired and may suggest ways to improve the quality of the data. For example, learning that the pattern Wife implies Female from the census sample at UCI has a few exceptions may indicate a quality problem.
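The "Wife implies Female" example can be read as a rule check: find the records that satisfy the antecedent but not the consequent, and treat them as candidates for review. The sketch below illustrates that idea on a handful of made-up rows. The field names `relationship` and `sex` and the helper `rule_violations` are hypothetical choices for illustration, and the rows are not taken from the actual census sample.

```python
def rule_violations(rows, if_field, if_value, then_field, then_value):
    """Return rows that match the antecedent but not the consequent."""
    return [
        r for r in rows
        if r.get(if_field) == if_value and r.get(then_field) != then_value
    ]


# Made-up rows for illustration only.
rows = [
    {"id": 1, "relationship": "Wife", "sex": "Female"},
    {"id": 2, "relationship": "Husband", "sex": "Male"},
    {"id": 3, "relationship": "Wife", "sex": "Male"},
    {"id": 4, "relationship": "Wife", "sex": "Female"},
]

suspects = rule_violations(rows, "relationship", "Wife", "sex", "Female")

# Fraction of antecedent-matching rows that also satisfy the consequent.
matching = sum(1 for r in rows if r["relationship"] == "Wife")
confidence = (matching - len(suspects)) / matching

print("rows to review:", [r["id"] for r in suspects])
print("rule confidence:", round(confidence, 3))
```

A rule that holds for nearly all matching rows but not all of them is the situation the quote describes: the few exceptions are more likely to be recording errors than genuine cases, so they are flagged for inspection rather than deleted automatically.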