ML Evaluation Error
An ML Evaluation Error is a model evaluation error that occurs during machine learning model evaluation and leads to misleading performance metrics.
- AKA: Machine Learning Evaluation Mistake, ML Assessment Error, Model Evaluation Flaw.
- Context:
- It can typically cause Inflated Performance Metrics through improper evaluation procedures and biased assessment.
- It can typically result from Data Leakage between training sets and test sets.
- It can typically arise from Improper Cross-Validation using an incorrect splitting strategy.
- It can typically occur through Selection Bias in dataset construction and sample selection.
- It can typically manifest as Overfitting to the Test Set via repeated evaluation and hyperparameter tuning (a sketch of this effect follows this list).
- ...
- It can often involve Temporal Leakage when future information influences past predictions.
- It can often include Label Leakage where target variables contaminate features.
- It can often feature Distribution Shift between training distribution and deployment distribution.
- It can often exhibit Cherry-Picking Results through selective reporting and p-hacking.
- ...
- It can range from being a Subtle ML Evaluation Error to being an Obvious ML Evaluation Error, depending on its detection difficulty.
- It can range from being a Data-Related ML Evaluation Error to being a Methodology-Related ML Evaluation Error, depending on its error source.
- It can range from being a Minor ML Evaluation Error to being a Critical ML Evaluation Error, depending on its impact severity.
- It can range from being a Common ML Evaluation Error to being a Rare ML Evaluation Error, depending on its occurrence frequency.
- It can range from being a Preventable ML Evaluation Error to being an Inherent ML Evaluation Error, depending on its avoidability.
- ...
- It can undermine Model Reliability in production environments.
- It can mislead Model Selection Decisions through false performance indicators.
- It can affect Research Reproducibility via unreliable results.
- It can impact Business Decisions based on flawed evaluation.
- It can necessitate Evaluation Protocol Revision for proper assessment.
- ...
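A minimal sketch of that test-set overfitting effect (Python with scikit-learn; the noise-only dataset and the hyperparameter grid are hypothetical, chosen only to make the effect visible). Since the labels carry no signal, every model truly generalizes at 50% accuracy, yet taking the maximum test score over many candidates reports a number above chance:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical noise-only data: labels are independent of the features,
# so the true generalization accuracy of every model is 0.5.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
y = rng.integers(0, 2, size=200)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scoring many candidates on the same test set and keeping the best one
# overfits the evaluation itself: the maximum of many noisy scores drifts
# above chance even though every candidate is guessing.
best_test_score = max(
    LogisticRegression(C=c, max_iter=1000)
    .fit(X_train, y_train)
    .score(X_test, y_test)
    for c in np.logspace(-3, 3, 50)
)
print(best_test_score)  # typically above 0.5 on pure-noise labels
```

The usual remedy is to reserve the test set for a single final measurement and perform all model selection on a separate validation split or via cross-validation.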
- Example(s):
- Data Leakage Errors, such as:
- Train-Test Leakage where training data appears in test set.
- Feature Leakage where future information contaminates input features.
- Target Leakage where label information influences prediction variables.
- Preprocessing Leakage where data normalization uses test set statistics (illustrated in the sketch after this list).
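A minimal sketch of the preprocessing variant (Python with scikit-learn; the synthetic data is hypothetical): fitting a scaler on the full dataset before splitting lets test-set statistics shape the features the model is trained on, while the correct order fits all preprocessing on the training split alone:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(loc=3.0, scale=2.0, size=(1000, 5))
y = (X[:, 0] > 3.0).astype(int)

# Leaky: the scaler's mean and std are computed with test rows included,
# so information about the test set reaches the training features.
X_leaky = StandardScaler().fit_transform(X)
X_tr_leaky, X_te_leaky, _, _ = train_test_split(X_leaky, y, random_state=0)

# Correct: split first, fit the scaler on the training split only,
# then apply the frozen transform to the test split.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_tr)
X_tr_scaled = scaler.transform(X_tr)
X_te_scaled = scaler.transform(X_te)
```

Wrapping the scaler and estimator in a scikit-learn Pipeline and cross-validating the whole pipeline enforces this ordering automatically.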
- Temporal Pitfalls, such as:
- Temporal Data Leakage using future data for past prediction.
- Time Series Split Error ignoring temporal ordering in cross-validation (contrasted in the sketch after this list).
- Lookahead Bias incorporating future knowledge in historical analysis.
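A minimal sketch contrasting a shuffled split with a temporal split (Python with scikit-learn; the autocorrelated series and lag features are hypothetical). Shuffled folds let the model train on observations that come after the ones it is tested on, the lookahead described above, whereas TimeSeriesSplit keeps every training fold strictly earlier than its test fold:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, TimeSeriesSplit, cross_val_score

# Hypothetical autocorrelated series with three lagged values as features.
rng = np.random.default_rng(0)
n = 500
y = np.sin(np.arange(n) / 20) + rng.normal(scale=0.1, size=n)
X = np.column_stack([np.roll(y, k) for k in range(1, 4)])[3:]  # y[t-1..t-3]
y = y[3:]

model = Ridge()
# Leaky: shuffled folds interleave past and future observations.
shuffled = cross_val_score(model, X, y, cv=KFold(5, shuffle=True, random_state=0))
# Honest: each fold trains only on data earlier than its test window.
ordered = cross_val_score(model, X, y, cv=TimeSeriesSplit(5))
print(shuffled.mean(), ordered.mean())  # the shuffled estimate is typically inflated
```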
- Sampling Pitfalls, such as:
- Sample Selection Bias with non-representative samples.
- Survivorship Bias excluding failed instances from evaluation.
- Class Imbalance Ignorance using the accuracy metric on imbalanced datasets (demonstrated in the sketch after this list).
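A minimal sketch of that accuracy trap (Python with scikit-learn; the 99:1 label ratio is hypothetical): a baseline that always predicts the majority class reports roughly 99% accuracy while finding no positives at all, which class-aware metrics expose immediately:

```python
import numpy as np
from sklearn.metrics import accuracy_score, balanced_accuracy_score, recall_score

# Hypothetical 99:1 imbalanced binary labels.
rng = np.random.default_rng(0)
y_true = (rng.random(10_000) < 0.01).astype(int)

# "Model" that always predicts the majority class.
y_pred = np.zeros_like(y_true)

print(accuracy_score(y_true, y_pred))           # ~0.99, looks excellent
print(recall_score(y_true, y_pred))             # 0.0, finds no positives
print(balanced_accuracy_score(y_true, y_pred))  # 0.5, chance level
```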
- Validation Pitfalls, such as:
- ...
- Counter-Example(s):
- Proper Evaluation Practices, which follow rigorous methodology and avoid evaluation errors.
- Statistical Assumption Violations, which are theoretical issues rather than evaluation mistakes.
- Model Architecture Flaws, which are design problems rather than evaluation errors.
- See: Machine Learning Task, Machine Learning Evaluation, Model Validation, Cross-Validation, Machine Learning Data Leakage, Overfitting, Selection Bias.