ML Evaluation Error
An ML Evaluation Error is a model evaluation error that occurs during machine learning model evaluation and leads to misleading performance metrics.
- AKA: Machine Learning Evaluation Mistake, ML Assessment Error, Model Evaluation Flaw.
- Context:
- It can typically cause Inflated Performance Metrics through improper evaluation procedures and biased assessment.
- It can typically result from Data Leakage between training sets and test sets.
- It can typically arise from Improper Cross-Validation using an incorrect splitting strategy.
- It can typically occur through Selection Bias in dataset construction and sample selection.
- It can typically manifest as Overfitting to the Test Set via repeated evaluation and hyperparameter tuning (a sketch of this effect follows this list).
- ...
- It can often involve Temporal Leakage when future information influences past predictions.
- It can often include Label Leakage where target variables contaminate features.
- It can often feature Distribution Shift between training distribution and deployment distribution.
- It can often exhibit Cherry-Picking Results through selective reporting and p-hacking.
- ...
- It can range from being a Subtle ML Evaluation Error to being an Obvious ML Evaluation Error, depending on its detection difficulty.
- It can range from being a Data-Related ML Evaluation Error to being a Methodology-Related ML Evaluation Error, depending on its error source.
- It can range from being a Minor ML Evaluation Error to being a Critical ML Evaluation Error, depending on its impact severity.
- It can range from being a Common ML Evaluation Error to being a Rare ML Evaluation Error, depending on its occurrence frequency.
- It can range from being a Preventable ML Evaluation Error to being an Inherent ML Evaluation Error, depending on its avoidability.
- ...
- It can undermine Model Reliability in production environments.
- It can mislead Model Selection Decisions through false performance indicators.
- It can affect Research Reproducibility via unreliable results.
- It can impact Business Decisions based on flawed evaluation.
- It can necessitate Evaluation Protocol Revision for proper assessment.
- ...
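A minimal sketch of that test-set overfitting effect (Python with scikit-learn; the noise-only dataset and the hyperparameter grid are hypothetical, chosen only to make the effect visible). Since the labels carry no signal, every model truly generalizes at 50% accuracy, yet taking the maximum test score over many candidates reports a number above chance:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical noise-only data: labels are independent of the features,
# so the true generalization accuracy of every model is 0.5.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
y = rng.integers(0, 2, size=200)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scoring many candidates on the same test set and keeping the best one
# overfits the evaluation itself: the maximum of many noisy scores drifts
# above chance even though every candidate is guessing.
best_test_score = max(
    LogisticRegression(C=c, max_iter=1000)
    .fit(X_train, y_train)
    .score(X_test, y_test)
    for c in np.logspace(-3, 3, 50)
)
print(best_test_score)  # typically above 0.5 on pure-noise labels
```

The usual remedy is to reserve the test set for a single final measurement and perform all model selection on a separate validation split or via cross-validation.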
- Example(s):
- Data Leakage Errors, such as:
- Train-Test Leakage where training data appears in test set.
- Feature Leakage where future information contaminates input features.
- Target Leakage where label information influences prediction variables.
- Preprocessing Leakage where data normalization uses test set statistics (illustrated in the sketch after this list).
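A minimal sketch of the preprocessing variant (Python with scikit-learn; the synthetic data is hypothetical): fitting a scaler on the full dataset before splitting lets test-set statistics shape the features the model is trained on, while the correct order fits all preprocessing on the training split alone:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(loc=3.0, scale=2.0, size=(1000, 5))
y = (X[:, 0] > 3.0).astype(int)

# Leaky: the scaler's mean and std are computed with test rows included,
# so information about the test set reaches the training features.
X_leaky = StandardScaler().fit_transform(X)
X_tr_leaky, X_te_leaky, _, _ = train_test_split(X_leaky, y, random_state=0)

# Correct: split first, fit the scaler on the training split only,
# then apply the frozen transform to the test split.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_tr)
X_tr_scaled = scaler.transform(X_tr)
X_te_scaled = scaler.transform(X_te)
```

Wrapping the scaler and estimator in a scikit-learn Pipeline and cross-validating the whole pipeline enforces this ordering automatically.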
- Temporal Pitfalls, such as:
- Temporal Data Leakage using future data for past prediction.
- Time Series Split Error ignoring temporal ordering in cross-validation (contrasted in the sketch after this list).
- Lookahead Bias incorporating future knowledge in historical analysis.
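A minimal sketch contrasting a shuffled split with a temporal split (Python with scikit-learn; the autocorrelated series and lag features are hypothetical). Shuffled folds let the model train on observations that come after the ones it is tested on, the lookahead described above, whereas TimeSeriesSplit keeps every training fold strictly earlier than its test fold:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, TimeSeriesSplit, cross_val_score

# Hypothetical autocorrelated series with three lagged values as features.
rng = np.random.default_rng(0)
n = 500
y = np.sin(np.arange(n) / 20) + rng.normal(scale=0.1, size=n)
X = np.column_stack([np.roll(y, k) for k in range(1, 4)])[3:]  # y[t-1..t-3]
y = y[3:]

model = Ridge()
# Leaky: shuffled folds interleave past and future observations.
shuffled = cross_val_score(model, X, y, cv=KFold(5, shuffle=True, random_state=0))
# Honest: each fold trains only on data earlier than its test window.
ordered = cross_val_score(model, X, y, cv=TimeSeriesSplit(5))
print(shuffled.mean(), ordered.mean())  # the shuffled estimate is typically inflated
```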
- Sampling Pitfalls, such as:
- Sample Selection Bias with non-representative samples.
- Survivorship Bias excluding failed instances from evaluation.
- Class Imbalance Ignorance using the accuracy metric on imbalanced datasets (demonstrated in the sketch after this list).
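A minimal sketch of that accuracy trap (Python with scikit-learn; the 99:1 label ratio is hypothetical): a baseline that always predicts the majority class reports roughly 99% accuracy while finding no positives at all, which class-aware metrics expose immediately:

```python
import numpy as np
from sklearn.metrics import accuracy_score, balanced_accuracy_score, recall_score

# Hypothetical 99:1 imbalanced binary labels.
rng = np.random.default_rng(0)
y_true = (rng.random(10_000) < 0.01).astype(int)

# "Model" that always predicts the majority class.
y_pred = np.zeros_like(y_true)

print(accuracy_score(y_true, y_pred))           # ~0.99, looks excellent
print(recall_score(y_true, y_pred))             # 0.0, finds no positives
print(balanced_accuracy_score(y_true, y_pred))  # 0.5, chance level
```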
- Validation Pitfalls, such as:
- ...
- Counter-Example(s):
- Proper Evaluation Practices, which follow rigorous methodology and avoid evaluation errors.
- Statistical Assumption Violations, which are theoretical issues rather than evaluation mistakes.
- Model Architecture Flaws, which are design problems rather than evaluation errors.
- See: Machine Learning Task, Machine Learning Evaluation, Model Validation, Cross-Validation, Machine Learning Data Leakage, Overfitting, Selection Bias.