Macro-F1 P-Value Calculation Method
(Redirected from Multi-Class F1 Significance Test)
A Macro-F1 P-Value Calculation Method is a p-value calculation method that tests a macro-F1 average against a null value by aggregating per-group F1 standard errors under an independent groups assumption.
- AKA: Macro-Averaged F1 Significance Test, Group-Averaged F1 P-Value Method, Macro F1 Hypothesis Test, Multi-Class F1 Significance Test.
- Context:
- It can typically aggregate group-level variances using Independent Groups Assumption in Variance Estimation Method.
- It can typically compute overall standard error from per-group F1 standard errors.
- It can typically apply Delta-Method F1 Standard Error Estimation Method to each classification group.
- It can often test against multi-class null hypothesis values.
- It can often support Macro-F1 Difference P-Value Methods for model comparisons.
- It can often handle unbalanced group sizes through variance weighting.
- It can often be extended using Multivariate Delta Method for Macro-F1 Variance Method for correlated classes.
- It can range from being a Simple Macro-F1 P-Value Calculation Method to being a Weighted Macro-F1 P-Value Calculation Method, depending on its group weighting scheme.
- It can range from being a Homoscedastic Macro-F1 P-Value Calculation Method to being a Heteroscedastic Macro-F1 P-Value Calculation Method, depending on its variance assumption.
- It can range from being an Independent Macro-F1 P-Value Calculation Method to being a Dependent Macro-F1 P-Value Calculation Method, depending on its group correlation handling.
- It can range from being a Fixed-Groups Macro-F1 P-Value Calculation Method to being a Random-Groups Macro-F1 P-Value Calculation Method, depending on its group effect model.
- ...
- Example(s):
- Three-Class Macro-F1 Tests, such as:
- Macro-F1=0.781 vs null=0.6.
- Combined SE from three group SEs.
- Multi-Label Macro-F1 Tests, such as:
- Testing a 20-class document classifier.
- Aggregating variances across all labels.
- Cross-Validation Macro-F1 Tests, such as:
- Testing average macro-F1 across folds.
- Accounting for fold correlation.
- ...
- Counter-Example(s):
- Micro-F1 P-Value Method, which pools counts before testing.
- Per-Class F1 P-Value Method, which tests each class separately.
- Bootstrap Macro-F1 P-Value Method, which uses resampling.
- See: Macro-F1 Measure from Group Counts Method, P-Value Calculation Method, Independent Groups Assumption in Variance Estimation Method, Delta-Method F1 Standard Error Estimation Method, Variance Aggregation Method, Group-Level Variance, Statistical Hypothesis Testing, Macro-F1 Difference P-Value Method, Multi-Class Classification Evaluation, Macro-Averaged Performance Measure.
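The aggregation described in the Context items can be sketched as follows. This is a minimal illustration, not a reference implementation. It assumes that the per-group F1 scores and their standard errors are already available (for instance from a delta-method estimate), that the groups are independent so that variances add, and that the test statistic is approximately standard normal. The function name `macro_f1_p_value`, its parameters, and the numeric inputs are hypothetical and chosen only for illustration.

```python
import math


def macro_f1_p_value(f1_scores, standard_errors, null_value, weights=None):
    """Two-sided z-test of a (weighted) macro-F1 against a null value.

    Assumes independent groups, so the variance of the weighted mean is
    the sum of squared (weight * per-group standard error) terms.
    All names here are illustrative assumptions, not an established API.
    """
    k = len(f1_scores)
    if k == 0 or k != len(standard_errors):
        raise ValueError("need one standard error per group F1 score")
    if weights is None:
        # Simple (unweighted) macro average: each group gets weight 1/K.
        weights = [1.0 / k] * k
    if len(weights) != k:
        raise ValueError("need one weight per group")

    # Point estimate: weighted mean of the per-group F1 scores.
    macro_f1 = sum(w * f for w, f in zip(weights, f1_scores))

    # Independent groups assumption: Var(sum w_i F1_i) = sum w_i^2 SE_i^2.
    variance = sum((w * se) ** 2 for w, se in zip(weights, standard_errors))
    overall_se = math.sqrt(variance)

    # Normal-approximation test statistic and two-sided p-value.
    z = (macro_f1 - null_value) / overall_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return macro_f1, overall_se, z, p_value


if __name__ == "__main__":
    # Hypothetical three-class inputs, made up purely for illustration.
    f1 = [0.8, 0.7, 0.9]
    se = [0.03, 0.04, 0.05]
    estimate, overall_se, z, p = macro_f1_p_value(f1, se, null_value=0.6)
    print(f"macro-F1={estimate:.3f} SE={overall_se:.4f} z={z:.2f} p={p:.3g}")
```

With equal weights the overall standard error reduces to the square root of the summed squared group standard errors, divided by the number of groups. If the classes are correlated, the cross-covariance terms dropped here would need to be added back, which is the case the multivariate delta method extension addresses.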