Wilson Score F1 Confidence Interval Method
A Wilson Score F1 Confidence Interval Method is a confidence interval construction method that produces asymmetric confidence bounds for F1 scores using plus-four adjustments and score-based inversion to respect the [0,1] constraint.
- AKA: Wilson-Style F1 Interval Method, Plus-Four F1 Confidence Interval, Score-Based F1 Interval Method, Asymmetric F1 Bounds Method.
- Context:
- It can typically apply a (TP+2)/(TP+FN+4) adjustment for recall estimates and the analogous (TP+2)/(TP+FP+4) adjustment for precision estimates.
- It can typically produce asymmetric intervals that remain within [0,1] bounds unlike Wald intervals.
- It can typically achieve better coverage probability than symmetric intervals in small samples.
- It can often shift the interval center away from the point estimate to improve coverage.
- It can often provide wider, more conservative intervals near boundaries (0 or 1).
- It can often outperform Delta-Method F1 Standard Error Estimation Method intervals for n < 30.
- It can range from being a Standard Wilson Score F1 Confidence Interval Method to being an Adjusted Wilson Score F1 Confidence Interval Method, depending on its continuity correction.
- It can range from being a Binary Wilson Score F1 Confidence Interval Method to being a Multi-Class Wilson Score F1 Confidence Interval Method, depending on its classification scope.
- It can range from being a Conservative Wilson Score F1 Confidence Interval Method to being an Exact Wilson Score F1 Confidence Interval Method, depending on its coverage guarantee.
- It can range from being an Analytical Wilson Score F1 Confidence Interval Method to being a Numerical Wilson Score F1 Confidence Interval Method, depending on its computation approach.
- ...
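The plus-four adjustment and the score-based inversion described in the context items above can be sketched for a single proportion such as recall or precision. This is a minimal sketch under stated assumptions, not a definitive implementation: the function names `plus_four_estimate` and `wilson_interval` are hypothetical, z = 1.96 is assumed for a 95% level, and the counts are the ones used in the example section of this page. How the per-proportion bounds are combined into an F1 bound is left open here.

```python
import math


def plus_four_estimate(successes, trials):
    """Plus-four point estimate: add 2 successes and 2 failures."""
    return (successes + 2) / (trials + 4)


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion.

    Obtained by inverting the score test, so the bounds stay in [0, 1]
    and the center is pulled from the raw estimate toward 0.5.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p + z2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials ** 2)) / denom
    # The clipping only guards against floating-point round-off.
    return max(0.0, center - half), min(1.0, center + half)


# Counts assumed from the example section: TP=10, FP=2, FN=3.
tp, fp, fn = 10, 2, 3

recall_raw = tp / (tp + fn)                        # 10/13
recall_adj = plus_four_estimate(tp, tp + fn)       # (10+2)/(13+4)
precision_adj = plus_four_estimate(tp, tp + fp)    # (10+2)/(12+4)

recall_ci = wilson_interval(tp, tp + fn)
precision_ci = wilson_interval(tp, tp + fp)

print(f"recall: raw={recall_raw:.3f} adjusted={recall_adj:.3f} CI={recall_ci}")
print(f"precision: adjusted={precision_adj:.3f} CI={precision_ci}")
```

At the 95% level z² is about 3.84, which is close to 4, so the Wilson center lands near the plus-four estimate. That is why the two ideas are usually presented together.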
- Example(s):
- Standard Plus-Four Adjustments, such as:
- Original: TP=10, FP=2, FN=3; Adjusted: TP'=12, FP'=4, FN'=5.
- Recall adjustment: (10+2)/(10+3+4) = 0.706 vs raw 0.769.
- Resulting F1 CI: [0.65, 0.85] asymmetric around 0.80.
- Small Sample Applications, such as:
- n=20 samples: Wilson CI [0.42, 0.78] vs Wald CI [0.35, 0.85].
- Better actual coverage at the nominal 95% level: Wilson 94.2% vs Wald 87.5%.
- Boundary Behaviors, such as:
- F1 near 1.0: Wilson [0.92, 0.98] vs invalid Wald [0.95, 1.03].
- F1 near 0.0: Wilson [0.01, 0.15] vs invalid Wald [-0.05, 0.10].
- ...
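The asymmetry and boundary behavior listed in the examples above can be illustrated with one possible construction. It is assumed here purely for illustration and is not necessarily the construction that produced the numbers on this page, so the printed bounds need not match them. The sketch treats F* = TP/(TP+FP+FN) as a binomial proportion, builds an interval for F*, and maps it to F1 through the monotone identity F1 = 2F*/(1+F*). The helper names, the binomial treatment of F*, the z = 1.96 level, and the boundary counts TP=19, FP=1, FN=0 are all assumptions.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion."""
    p = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p + z2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials ** 2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def wald_interval(successes, trials, z=1.96):
    """Symmetric Wald interval, deliberately left unclipped."""
    p = successes / trials
    half = z * math.sqrt(p * (1 - p) / trials)
    return p - half, p + half


def f1_from_fstar(p):
    """Map F* = TP/(TP+FP+FN) to F1 = 2TP/(2TP+FP+FN)."""
    return 2 * p / (1 + p)


def f1_interval(tp, fp, fn, interval_fn, z=1.96):
    """Interval for F1 via an interval for F* (illustrative assumption)."""
    n = tp + fp + fn
    lo, hi = interval_fn(tp, n, z)
    # The mapping is increasing, so the endpoints keep their order.
    return f1_from_fstar(lo), f1_from_fstar(hi)


# Counts from the example above: the point estimate is F1 = 20/25 = 0.80.
f1_point = 2 * 10 / (2 * 10 + 2 + 3)
wilson_f1 = f1_interval(10, 2, 3, wilson_interval)

# Hypothetical near-boundary counts: F1 = 38/39.
wald_near_one = f1_interval(19, 1, 0, wald_interval)
wilson_near_one = f1_interval(19, 1, 0, wilson_interval)

print("F1 point:", f1_point, "Wilson-style CI:", wilson_f1)
print("Near 1.0, Wald:", wald_near_one, "Wilson-style:", wilson_near_one)
```

In this sketch the Wilson-style interval extends further below the point estimate than above it, and near the boundary the unclipped Wald upper bound exceeds 1 while the Wilson-style upper bound does not.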
- Counter-Example(s):
- Wald F1 Confidence Interval Method, which produces symmetric intervals.
- Clopper-Pearson Exact Interval Method, which is even more conservative.
- Bootstrap Percentile Interval Method, which uses resampling.
- See: Confidence Interval Construction Method, Wilson Score Interval, F1 Score, Plus-Four Adjustment, Continuity Correction in Performance Measure Method, Delta-Method F1 Standard Error Estimation Method, Effective Sample Size Wilson F1 Method, Wilson with Continuity Correction F1 CI Method, Small Sample Inference, Asymmetric Confidence Interval, Coverage Probability, Score Test, Binomial Proportion Interval, Agresti-Coull F1 Confidence Interval Method.