Measure of Agreement


A Measure of Agreement is a performance measure for multi-agent prediction tasks that quantifies the degree of agreement between different predictors.



References

2022a

  • (Wikipedia, 2022) ⇒ https://en.wikipedia.org/wiki/Inter-rater_reliability Retrieved:2022-3-20.
    • In statistics, inter-rater reliability (also called by various similar names, such as inter-rater agreement, inter-rater concordance, inter-observer reliability, inter-coder reliability, and so on) is the degree of agreement among independent observers who rate, code, or assess the same phenomenon.

      Assessment tools that rely on ratings must exhibit good inter-rater reliability, otherwise they are not valid tests.

      There are a number of statistics that can be used to determine inter-rater reliability. Different statistics are appropriate for different types of measurement. Some options are joint-probability of agreement, such as Cohen's kappa, Scott's pi and Fleiss' kappa; or inter-rater correlation, concordance correlation coefficient, intra-class correlation, and Krippendorff's alpha.
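
      As an illustration of two of the statistics named above, the following minimal sketch computes the raw joint probability of agreement and Cohen's kappa for two raters who label the same items. The function names and the example ratings are hypothetical and are not part of the quoted source; the kappa formula used is the standard (p_o - p_e) / (1 - p_e), where p_o is observed agreement and p_e is the agreement expected by chance from each rater's label frequencies.

```python
from collections import Counter


def observed_agreement(ratings_a, ratings_b):
    """Fraction of items on which the two raters give the same label."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters (Cohen's kappa)."""
    n = len(ratings_a)
    p_o = observed_agreement(ratings_a, ratings_b)
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    # Chance agreement: probability both raters pick the same label
    # if each labels independently with their own marginal frequencies.
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical data: 50 items, 20 yes/yes, 5 yes/no, 10 no/yes, 15 no/no.
rater_a = ["yes"] * 20 + ["yes"] * 5 + ["no"] * 10 + ["no"] * 15
rater_b = ["yes"] * 20 + ["no"] * 5 + ["yes"] * 10 + ["no"] * 15

print(observed_agreement(rater_a, rater_b))
print(cohens_kappa(rater_a, rater_b))
```

      On this made-up data the raters agree on 35 of 50 items (0.7), while chance agreement is 0.5, so kappa comes out at roughly 0.4. The gap between the two numbers shows why a chance-corrected statistic is often preferred over raw agreement.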

2022b

2008

1993

  • (James et al., 1993) ⇒ Lawrence R. James, Robert G. Demaree, and Gerrit Wolf. (1993). “rwg: An assessment of within-group interrater agreement.” In: Journal of Applied Psychology, 78(2).
    • QUOTE: F. L. Schmidt and J. E. Hunter (1989) critiqued the within-group interrater reliability statistic (rwg) described by L. R. James et al (1984). S. W. Kozlowski and K. Hattrup (1992) responded to the Schmidt and Hunter critique and argued that rwg is a suitable index of interrater agreement. This article focuses on the interpretation of rwg as a measure of agreement among judges' ratings of a single target. A new derivation of rwg is given that underscores this interpretation.
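
      To make the rwg index concrete, here is a minimal sketch of the commonly cited single-item form, rwg = 1 - (s² / σ²_EU), where s² is the observed variance of the judges' ratings of one target and σ²_EU = (A² - 1) / 12 is the variance expected if judges responded uniformly at random on an A-point scale. The formula is not stated in the quoted abstract, so treat it as an assumption here; the function name, the use of the sample variance, and the example ratings are likewise hypothetical.

```python
from statistics import variance


def rwg_single_item(ratings, num_scale_points):
    """Single-item within-group agreement index under a uniform null.

    ratings: numeric ratings of one target by several judges.
    num_scale_points: number of response options A on the rating scale.
    """
    if len(ratings) < 2:
        raise ValueError("at least two judges are needed to estimate variance")
    if num_scale_points < 2:
        raise ValueError("the scale must have at least two response options")
    observed_var = variance(ratings)  # sample variance (n - 1 denominator)
    # Variance of a discrete uniform distribution over A scale points.
    expected_var = (num_scale_points ** 2 - 1) / 12.0
    return 1.0 - observed_var / expected_var


# Hypothetical example: five judges rate one target on a 5-point scale.
judge_ratings = [4, 4, 5, 4, 4]
print(rwg_single_item(judge_ratings, 5))
```

      For these made-up ratings the observed variance is 0.2 against a uniform-null variance of 2.0, giving a value of about 0.9. The result can fall below 0 when the observed variance exceeds the uniform-null variance; how to handle that case is left to the caller in this sketch.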