A widely accepted approach to evaluate interrater reliability for categorical responses involves the rating of n subjects by at least 2 raters.
Frequently, there are only 2 response categories, such as positive or negative diagnosis.
The same approach is commonly used to assess the concordant classification by 2 diagnostic methods.
Percent agreement taken as such and percent agreement corrected for the agreement expected by chance, i.e. Cohen's kappa coefficient, can yield quite different values.
This short communication demonstrates that Cohen's kappa coefficient of agreement between 2 raters or 2 diagnostic methods based on binary (yes/no) responses does not parallel the percentage of patients with congruent classifications.
Therefore, the kappa coefficient may be of limited value for assessing increases in interrater reliability due to an improved diagnostic method.
The percentage of patients with congruent classifications is easier to interpret clinically but does not account for the percentage of agreement expected by chance.
We therefore recommend presenting both the percentage of patients with congruent classifications and Cohen's kappa coefficient with 95% confidence limits.
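The point can be illustrated with a minimal Python sketch. The two 2x2 tables below are hypothetical counts chosen for illustration and are not data from the communication; the function name `agreement_stats` is likewise an assumption. The confidence limits use a simple large-sample approximation of the standard error, sqrt(p_o(1 - p_o) / n) / (1 - p_e), which may differ from the method used by the authors.

```python
import math

def agreement_stats(a, b, c, d, z=1.96):
    """Percent agreement and Cohen's kappa for two raters, binary responses.

    a = both raters positive, d = both raters negative,
    b = rater 1 positive / rater 2 negative, c = the reverse.
    Returns (p_o, kappa, (lower, upper)) with approximate 95% limits.
    """
    n = a + b + c + d
    p_o = (a + d) / n                      # observed proportion of agreement
    p1 = (a + b) / n                       # proportion rated positive by rater 1
    p2 = (a + c) / n                       # proportion rated positive by rater 2
    p_e = p1 * p2 + (1 - p1) * (1 - p2)    # agreement expected by chance
    if p_e == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (p_o - p_e) / (1 - p_e)
    # Assumption: simple large-sample standard error, not the exact variance.
    se = math.sqrt(p_o * (1 - p_o) / n) / (1 - p_e)
    return p_o, kappa, (kappa - z * se, kappa + z * se)

# Hypothetical table 1: positive diagnoses in about half of the patients.
balanced = agreement_stats(40, 5, 5, 50)
# Hypothetical table 2: positive diagnoses in about 90% of the patients.
skewed = agreement_stats(85, 5, 5, 5)

for label, (p_o, kappa, (lo, hi)) in (("balanced", balanced), ("skewed", skewed)):
    print(f"{label}: agreement = {p_o:.0%}, kappa = {kappa:.2f} "
          f"(95% CI {lo:.2f} to {hi:.2f})")
```

In both hypothetical tables 90% of the patients are classified congruently, yet kappa is about 0.80 in the first and about 0.44 in the second, because the agreement expected by chance rises as the positive responses become more frequent.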
Pascal keywords (French): Biométrie, Comparaison interindividuelle, Méthode étude, Etude critique, Indice kappa, Fiabilité, Classification, Donnée binaire, Diagnostic, Maladie, Interprétation information
Pascal keywords (English): Biometrics, Interindividual comparison, Investigation method, Critical study, Kappa number, Reliability, Classification, Binary data, Diagnosis, Disease, Information interpretation
Record produced by:
Inist-CNRS - Institut de l'Information Scientifique et Technique
Call number: 97-0234201
Inist code: 002B30A01A2. Created: 11/06/1997.