The use of (weighted) κ is currently the standard method of reporting, and as such there should not be any controversy.
The weighted κ is preferred when the data are ordinally categorized, which is not the case in our study; this allows use of the Fleiss κ, which is suitable for our data.
Measures of interrater agreement.
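For readers less familiar with the statistic, the Fleiss κ for several raters and nominal (nonordinal) categories can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code of the study: the function name `fleiss_kappa`, the category labels, and the example ratings are all hypothetical.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for nominal ratings.

    ratings: list of cases, each a list of category labels (one per rater).
             Every case must be rated by the same number of raters.
    categories: list of all possible category labels.
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])

    # counts[i][cat] = number of raters who assigned case i to cat
    counts = [Counter(case) for case in ratings]

    # Per-case agreement: proportion of rater pairs that agree on the case
    p_i = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_i) / n_cases

    # Chance agreement, from the overall proportion of each category
    p_j = [
        sum(c[cat] for c in counts) / (n_cases * n_raters)
        for cat in categories
    ]
    p_e = sum(p * p for p in p_j)

    # Undefined (division by zero) if every rating falls in one category
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical example: 3 cases, 4 raters, 4 nominal categories
categories = ["SCLC", "LCNEC", "carcinoid", "other"]
example = [
    ["SCLC", "SCLC", "SCLC", "LCNEC"],
    ["carcinoid", "carcinoid", "carcinoid", "carcinoid"],
    ["other", "LCNEC", "LCNEC", "SCLC"],
]
print(fleiss_kappa(example, categories))
```

Because the categories are treated as unordered, every disagreement counts equally, which is the situation in which an unweighted κ is appropriate.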
The limitations of κ statistics in relation to prevalence and number of categories are well known.
de Vet H.C.W., Mokkink L.B., Terwee C.B., Hoekstra O.S., Knol D.L. Clinicians are right not to like Cohen’s κ.
However, we take issue with his statement “They concluded that the diagnosis using hematoxylin and eosin staining alone showed moderate agreement among pathologists in tumors with neuroendocrine morphologic features, but agreement improved to good in most cases with the judicious use of [immunohistochemistry], especially in the diagnosis of SCLC. Such conclusion may be a misleading message on account of inappropriate use of a statistical test. Briefly, for reliability analysis, appropriate tests should be applied.”
First of all, in our article,
Thunnissen E., Borczuk A.C., Flieder D.B., et al. The use of immunohistochemistry improves the diagnosis of small cell lung cancer and its differential diagnosis. An international reproducibility study in a demanding set of cases.
the κ scores cited in Table 2 for the categories (combined) SCLC, large cell neuroendocrine carcinoma, atypical carcinoids, typical carcinoids, carcinoids (typical and atypical), poorly differentiated NSCLC, small round cell sarcoma, non-Hodgkin’s lymphoma, and other, which ranged from 0.05 to 0.81, were calculated over two classes (specific diagnosis versus other). Thus, one of the two aforementioned limitations of the κ scores, namely, lower κ values when more categories are used, is not applicable. Also in that table, κ scores over four (nonordinal) categories are calculated, leading to the same outcome. In addition, the article states that each of the cited κ values represents the mean value of 171 comparisons (19 observers, giving (19 × 18)/2 = 171 pairwise combinations) for each diagnostic category, which is a high enough number to exclude large variations due to possible differences in prevalence. Therefore, the κ outcome measures in our study are based on sound application of κ statistics and were thus not used inappropriately. In addition, the obtained κ values are, where available, in line with the literature.
Ha S.Y., Han J., Kim W.-S., Suh B.S., Roh M.S. Interobserver variability in diagnosing high-grade neuroendocrine carcinoma of the lung and comparing it with the morphometric analysis.
den Bakker M.A., Willemsen S., Grünberg K., et al. Small cell carcinoma of the lung and large cell neuroendocrine carcinoma interobserver variability.
Travis W.D., Gal A.A., Colby T.V., Klimstra D.S., Falk R., Koss M.N. Reproducibility of neuroendocrine lung tumor classification.
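The two-class, pairwise calculation described above (specific diagnosis versus other, averaged over all observer pairs) can be illustrated with a short sketch. It assumes that each pairwise value is a Cohen's κ on the dichotomized diagnoses; the function names (`cohen_kappa`, `dichotomize`, `mean_pairwise_kappa`) and the sample diagnoses are hypothetical and do not reproduce the study data or its analysis code.

```python
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two raters' labels over the same cases."""
    n = len(a)
    labels = set(a) | set(b)
    # Observed agreement
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from each rater's marginal proportions
    p_e = sum((a.count(lab) / n) * (b.count(lab) / n) for lab in labels)
    # Undefined (division by zero) if both raters use one identical label throughout
    return (p_o - p_e) / (1 - p_e)


def dichotomize(diagnoses, target):
    """Collapse diagnoses to two classes: target diagnosis versus other."""
    return [d == target for d in diagnoses]


def mean_pairwise_kappa(by_observer, target):
    """Mean Cohen's kappa over all observer pairs for one diagnostic category.

    Returns the mean kappa and the number of pairs it was averaged over.
    """
    binary = [dichotomize(d, target) for d in by_observer]
    pairs = list(combinations(binary, 2))
    kappas = [cohen_kappa(a, b) for a, b in pairs]
    return sum(kappas) / len(kappas), len(pairs)


# 19 observers give 19 * 18 / 2 = 171 unordered pairs.
n_pairs_19 = len(list(combinations(range(19), 2)))

# Hypothetical diagnoses from 3 observers on 4 cases
observers = [
    ["SCLC", "SCLC", "LCNEC", "carcinoid"],
    ["SCLC", "SCLC", "carcinoid", "LCNEC"],
    ["SCLC", "LCNEC", "LCNEC", "carcinoid"],
]
mean_kappa, n_pairs = mean_pairwise_kappa(observers, "SCLC")
print(n_pairs_19, n_pairs, mean_kappa)
```

Dichotomizing per category keeps every κ on a two-class scale, so the number of categories does not depress the individual values.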
As for the content, the primary objective of our study was to test the hypothesis that the use of immunohistochemistry (IHC) leads to greater diagnostic reproducibility when distinguishing SCLC from its differential diagnoses. This led to moderate κ scores in SCLC, and therefore, the participating pathologists advocated the use of additional stains, resulting in increased κ values for several categories (again based on 171 comparisons). As a take-home message we stated, “In conclusion, a [hematoxylin and eosin] diagnosis of SCLC or other pulmonary neuroendocrine tumor is relatively straightforward, but IHC improves diagnostic reproducibility. IHC can aid the pathologist in cases where histologic features are considered equivocal, or in cases where the pathologist is looking for additional support.” We believe that our conclusion is based on appropriate use of κ statistics that supports the use of IHC, especially in the diagnosis of SCLC, leading to better management of patients in routine clinical care.
Disclosure: The authors declare no conflict of interest.
© 2017 International Association for the Study of Lung Cancer. Published by Elsevier Inc.