Liberal-Conservative Hierarchies of Intercoder Reliability Indices for Criminological Research

Shuhuan Zeng; Dianshi Moses Li; Guangchao Charles Feng; Song Harris Ao; Ming Milano Li; Hui Huang; Ke Deng; Zhujin Zhang; Xinshu Zhao

doi:10.69689/gzd30c10

Authors

Ke Deng, Department of Statistics & Data Science, Tsinghua University, Beijing, China

Zhujin Zhang, Faculty of Law, University of Macau, Macau SAR, China

Xinshu Zhao, Professor Emeritus, UNC Hussman School of Journalism and Media, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA


Articles | Published Date: 2026-05-01 | Vol. 3 No. 1 (2026)

Keywords:

intercoder reliability, interrater reliability, qualitative criminology, criminal justice methods, Cohen’s κ, Krippendorff’s α

Abstract

Criminological research depends extensively on human coding. Interviews, narratives, case files, visual records, and open-source materials are routinely converted into analyzable data through coding decisions made by researchers and research assistants. Yet criminological studies report intercoder and interrater reliability inconsistently. Even when reliability is reported, it is often assessed using heterogeneous coefficients that are treated as directly comparable, despite resting on different assumptions and lacking interchangeability on a common scale. The present study does not offer a procedural guide or prescribe a universal coefficient for criminological research. Instead, it identifies the liberal–conservative ordering of reliability estimators and examines how that ordering can inform coefficient selection and interpretation in human-coded research. We first extend previous mathematics-based hierarchies to include 23 indices. We then use Monte Carlo simulations to construct six additional hierarchies under varying conditions of category number, sample size, and distributional skew. Across eight hierarchies, a consistent pattern emerges, together with a previously undetected paradox in Perreault and Leigh’s Ir. The resulting hierarchies provide criminological researchers with a principled basis for choosing among non-equivalent reliability estimators and for interpreting reported coefficients more cautiously.


Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.