Readings in ROC Analysis, with Emphasis on Medical Applications

Some papers appear more than once because they belong to multiple classifications.

Background

  • Egan JP. Signal detection theory and ROC analysis. New York: Academic Press, 1975.
  • Fryback DG, Thornbury JR. The efficacy of diagnostic imaging. Med Decis Making 1991; 11: 88.
  • Green DM, Swets JA. Signal detection theory and psychophysics. New York, NY: Wiley, 1966.
  • Griner PF, Mayewski RJ, Mushlin AI, Greenland P. Selection and interpretation of diagnostic tests and procedures: principles and applications. Annals Int Med 1981; 94: 553.
  • International Commission on Radiation Units and Measurements. Medical imaging: the assessment of image quality (ICRU Report 54). Bethesda, MD: ICRU, 1996.
  • Lusted LB. Signal detectability and medical decision-making. Science 1971; 171: 1217.
  • McNeil BJ, Adelstein SJ. Determining the value of diagnostic and screening tests. J Nucl Med 1976; 17: 439.
  • McNeil BJ, Keeler E, Adelstein SJ. Primer on certain elements of medical decision making. New Engl J Med 1975; 293: 211.
  • Metz CE, Wagner RF, Doi K, Brown DG, Nishikawa RM, Myers KJ. Toward consensus on quantitative assessment of medical imaging systems. Med Phys 1995; 22: 1057-1061.
  • National Council on Radiation Protection and Measurements. An introduction to efficacy in diagnostic radiology and nuclear medicine (NCRP Commentary 13). Bethesda, MD: NCRP, 1995.
  • Robertson EA, Zweig MH, Van Steirteghem AC. Evaluating the clinical efficacy of laboratory tests. Am J Clin Path 1983; 79: 78.
  • Swets JA, Pickett RM, Whitehead SF, et al. Assessment of diagnostic technologies. Science 1979; 205:753–759.
  • Swets JA, Pickett RM. Evaluation of diagnostic systems: methods from signal detection theory. New York, NY: Academic Press, 1982.
  • Wagner RF, Metz CE, Campbell G. Assessment of medical imaging systems and computer aids: a tutorial review. Acad Radiol 2007; 14: 723–748.
  • Zweig MH, Campbell G. Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine. Clinical Chemistry 1993; 39: 561. [Erratum published in Clinical Chemistry 1993; 39: 1589.]

General
Books

  • Pepe MS. The statistical evaluation of medical tests for classification and prediction. Oxford; New York: Oxford University Press, 2004.
  • Zhou X-H, Obuchowski NA, McClish DK. Statistical methods in diagnostic medicine. New York, NY: Wiley-Interscience, 2002.

Articles

  • Hanley JA. Receiver operating characteristic (ROC) methodology: the state of the art. Critical Reviews in Diagnostic Imaging 1989; 29: 307.
  • International Commission on Radiation Units and Measurements. Receiver Operating Characteristic Analysis in Medical Imaging (ICRU Report 79). J ICRU 2008; 8:1–62.
  • King JL, Britton CA, Gur D, Rockette HE, Davis PL. On the validity of the continuous and discrete confidence rating scales in receiver operating characteristic studies. Invest Radiol 1993; 28: 962.
  • Metz CE. Basic principles of ROC analysis. Seminars in Nucl Med 1978; 8: 283.
  • Metz CE. ROC methodology in radiologic imaging. Invest Radiol 1986; 21: 720.
  • Metz CE. Some practical issues of experimental design and data analysis in radiological ROC studies. Invest Radiol 1989; 24: 234.
  • Metz CE. Evaluation of CAD methods. In Computer-Aided Diagnosis in Medical Imaging (K Doi, H MacMahon, ML Giger and KR Hoffmann, eds.). Amsterdam: Elsevier Science (Excerpta Medica International Congress Series, Vol. 1182), pp. 543-554, 1999.
  • Metz CE. Fundamental ROC analysis. In: Handbook of Medical Imaging, Vol. 1: Physics and Psychophysics (J Beutel, H Kundel and R Van Metter, eds.). Bellingham, WA: SPIE Press, 2000, pp. 751-769.
  • Metz CE. Receiver operating characteristic (ROC) analysis: a tool for quantitative evaluation of observer performance and imaging systems. JACR 2006; 3: 413-422.
  • Metz CE, Shen J-H. Gains in accuracy from replicated readings of diagnostic images: prediction and assessment in terms of ROC analysis. Med Decis Making 1992; 12: 60.
  • Rockette HE, Gur D, Metz CE. The use of continuous and discrete confidence judgments in receiver operating characteristic studies of diagnostic imaging techniques. Invest Radiol 1992; 27: 169.
  • Swets JA. ROC analysis applied to the evaluation of medical imaging techniques. Invest Radiol 1979; 14: 109.
  • Swets JA. Indices of discrimination or diagnostic accuracy: their ROCs and implied models. Psychol Bull 1986; 99: 100.
  • Swets JA. Measuring the accuracy of diagnostic systems. Science 1988; 240: 1285.
  • Swets JA. Signal detection theory and ROC analysis in psychology and diagnostics: collected papers. Mahwah, NJ: Lawrence Erlbaum Associates, 1996.
  • Swets JA, Pickett RM. Evaluation of diagnostic systems: methods from signal detection theory. New York: Academic Press, 1982.
  • Wagner RF, Beiden SV, Metz CE. Continuous vs. categorical data for ROC analysis: Some quantitative considerations. Academic Radiol 2001; 8: 328.

Bias

  • Begg CB, Greenes RA. Assessment of diagnostic tests when disease verification is subject to selection bias. Biometrics 1983; 39: 207.
  • Begg CB, McNeil BJ. Assessment of radiologic tests: control of bias and other design considerations. Radiology 1988; 167: 565.
  • Gray R, Begg CB, Greenes RA. Construction of receiver operating characteristic curves when disease verification is subject to selection bias. Med Decis Making 1984; 4: 151.
  • Ransohoff DF, Feinstein AR. Problems of spectrum and bias in evaluating the efficacy of diagnostic tests. New Engl J Med 1978; 299: 926.

Curve Fitting

  • Dorfman DD, Alf E. Maximum likelihood estimation of parameters of signal detection theory and determination of confidence intervals — rating method data. J Math Psych 1969; 6: 487.
  • Dorfman DD, Berbaum KS, Metz CE, Lenth RV, Hanley JA, Dagga HA. Proper ROC analysis: the bigamma model. Academic Radiol 1997; 4: 138.
  • Dorfman DD, Berbaum KS. A contaminated binormal model for ROC data: Part II. A formal model. Acad Radiol 2000; 7:427-437.
  • Grey DR, Morgan BJT. Some aspects of ROC curve-fitting: normal and logistic models. J Math Psych 1972; 9: 128.
  • Hanley JA. The robustness of the “binormal” assumptions used in fitting ROC curves. Med Decis Making 1988; 8: 197.
  • Lloyd CJ. Estimation of a convex ROC curve. Stat Prob Lett 2002; 59: 99–111.
  • Metz CE, Herman BA, Shen J-H. Maximum-likelihood estimation of ROC curves from continuously-distributed data. Stat Med 1998; 17: 1033.
  • Metz CE, Pan X. “Proper” binormal ROC curves: theory and maximum-likelihood estimation. J Math Psych 1999; 43: 1.
  • Ogilvie J, Creelman CD. Maximum likelihood estimation of receiver operating characteristic curve parameters. J Math Psych 1968; 5: 377-391.
  • Pan X, Metz CE. The “proper” binormal model: parametric ROC curve estimation with degenerate data. Academic Radiol 1997; 4: 380.
  • Pesce LL, Metz CE. Reliable and computationally efficient maximum-likelihood estimation of “proper” binormal ROC curves. Acad Radiol 2007; 14: 814–829.
  • Swensson RG. Unified measurement of observer performance in detecting and localizing target objects on images. Med Phys 1996; 23: 1709.
  • Swets JA. Form of empirical ROCs in discrimination and diagnostic tasks: implications for theory and measurement of performance. Psychol Bull 1986; 99: 181.

Statistics

Multi-Case statistical analysis: only case variation considered

  • Agresti A. A survey of models for repeated ordered categorical response data. Statistics in Medicine 1989; 8: 1209.
  • Bamber D. The area above the ordinal dominance graph and the area below the receiver operating graph. J Math Psych 1975; 12: 387.
  • Bandos AI, Rockette HE, Gur D. A permutation test sensitive to differences in areas for comparing ROC curves from a paired design. Statistics in Medicine 2005; 24: 2873-2893.
  • DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988; 44: 837.
  • Hajian-Tilaki KO, Hanley JA. Comparison of three methods for estimating the standard error of the area under the curve in ROC analysis of quantitative data. Academic Radiol 2002; 9: 1278-1285.
  • Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982; 143: 29.
  • Hanley JA, McNeil BJ. A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology 1983; 148: 839.
  • Jiang Y, Metz CE, Nishikawa RM. A receiver operating characteristic partial area index for highly sensitive diagnostic tests. Radiology 1996; 201: 745.
  • Ma G, Hall WJ. Confidence bands for receiver operating characteristic curves. Med Decis Making 1993; 13: 191.
  • McClish DK. Analyzing a portion of the ROC curve. Med Decis Making 1989; 9: 190.
  • McClish DK. Determining a range of false-positive rates for which ROC curves differ. Med Decis Making 1990; 10: 283.
  • McNeil BJ, Hanley JA. Statistical approaches to the analysis of receiver operating characteristic (ROC) curves. Med Decis Making 1984; 4: 137.
  • Metz CE. Statistical analysis of ROC data in evaluating diagnostic performance. In: Multiple regression analysis: applications in the health sciences (D Herbert and R Myers, eds.). New York: American Institute of Physics, 1986, pp. 365.
  • Metz CE. Quantification of failure to demonstrate statistical significance: the usefulness of confidence intervals. Invest Radiol 1993; 28: 59.
  • Metz CE, Herman BA, Roe CA. Statistical comparison of two ROC curve estimates obtained from partially-paired datasets. Med Decis Making 1998; 18: 110.
  • Metz CE, Kronman HB. Statistical significance tests for binormal ROC curves. J Math Psych 1980; 22: 218.
  • Metz CE, Wang P-L, Kronman HB. A new approach for testing the significance of differences between ROC curves measured from correlated data. In: Information processing in medical imaging (F Deconinck, ed.). The Hague: Nijhoff, 1984, p. 432.
  • Thompson ML, Zucchini W. On the statistical analysis of ROC curves. Statistics in Medicine 1989; 8: 1277.
  • Wieand S, Gail MH, James BR, James KL. A family of nonparametric statistics for comparing diagnostic markers with paired or unpaired data. Biometrika 1989; 76: 585.
  • Zhou XH, Gatsonis CA. A simple method for comparing correlated ROC curves using incomplete data. Statistics in Medicine 1996; 15: 1687-1693.

Multi-Reader Multi-Case statistical analysis

  • Bandos AI, Rockette HE, Gur D. A permutation test for comparing ROC curves in multireader studies. Academic Radiol 2006; 13: 414-420.
  • Beiden SV, Wagner RF, Campbell G. Components-of-variance models and multiple-bootstrap experiments: an alternative method for random-effects, receiver operating characteristic analysis. Academic Radiol 2000; 7: 341.
  • Beiden SV, Wagner RF, Campbell G, Metz CE, Jiang Y. Components-of-variance models for random-effects ROC analysis: The case of unequal variance structures across modalities. Academic Radiol. 2001; 8: 605.
  • Beiden SV, Wagner RF, Campbell G, Chan H-P. Analysis of uncertainties in estimates of components of variance in multivariate ROC analysis. Academic Radiol. 2001; 8: 616.
  • Dorfman DD, Berbaum KS, Metz CE. ROC rating analysis: generalization to the population of readers and cases with the jackknife method. Invest Radiol 1992; 27: 723.
  • Dorfman DD, Berbaum KS, Lenth RV, Chen Y-F, Donaghy BA. Monte Carlo validation of a multireader method for receiver operating characteristic discrete rating data: factorial experimental design. Academic Radiol 1998; 5: 591.
  • Dorfman DD, Metz CE. Multi-reader multi-case ROC analysis: comments on Begg’s commentary. Academic Radiol 1995; 2 (Supplement 1): S76.
  • Gallas BD. One-shot estimate of MRMC variance: AUC. Academic Radiol 2006; 13: 353-362.
  • Hillis SL, Obuchowski NA, Schartz KM, Berbaum KS. A comparison of the Dorfman-Berbaum-Metz and Obuchowski-Rockette methods for receiver operating characteristic (ROC) data. Stat Med 2005; 24:1579-1607.
  • Hillis SL, Berbaum KS. Monte Carlo validation of the Dorfman-Berbaum-Metz method using normalized pseudovalues and less data-based model simplification. Academic Radiology 2005; 12:1534-1541.
  • Hillis SL, Berbaum KS. Power estimation for the Dorfman-Berbaum-Metz method. Academic Radiol 2004; 11: 1260-1273.
  • Obuchowski NA. Multireader, multimodality receiver operating characteristic curve studies: hypothesis testing and sample size estimation using an analysis of variance approach with dependent observations. Academic Radiol 1995; 2 [Supplement 1]: S22.
  • Obuchowski NA. Sample size calculations in studies of test accuracy. Stat Methods Med Res 1998; 7: 371.
  • Obuchowski NA, Beiden SV, Berbaum KS, et al. Multireader, multicase receiver operating characteristic analysis: An empirical comparison of five methods. Academic Radiol 2004; 11: 980-995.
  • Rockette HE, Obuchowski N, Metz CE, Gur D. Statistical issues in ROC curve analysis. Proc SPIE 1990; 1234: 111.
  • Roe CA, Metz CE. The Dorfman-Berbaum-Metz method for statistical analysis of multi-reader, multi-modality ROC data: validation by computer simulation. Academic Radiol 1997; 4: 298.
  • Roe CA, Metz CE. Variance-component modeling in the analysis of receiver operating characteristic index estimates. Academic Radiol 1997; 4: 587.

Regression analysis of ROC curves

  • Pepe MS. The statistical evaluation of medical tests for classification and prediction. Oxford; New York: Oxford University Press, 2004.
  • Pepe MS, Cai TX. The analysis of placement values for evaluating discriminatory measures. Biometrics 2004; 60: 528-535.
  • Toledano A, Gatsonis CA. Regression analysis of correlated receiver operating characteristic data. Academic Radiol 1995; 2 [Supplement 1]: S30.
  • Toledano AY, Gatsonis C. Ordinal regression methodology for ROC curves derived from correlated data. Statistics in Medicine 1996; 15: 1807.
  • Toledano AY, Gatsonis C. GEEs for ordinal categorical data: arbitrary patterns of missing responses and missingness in a key covariate. Biometrics 1999; 22, 488.
  • Tosteson A, Begg C. A general regression methodology for ROC curve estimation. Med Decis Making 1988; 8: 204.

Relationships with Cost/Benefit Analysis

  • Halpern EJ, Alpert M, Krieger AM, Metz CE, Maidment AD. Comparisons of ROC curves on the basis of optimal operating points. Academic Radiology 1996; 3: 245-253.
  • Metz CE. Basic principles of ROC analysis. Seminars in Nucl Med 1978; 8: 283-298.
  • Metz CE, Starr SJ, Lusted LB, Rossmann K. Progress in evaluation of human observer visual detection performance using the ROC curve approach. In: Information Processing in Scintigraphy (C Raynaud and AE Todd-Pokropek, eds.). Orsay, France: Commissariat à l’Energie Atomique, Département de Biologie, Service Hospitalier Frédéric Joliot, 1975, p. 420.
  • Phelps CE, Mushlin AI. Focusing technology assessment. Med Decis Making 1988; 8: 279.
  • Sainfort F. Evaluation of medical technologies: a generalized ROC analysis. Med Decis Making 1991; 11: 208.
  • Wagner RF, Beam CA, Beiden SV. Reader variability in mammography and its implications for expected utility over the population of readers and cases. Med Decis Making 2004; 24: 561-572.

Generalizations

  • Anastasio MA, Kupinski MA, Nishikawa RM. Optimization and FROC analysis of rule-based detection schemes using a multiobjective approach. IEEE Trans Med Imaging 1998; 17: 1089.
  • Bunch PC, Hamilton JF, Sanderson GK, Simmons AH. A free response approach to the measurement and characterization of radiographic observer performance. Proc SPIE 1977; 127: 124.
  • Chakraborty DP. Maximum likelihood analysis of free-response receiver operating characteristic (FROC) data. Med Phys 1989; 16: 561.
  • Chakraborty DP, Winter LHL. Free-response methodology: alternate analysis and a new observer-performance experiment. Radiology 1990; 174: 873.
  • Chakraborty DP, Berbaum KS. Observer studies involving detection and localization: Modeling, analysis and validation. Medical Physics 2004; 31:2313-2330.
  • Chakraborty DP. A search model and figure of merit for observer data acquired according to the free-response paradigm. Phys. Med. Biol. 2006; 51:3449-3462.
  • Chakraborty DP. ROC Curves predicted by a model of visual search. Phys. Med. Biol. 2006; 51:3463-3482.
  • Edwards DC, Metz CE. Evaluating Bayesian ANN estimates of ideal observer decision variables by comparison with identity functions. Proc. SPIE 5749: 174-182, 2005.
  • Edwards DC, Metz CE. Optimization of an ROC hypersurface constructed only from an observer’s within-class sensitivities. Proc. SPIE 6146: 61460A1-61460A7, 2006.
  • Edwards DC, Metz CE. Analysis of proposed three-class classification decision rules in terms of the ideal observer decision rule. J. Math. Psych. (in press), 2006.
  • Egan JP, Greenberg GZ, Schulman AI. Operating characteristics, signal detection, and the method of free response. J Acoust Soc Am 1961; 33: 993.
  • Hajian-Tilaki KO, Hanley JA, Joseph L, et al. Extension of receiver operating characteristic analysis to data concerning multiple signal detection tasks. Academic Radiol 1997; 4: 222-229.
  • Metz CE, Starr SJ, Lusted LB. Observer performance in detecting multiple radiographic signals: prediction and analysis using a generalized ROC approach. Radiology 1976; 121: 337.
  • Obuchowski NA, Lieber ML, Powell KA. Data analysis for detection and localization of multiple abnormalities with application to mammography. Academic Radiol 2000; 7: 516-525.
  • Starr SJ, Metz CE, Lusted LB, Goodenough DJ. Visual detection and localization of radiographic images. Radiology 1975; 116: 533.
  • Swensson RG. Unified measurement of observer performance in detecting and localizing target objects on images. Med Phys 1996; 23: 1709.

Papers related specifically to our Current Software
ROCKIT

  • Dorfman DD, Alf E. Maximum likelihood estimation of parameters of signal detection theory and determination of confidence intervals — rating method data. J Math Psych 1969; 6: 487.
  • Metz CE, Herman BA, Shen J-H. Maximum-likelihood estimation of ROC curves from continuously-distributed data. Stat Med 1998; 17: 1033.
  • Metz CE, Herman BA, Roe CA. Statistical comparison of two ROC curve estimates obtained from partially-paired datasets. Med Decis Making 1998; 18: 110.
  • Metz CE. Statistical analysis of ROC data in evaluating diagnostic performance. In: Multiple regression analysis: applications in the health sciences (D Herbert and R Myers, eds.). New York: American Institute of Physics, 1986, pp. 365.
  • Metz CE. Quantification of failure to demonstrate statistical significance: the usefulness of confidence intervals. Invest Radiol 1993; 28: 59.

LABMRMC & MRMC

  • Dorfman DD, Berbaum KS, Metz CE. ROC rating analysis: generalization to the population of readers and cases with the jackknife method. Invest Radiol 1992; 27: 723.
  • Dorfman DD, Metz CE. Multi-reader multi-case ROC analysis: comments on Begg’s commentary. Academic Radiol 1995; 2 (Supplement 1): S76.
  • Roe CA, Metz CE. The Dorfman-Berbaum-Metz method for statistical analysis of multi-reader, multi-modality ROC data: validation by computer simulation. Academic Radiol 1997; 4: 298.
  • Roe CA, Metz CE. Variance-component modeling in the analysis of receiver operating characteristic index estimates. Academic Radiol 1997; 4: 587.
  • Hillis SL, Obuchowski NA, Schartz KM, Berbaum KS. A comparison of the Dorfman-Berbaum-Metz and Obuchowski-Rockette methods for receiver operating characteristic (ROC) data. Stat Med 2005; 24:1579-1607.
  • Hillis SL, Berbaum KS. Monte Carlo validation of the Dorfman-Berbaum-Metz method using normalized pseudovalues and less data-based model simplification. Academic Radiology 2005; 12:1534-1541.

LABROC4

  • Metz CE, Herman BA, Shen J-H. Maximum-likelihood estimation of ROC curves from continuously-distributed data. Stat Med 1998; 17: 1033.

ROCPWR

  • Metz CE, Wang P-L, Kronman HB. A new approach for testing the significance of differences between ROC curves measured from correlated data. In: Information processing in medical imaging (F Deconinck, ed.). The Hague: Nijhoff, 1984, p. 432.

PROPROC

  • Pan X, Metz CE. The “proper” binormal model: parametric ROC curve estimation with degenerate data. Academic Radiol 1997; 4: 380.
  • Metz CE, Pan X. “Proper” binormal ROC curves: theory and maximum-likelihood estimation. J Math Psych 1999; 43: 1.
  • Pesce LL, Metz CE. Reliable and computationally efficient maximum-likelihood estimation of “proper” binormal ROC curves. Acad Radiol 2007; 14:814–829.