A Measurement Decision Theory Tutorial

Measurement Decision Theory
Lawrence M. Rudner
Independent Consultant

Introduction

Advocated by Wald (1947), first applied to measurement by Cronbach and Gleser (1957), and now widely used in engineering, agriculture, and computing, decision theory provides a simple model for the analysis of categorical data. It is most applicable in measurement when the goal is to classify examinees into one of two categories, e.g. pass/fail or master/non-master.

From pilot testing, one estimates:

The proportion of masters and non-masters in the population, and

The conditional probabilities of examinees in each mastery state responding correctly to each item.

After the test is administered, one can compute (based on the examinee's responses and the pilot data):

The likelihood of the examinee's response pattern for masters and for non-masters, and

The probability that the examinee is a master and the probability that the examinee is a non-master.
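
The two post-test quantities above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the priors, the item probabilities, the response pattern, and the function names `likelihood` and `posterior` are all hypothetical and are not taken from the tutorial or its Excel tool. The sketch also assumes that item responses are independent within a mastery state.

```python
from math import prod

# Hypothetical pilot estimates (illustrative values only).
# Proportion of masters and non-masters in the population:
priors = {"master": 0.7, "non-master": 0.3}
# Probability of a correct response to each of three items, by mastery state:
p_correct = {
    "master": [0.8, 0.9, 0.6],
    "non-master": [0.3, 0.6, 0.5],
}


def likelihood(responses, p_item):
    """Probability of a response pattern (1 = right, 0 = wrong) for one
    mastery state, assuming responses are independent within that state."""
    return prod(p if r == 1 else 1.0 - p for r, p in zip(responses, p_item))


def posterior(responses, priors, p_correct):
    """Probability of each mastery state given the responses (Bayes' theorem)."""
    joint = {s: priors[s] * likelihood(responses, p_correct[s]) for s in priors}
    total = sum(joint.values())
    return {s: value / total for s, value in joint.items()}


# Hypothetical examinee: items 1 and 2 right, item 3 wrong.
responses = [1, 1, 0]
likes = {s: likelihood(responses, p_correct[s]) for s in priors}
post = posterior(responses, priors, p_correct)

print(likes)
print(post)
```

With these made-up numbers, the response pattern has likelihood 0.8 × 0.9 × 0.4 = 0.288 for masters and 0.3 × 0.6 × 0.5 = 0.09 for non-masters, and the probability that the examinee is a master works out to roughly 0.88.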

This tutorial provides an overview of measurement decision theory. Key concepts are presented and illustrated with a sample three-item test that makes a binary (pass/fail) classification.

The Excel tool allows you to vary the results of the pilot, the examinee's response pattern, and the cost structure. Various rules for classifying an examinee are then presented along with the underlying calculations. Bayes cost was added in July 2015.

Need

Classical measurement theory and item response theory are concerned primarily with rank ordering examinees across an ability continuum. Those models are concerned, for example, with differentiating examinees at the 90th and 92nd percentiles. But one is often interested in classifying examinees into one of a finite number of discrete categories, such as pass/fail or proficient/basic/below-basic. This is a simpler outcome, and a simpler measurement model should suffice. Measurement decision theory is one such simpler tool.

Measurement decision theory requires only one key assumption: that the items are independent, meaning that an examinee's responses to different items are independent once the examinee's mastery state is given. Thus, the tested domain does not need to be unidimensional, examinee ability does not need to be normally distributed, and one doesn’t need to be concerned with the fit of the data to a theoretical model as in item response theory (IRT) or in most latent class models. The model is attractive as the routing mechanism for intelligent tutoring systems, for end-of-unit examinations, for adaptive testing, and as a means of quickly obtaining the classification proportions on other examinations. Very few pilot test examinees are needed and, with very few items, classification accuracy can exceed that of item response theory. Given these attractive features, it is surprising that the model has not attracted wider attention within the measurement community.
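
Written out, the independence assumption amounts to a factorization of the likelihood of a response pattern. The notation below is an assumption made for illustration and is not taken from the text above: $\mathbf{z} = (z_1, \ldots, z_N)$ is an examinee's vector of responses to $N$ items, and $m_k$ is one of the mastery states. Independence is read here as independence of the responses within a mastery state.

```latex
P(\mathbf{z} \mid m_k) \;=\; \prod_{i=1}^{N} P(z_i \mid m_k)
```

Each factor on the right is one of the conditional probabilities estimated from the pilot test, so nothing beyond the pilot estimates is needed to evaluate the left-hand side.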

Isolated elements of decision theory have appeared sporadically in the measurement literature. Key articles in the mastery testing literature of the 1970s employed decision theory (Hambleton and Novick, 1973; Huynh, 1976; van der Linden and Mellenbergh, 1977) and should be re-examined in light of today’s measurement problems. Lewis and Sheehan (1990) and others used decision theory to adaptively select items. Kingsbury and Weiss (1983), Reckase (1983), and Spray and Reckase (1996) have used decision theory to determine when to stop testing. Most of the research to date has applied decision theory to testlets or test batteries or as a supplement to item response theory and specific latent class models. Notable articles by Macready and Dayton (1992), Vos (1997), and Welch and Frick (1993) illustrate the less prevalent item-level application of decision theory examined in this tutorial.