Measurement Decision Theory
Introduction
Advocated by Wald (1947), first applied to measurement by Cronbach and Gleser (1957), and now widely used in engineering, agriculture, and computing, decision theory provides a simple model for the analysis of categorical data. It is most applicable in measurement when the goal is to classify examinees into one of two categories, e.g., pass/fail or master/non-master. From pilot testing, one estimates
After the test is administered, one can compute (based on the examinee's responses and the pilot data):
This tutorial provides an overview of measurement decision theory. Key concepts are presented and illustrated using a binary classification (pass/fail) test and a sample three-item test. The Excel tool allows you to vary the results of the pilot, the examinee's response pattern, and the cost structure. Various rules for classifying an examinee are then presented along with the underlying calculations. Bayes cost was added in July 2015.

Need
Classical measurement theory and item response theory are concerned primarily with rank ordering examinees across an ability continuum. Those models are concerned, for example, with differentiating examinees at the 90th and 92nd percentiles. But one is often interested in classifying examinees into one of a finite number of discrete categories, such as pass/fail or proficient/basic/below-basic. This is a simpler outcome, and a simpler measurement model should suffice. Measurement decision theory is one such simpler tool.
Measurement decision theory requires only one key assumption: that the items are independent. Thus, the tested domain does not need to be unidimensional, examinee ability does not need to be normally distributed, and one doesn’t need to be concerned with the fit of the data to a theoretical model as in item response theory (IRT) or in most latent class models. The model is attractive as the routing mechanism for intelligent tutoring systems, for end-of-unit examinations, for adaptive testing, and as a means of quickly obtaining the classification proportions on other examinations. Very few pilot test examinees are needed and, with very few items, classification accuracy can exceed that of item response theory. Given these attractive features, it is surprising that the model has not attracted wider attention within the measurement community.
Isolated elements of decision theory have appeared sporadically in the measurement literature.
Key articles in the mastery testing literature of the 1970s employed decision theory (Hambleton and Novick, 1973; Huynh, 1976; van der Linden and Mellenbergh, 1977) and should be re-examined in light of today’s measurement problems. Lewis and Sheehan (1990) and others used decision theory to adaptively select items. Kingsbury and Weiss (1983), Reckase (1983), and Spray and Reckase (1996) have used decision theory to determine when to stop testing. Most of the research to date has applied decision theory to testlets or test batteries or as a supplement to item response theory and specific latent class models. Notable articles by Macready and Dayton (1992), Vos (1997), and Welch and Frick (1993) illustrate the less prevalent item-level application of decision theory examined in this tutorial.
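To make the item-level calculation concrete, here is a minimal Python sketch for a three-item pass/fail test. It relies on the independence assumption described above, so the likelihood of a response pattern is the product of the per-item probabilities within each mastery state. All numbers (priors, item probabilities, costs) are made up for illustration and are not taken from the tutorial's Excel tool. The function names are hypothetical. The two rules shown, maximum a posteriori and minimum expected cost, are standard decision-theoretic rules and may not match the tutorial's own list exactly.

```python
from math import prod

def likelihood(responses, p_correct):
    """P(response pattern | state), assuming items are independent within a state."""
    return prod(p if r == 1 else 1.0 - p for r, p in zip(responses, p_correct))

def posterior(responses, priors, p_correct_by_state):
    """P(state | response pattern) via Bayes' theorem."""
    joint = {s: priors[s] * likelihood(responses, p_correct_by_state[s])
             for s in priors}
    total = sum(joint.values())
    return {s: v / total for s, v in joint.items()}

def expected_costs(post, cost):
    """Expected cost of each decision d, where cost[d][s] is the cost of
    deciding d when the true state is s."""
    return {d: sum(cost[d][s] * post[s] for s in post) for d in cost}

# Hypothetical pilot estimates (illustrative values only).
priors = {"master": 0.7, "non-master": 0.3}
p_correct = {
    "master": [0.8, 0.8, 0.6],      # P(item correct | master)
    "non-master": [0.3, 0.6, 0.5],  # P(item correct | non-master)
}

# Hypothetical examinee: items 1 and 2 correct, item 3 incorrect.
responses = [1, 1, 0]

# Hypothetical cost structure: a false pass costs twice as much as a false fail.
cost = {
    "master": {"master": 0.0, "non-master": 2.0},
    "non-master": {"master": 1.0, "non-master": 0.0},
}

post = posterior(responses, priors, p_correct)
map_decision = max(post, key=post.get)      # maximum a posteriori rule
costs = expected_costs(post, cost)
min_cost_decision = min(costs, key=costs.get)  # minimum expected cost rule

print(post)
print(map_decision, min_cost_decision)
```

With these particular inputs both rules classify the examinee as a master. Making a false pass much more expensive in `cost` can change the minimum-expected-cost decision while the posterior probabilities stay the same.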