Scheduled for Computerized Adaptive Testing, Friday, March 16, 2007, 10:15 AM - 12:15 PM, Convention Center: 328


Principle and Practice of Computerized Adaptive Testing: An Overview

Weimo Zhu, University of Illinois at Urbana-Champaign, Urbana, IL

An adaptive testing strategy resembles a high-jump competition, in which the height that competitors attempt depends on their ability. In adaptive testing, the first question (item or task) is pitched at the middle of the prospective ability range. If it is answered correctly, the next question is more difficult; if it is answered incorrectly, the next question is easier. This continues until the examinee's proficiency has been established within a predetermined level of accuracy. With computerization, adaptive testing is conducted interactively: the examinee's ability (proficiency) is re-estimated after each item is completed, and the next item is selected so that its difficulty matches that estimate more closely. The final test score can be reported immediately after the examination. The architecture of a computerized adaptive testing (CAT) system usually has four components: a calibrated item pool, a test algorithm, a delivery system, and score reporting. The calibrated item pool, also known as an item bank, includes a large number of items with known psychometric characteristics (e.g., difficulty and discrimination) and content representations. Item characteristics are usually determined with an item response theory model. A test algorithm typically involves three decisions: (a) where to start the test, (b) how to continue, and (c) when to end. In the past, standalone computers and local networks served as the delivery system, but there is a fast-growing trend toward delivering CAT over the internet. Finally, score reporting involves both the scoring mechanism and the reporting of results to examinees. A number of commercial CAT software packages have been developed, and a few are being extended to internet- and PDA-based CAT applications.
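The three decisions of the test algorithm (where to start, how to continue, when to end) can be sketched in a few lines of Python. This is only an illustrative sketch under assumptions that the abstract does not specify: a one-parameter (Rasch) item response model, a standard normal prior with expected a posteriori (EAP) ability estimation on a fixed grid, selection of the unused item whose difficulty is closest to the current estimate, and a stopping rule based on the posterior standard deviation. The names `run_cat`, `estimate_ability`, `answer_item`, `se_target`, and `max_items` are hypothetical.

```python
import math

# Ability grid for numerical integration (assumed range: -4 to +4 logits).
GRID = [-4.0 + 0.1 * k for k in range(81)]


def p_correct(theta, b):
    """Rasch (1PL) probability that ability theta answers difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def estimate_ability(responses):
    """Return (EAP estimate, posterior SD) for a list of (difficulty, correct) pairs.

    Assumes a standard normal prior on ability; this keeps the estimate
    finite even when every response so far is correct (or incorrect).
    """
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)  # unnormalized N(0, 1) prior
        for b, correct in responses:
            p = p_correct(theta, b)
            w *= p if correct else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)


def run_cat(item_pool, answer_item, se_target=0.4, max_items=30):
    """Administer an adaptive test over a pool of calibrated item difficulties.

    item_pool   -- list of item difficulties (the calibrated item bank)
    answer_item -- callable taking a difficulty, returning True if answered correctly
    """
    theta, se = 0.0, float("inf")  # (a) start in the middle of the ability range
    remaining = list(item_pool)
    responses = []
    # (c) end when precision is good enough, the item limit is hit, or the pool is empty
    while remaining and len(responses) < max_items and se > se_target:
        # (b) continue with the unused item closest in difficulty to the estimate
        b = min(remaining, key=lambda d: abs(d - theta))
        remaining.remove(b)
        responses.append((b, answer_item(b)))
        theta, se = estimate_ability(responses)
    return theta, se, responses
```

For example, with a pool such as `[(k - 30) / 10 for k in range(61)]` and a simulated examinee `lambda b: b < 1.0`, the first item administered has difficulty 0.0, and each later item is harder or easier depending on how the running estimate moves. An operational CAT system would typically add content balancing and item exposure control, which this sketch omits.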
Compared to traditional paper-and-pencil linear testing, CAT has three major advantages: (a) efficiency (e.g., a reduction in test length of 50% or more can be achieved while maintaining the same accuracy); (b) a broad range of measurement with better local precision (i.e., because the item pool must contain items at every level, items can be selected to match each examinee's level); and (c) increased security, because each examinee is exposed to only a limited portion of the test items. The major limitations of CAT are its high technical requirements and the expensive resources it demands. Except for a few attempts to explore CAT and its applications (e.g., Zhu, 1992; Rimmer et al., 2004), CAT has been largely ignored in the field of kinesiology and physical education. A lack of training, technical support, resources, and successful examples may be the major reasons. Efforts to overcome these barriers are urgently needed.
Keyword(s): assessment, exercise/fitness/physical activity, measurement/evaluation

2007 AAHPERD National Convention and Exposition (March 13-17, 2007)