Scheduled for Measurement Symposium - New Perspective and Practice in Setting Performance Standards, Tuesday, April 1, 2003, 3:00 PM - 5:00 PM, Convention Center: 113C

The Angoff Method and its Extensions for Setting Cut-Off Scores

Yong Gao and Weimo Zhu, University of Illinois at Urbana-Champaign, Urbana, IL

The Angoff method is one of the most popular methods for establishing valid, defensible performance standards. William Angoff (1971) at the Educational Testing Service introduced this item-based approach to provide an objective rationale for a test's cut-off score. The purpose of this review is to outline the basic steps and current extensions of the Angoff method. The primary steps of the method include selecting and training appropriate judges and having them rate all items in a test. Judges should be content experts who are familiar with the examinees' abilities. Judges must also be able to conceptualize the "minimally acceptable candidate" (MAC), defined as someone who adequately performs all skills and abilities necessary to hold the certification or job and requires no further training. Each judge independently rates the probability that the MAC will answer a particular item correctly. After all judges have rated every item, each judge's probabilities are summed across items, and these sums are averaged across judges to obtain the cut-off score (equivalently, the mean rating over all judges and items is multiplied by the number of items). This method allows different versions of the test to have different cut-off scores based on the difficulty of the particular version, resulting in fair and equal treatment of all test-takers. There have been many modifications to this basic approach, which address the criticisms that the judges' conceptualization of the MAC is inconsistent and that experts in one area may not be current with the entire scope of the test, both of which lead to substantial variability in the ratings. Modifications used to improve the accuracy of the method include having the judges review the items as a group and reach a consensus for each item. Other modifications include reading and considering comments submitted by test-takers, and examining and possibly eliminating ambiguous items with multiple good answers.
One of the easiest modifications is to provide the judges with normative data (e.g., item difficulties) and the mean rating of each item across all judges, and to allow them to change their initial ratings. Multiple rounds of this feedback, with opportunities to change the ratings, are used to decrease the variability of the ratings and reach a consensus. The primary benefit of this modification is that it does not require all judges to be located in the same place, allowing the number and quality of judges to increase. Overall, the Angoff method, or a modified version of it, remains very useful and popular for standard-setting.
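
As an illustration only, the cut-off score calculation and the per-item feedback given to judges between rounds might be sketched as follows. The function names, the layout of the ratings (one row per judge, one column per item), and the example values are all assumptions made for this sketch; they are not taken from Angoff (1971) or from any particular standard-setting study. The sketch uses the standard deviation across judges as one possible measure of rating variability.

```python
from statistics import mean, stdev


def angoff_cut_score(ratings):
    """Return the raw cut-off score from a judges-by-items table.

    ratings[j][i] is judge j's estimated probability (0 to 1) that the
    minimally acceptable candidate answers item i correctly.
    (Hypothetical layout chosen for this sketch.)
    """
    n_items = len(ratings[0])
    for row in ratings:
        if len(row) != n_items:
            raise ValueError("every judge must rate every item")
        if any(not 0.0 <= p <= 1.0 for p in row):
            raise ValueError("ratings must be probabilities between 0 and 1")
    # Sum each judge's ratings across items, then average over judges.
    judge_totals = [sum(row) for row in ratings]
    return mean(judge_totals)


def item_feedback(ratings):
    """Return (mean, standard deviation) across judges for each item.

    These summaries are one form of feedback that could be shown to
    judges before they revise their ratings; needs at least two judges.
    """
    return [(mean(item), stdev(item)) for item in zip(*ratings)]


# Made-up example: three judges rating a four-item test.
example_ratings = [
    [0.6, 0.8, 0.5, 0.9],
    [0.7, 0.7, 0.4, 0.8],
    [0.5, 0.9, 0.6, 1.0],
]
print(round(angoff_cut_score(example_ratings), 2))  # prints 2.8 (out of 4 items)
for item_mean, item_sd in item_feedback(example_ratings):
    print(round(item_mean, 2), round(item_sd, 2))
```

In a multiple-round procedure, the same two functions could be applied to the ratings from each round; a shrinking standard deviation per item would indicate that the judges are converging.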
