Scheduled for Fairness and Optimization in Sports: Can We Do Better?, Wednesday, April 13, 2005, 3:15 PM - 5:15 PM, Convention Center: E270


Evaluating Judges’ Scoring Patterns in Sport: Observed Versus Expected Scores

Marilyn A. Looney, Northern Illinois University, DeKalb, IL

Judges’ scores are used in many sports, such as gymnastics, diving, ski jumping, and figure skating, to determine the winner of a competition. Using a panel of adequately trained and experienced judges, however, does not guarantee that the judges will score performances without bias. No panel of judges can ever be trained to agree perfectly on all observed performances, but each judge can be trained to become more internally consistent in using the scoring system (Linacre, 1998). Based on this principle, each judge should be evaluated to see whether he or she has consistently applied his or her interpretation of the scoring system across all competitors and across each aspect of performance evaluated (Stahl & Lunz, 1996). The purpose of this presentation is to illustrate how results from a Rasch analysis can be used to provide in-depth feedback to sport governing bodies and judges about their scoring patterns. Nine judges’ scores for 20 pairs of figure skaters at the 2002 Winter Olympics were analyzed using a four-facet Rasch rating scale model (skater pair ability, skating aspect difficulty, program difficulty, and judge severity) in which the rating scale was not common to all judges. Because the data fit the expectations of the Rasch measurement model well, the pattern of unexpected responses may be explained by one or more of the following: (a) a skating order effect, (b) bias (e.g., nationalistic), and (c) poor internal consistency in using the scoring system. Feedback for a subset of judges will be presented.
This feedback includes a detailed description of how the rating scale was used (e.g., the American judge had the most erratic scoring pattern of all of the judges, while the French judge’s marks were more predictable than the model expected), the percentage of all marks that were unexpected by the model (|Z| > 2), and three figures illustrating differences between each judge’s observed and expected marks, arranged according to the pairs’ skating order and final placement in the competition. Scores that may represent nationalistic bias or a skating order effect were flagged. Fairness in sport may be upheld if sport governing bodies hold each judge accountable for consistently applying his or her interpretation of the scoring system across all competitors and across each aspect of performance evaluated.
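As a rough illustration of the observed-versus-expected comparison described above, the following Python sketch computes a standardized residual for a single mark under a rating scale model and the percentage of marks with |Z| > 2. It is not the analysis used in the study: the function names, the integer coding of rating categories (0, 1, 2, ...), the sign convention (ability minus aspect difficulty minus program difficulty minus judge severity minus step threshold), and all numeric inputs are assumptions made for illustration only.

```python
import math


def category_probabilities(ability, aspect_difficulty, program_difficulty,
                           judge_severity, thresholds):
    """Probability of each rating category (0..len(thresholds)) for one mark.

    Assumed convention: the log-odds of category k versus k-1 equal
    ability - aspect_difficulty - program_difficulty - judge_severity
    - thresholds[k-1]. The thresholds may differ from judge to judge.
    """
    eta = ability - aspect_difficulty - program_difficulty - judge_severity
    log_numerators = [0.0]  # category 0 is the reference category
    for tau in thresholds:
        log_numerators.append(log_numerators[-1] + eta - tau)
    largest = max(log_numerators)  # subtract the maximum for numerical stability
    weights = [math.exp(v - largest) for v in log_numerators]
    total = sum(weights)
    return [w / total for w in weights]


def standardized_residual(observed, probs):
    """(observed - expected) / model standard deviation for one mark."""
    expected = sum(k * p for k, p in enumerate(probs))
    variance = sum((k - expected) ** 2 * p for k, p in enumerate(probs))
    return (observed - expected) / math.sqrt(variance)


def percent_unexpected(z_values, cutoff=2.0):
    """Percentage of marks whose standardized residual exceeds the cutoff."""
    flagged = sum(1 for z in z_values if abs(z) > cutoff)
    return 100.0 * flagged / len(z_values)


# Hypothetical values, not estimates from the 2002 Olympic data.
probs = category_probabilities(ability=1.2, aspect_difficulty=0.3,
                               program_difficulty=0.1, judge_severity=0.4,
                               thresholds=[-1.0, 0.0, 1.0])
z = standardized_residual(observed=0, probs=probs)
print(f"expected-score residual z = {z:.2f}")
```

A mark would be flagged for review when the absolute value of z exceeds 2. Listing the flagged marks by skating order or by the skaters' country would be one way to look for the patterns the abstract describes.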
Keyword(s): assessment, measurement/evaluation, Olympic related

Back to the 2005 AAHPERD National Convention and Exposition