Scheduled for Research Consortium Poster Social: Representative Research in HPERD, Wednesday, March 14, 2007, 4:30 PM - 6:00 PM, Convention Center: Exhibit Hall Poster Area I


Coder Agreement in Scoring Physical Activity Diary Data Using the Compendium

Youngsik Park, Weimo Zhu and Yong Gao, University of Illinois at Urbana-Champaign, Urbana, IL

The Compendium of Physical Activities (Ainsworth et al., 1993, 2000) has often been used to score diary and/or questionnaire data, but some critical coder-agreement issues, such as agreement among multiple coders and the impact of the coding scale/system, have not yet been addressed. The purpose of this study was to examine multiple-coder agreement in scoring physical activity data using the Compendium. Six coders were recruited for the study. After being trained to code diaries using the Compendium, they were asked to code a culled 200-case diary. This training diary was created from a large diary database (N = 22,371 cases) and had been rated by a group panel. Each coder's agreement with the panel was then computed, with a second 200-case dataset to be used if agreement fell below 80%. All six coders passed the 200-case coding training. They were then asked to code a 500-case dataset, developed from 5 subjects' diaries with 100 cases drawn randomly from each. Each case was rated by the coders, and their agreement by specific activity (called "description" in the Compendium) and by category (called "heading") was computed using Fleiss' generalized kappa (Fleiss, 1971). A total of 106 specific-activity and 15 category labels were employed in rating the diary cases. The overall agreement (Fleiss' kappa) for specific activity was .62 and, as expected, it increased to .80 for activity category. Two of the coders agreed more with each other (kappa = .90 for category) than with the others (kappa = .70-.78). Further activity-based analyses indicated that the low agreement was caused mainly by two "error" sources: (a) less popular activities, which may be less familiar to the coders, and (b) activities that are not mutually exclusive due to similar label descriptions. For example, low agreement was found for several activities appearing less than 1% of the time: music playing (kappa = 0.39) and religious activities (kappa = 0.51).
Low agreement also tended to occur when more than two activity labels could be used to classify an activity. For example, there are two very similar activity categories for cleaning: "cleaning, house or cabin, general" (kappa = .45) and "cleaning, light (dusting, straightening up, changing linen, carrying out trash)" (kappa = .18). It was concluded that satisfactory coder agreement can be achieved with appropriate training, and that coder agreement could be impacted by the characteristics of the employed coding system.
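The agreement statistic used in this study, Fleiss' generalized kappa, compares the observed per-case agreement among coders against the agreement expected by chance from the marginal category proportions. A minimal sketch of the computation (a generic implementation for illustration; the function name and toy data below are not from the study):

```python
# Fleiss' generalized kappa (Fleiss, 1971) for agreement among a fixed
# number of raters. Generic illustrative sketch, not the study's code.

def fleiss_kappa(ratings):
    """ratings: one row per case; ratings[i][j] is the number of coders
    who assigned case i to category j. Every row must sum to the same
    number of coders m."""
    n = len(ratings)        # number of cases
    m = sum(ratings[0])     # coders per case
    k = len(ratings[0])     # number of categories

    # Proportion of all assignments falling in each category.
    p = [sum(row[j] for row in ratings) / (n * m) for j in range(k)]

    # Mean per-case observed agreement.
    P_bar = sum(
        (sum(c * c for c in row) - m) / (m * (m - 1)) for row in ratings
    ) / n

    # Chance agreement from the marginal proportions.
    P_e = sum(pj * pj for pj in p)
    return (P_bar - P_e) / (1 - P_e)


# Toy example: 3 cases, 2 coders, 2 categories, perfect agreement.
print(fleiss_kappa([[2, 0], [0, 2], [2, 0]]))  # -> 1.0
```

Kappa of 1.0 indicates perfect agreement; values below chance agreement yield negative kappa.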
Keyword(s): exercise/fitness/physical activity, measurement/evaluation, technology

Back to the 2007 AAHPERD National Convention and Exposition (March 13 -- 17, 2007)