Scheduled for Free Communication: Measurement Research in Physical Activity and Physical Education, Thursday, April 2, 2009, 8:45 AM - 10:00 AM, Tampa Convention Center: 7-8


PE Metrics Standard 1, Elementary: Development of an Assessment Item/Task Bank

Judith Placek (1), Connie Fox (2), Weimo Zhu (3), Kim C. Graber (4), Jennifer L. Fisette (5), Marybell Avery (6), Marian Franck (7), Ben Dyson (8) and Youngsik Park (4); (1) University of Massachusetts, retired, Holliston, MA, (2) Northern Illinois University, DeKalb, IL, (3) University of Illinois at Urbana-Champaign, Urbana, IL, (4) University of Illinois at Urbana-Champaign, Urbana, IL, (5) Kent State University, Kent, OH, (6) Lincoln Public Schools, Lincoln, NE, (7) National Association for Sport and Physical Education, Reston, VA, (8) University of Memphis, Memphis, TN

An item bank, in which a set of items is calibrated on the same scale, is a modern test construction practice with several measurement advantages (e.g., assessment is invariant to the items selected). Except for FitSmart (Zhu et al., 1999), the field of physical education (PE) has not taken advantage of this practice. To assess Elementary Standard 1 of the National Standards for PE (NASPE, 2004), NASPE developed an item/task bank called “PE Metrics.”

Purpose

This study reports the technical details of the development and calibration of the bank.

Methods

A total of 30 tasks and related scoring rubrics were developed for Kindergarten (K), Grade 2 (G2), and Grade 5 (G5):

K – Underhand Catching, Dribble with Hand (C = Common Task), Hopping (C), Running, Sliding, Striking, Underhand Throw, Weight Transfer

G2 – Approach & Kick a Ball, Dance Sequence, Dribble with Jog (C), Galloping, Gymnastics Sequence, Jumping & Landing Combination, Jump Forward (C), Locomotor Sequence, Overhand Catching, Skipping, Striking with Paddle

G5 – Basketball (Dribble, Pass, and Receive; Defense; Offense), Dance, Floor Hockey, Gymnastics, Inline Skating, Overhand Throwing, Soccer (Dribble, Pass, and Receive (C); Offense), Striking Ball with Paddle (C)

After several pilots and revisions, the tasks were administered to a national sample of students (N = 4,956; 2,501 males and 2,385 females; 57 schools; K = 1,488, G2 = 1,907, and G5 = 1,563). The common items, which link all items onto the same scale, were administered to all students in the same grade, whereas non-common items were administered only to selected subsamples.
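A common-item design of this kind can be sketched as follows. This is only an illustration: the abstract does not say how subsamples were selected, so the round-robin split, the function name `assign_tasks`, and the `n_forms` parameter are all assumptions. The task names are the Kindergarten tasks listed above.

```python
def assign_tasks(student_ids, common_tasks, non_common_tasks, n_forms):
    """Give every student the common tasks plus one block of non-common tasks."""
    # Assumption: non-common tasks are split into n_forms blocks by round-robin.
    blocks = [non_common_tasks[i::n_forms] for i in range(n_forms)]
    assignment = {}
    for index, student in enumerate(student_ids):
        # Assumption: students rotate through the blocks in order.
        block = blocks[index % n_forms]
        assignment[student] = list(common_tasks) + block
    return assignment


# Kindergarten tasks from the abstract; (C) tasks are the common ones.
k_common = ["Dribble with Hand", "Hopping"]
k_other = ["Underhand Catching", "Running", "Sliding",
           "Striking", "Underhand Throw", "Weight Transfer"]

# Six hypothetical student IDs, three hypothetical forms.
plan = assign_tasks(range(6), k_common, k_other, n_forms=3)
```

Because every student takes the common tasks, responses to the different non-common blocks can be placed on one scale through them.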

Analysis/Results

The collected data were screened using descriptive, item analysis, and outlier statistics. The cleaned data were analyzed with the Rasch rating scale model (Wright & Masters, 1982) using FACETS, a Rasch analysis program. Model-data fit was evaluated using Infit and Outfit statistics (acceptable range: .7 to 1.3), and task categorization was examined using related statistics (e.g., average measures). The tasks fit the model well according to the Infit and Outfit statistics. Assessment task difficulties were well spread across a wide range; e.g., the most difficult tasks in K were Dribble with Hand and Striking (logits = .73), and the easiest was Underhand Catching (-1.12). Scoring rubric difficulties ranged from -1.44 to 1.16. Categorization statistics indicated that most of the tasks were well developed, with good discrimination.
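The fit check described above can be illustrated with a minimal sketch. It assumes a two-facet rating scale model (student ability `theta`, task difficulty `delta`, shared category thresholds) and is not the FACETS implementation or the study's actual analysis. All function names and the example values used with them are hypothetical. Only the .7 to 1.3 range comes from the abstract.

```python
import math


def category_probabilities(theta, delta, thresholds):
    """Rating scale model probabilities for scores 0..len(thresholds)."""
    # Cumulative sums of (theta - delta - tau_j); score 0 has an empty sum of 0.
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + (theta - delta - tau))
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def expected_and_variance(probs):
    """Model-expected score and its variance for one student-task encounter."""
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, var


def fit_mean_squares(observed, thetas, delta, thresholds):
    """Infit and Outfit mean squares for one task across students."""
    squared_residuals = 0.0
    variances = 0.0
    standardized = 0.0
    for x, theta in zip(observed, thetas):
        probs = category_probabilities(theta, delta, thresholds)
        mean, var = expected_and_variance(probs)
        sq = (x - mean) ** 2
        squared_residuals += sq
        variances += var
        standardized += sq / var
    infit = squared_residuals / variances   # information-weighted mean square
    outfit = standardized / len(observed)   # unweighted mean square
    return infit, outfit


def within_fit_range(value, low=0.7, high=1.3):
    """True if a mean square falls inside the range used in the abstract."""
    return low <= value <= high
```

Under these assumptions, a task would be flagged for review when either mean square falls outside the range, for example `within_fit_range(infit) and within_fit_range(outfit)` returning `False`.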

Conclusions

PE Metrics is ready to measure students' achievement and sets an excellent example for future test construction in PE.


Keyword(s): assessment, motor skills, physical education PK-12

2009 AAHPERD National Convention and Exposition (March 31 - April 4, 2009)