Inter- and Intrarater Reliability in a Pedagogical Content Knowledge Measurement Tool

Wednesday, March 14, 2012
Poster Areas 1 and 2 (Foyer Outside Exhibit Hall C) (Convention Center)
Yun Soo Lee1, Youngdeok Kim1, Minsoo Kang1 and Elizabeth B. Sharp2, (1)Middle Tennessee State University, Murfreesboro, TN, (2)Colorado Mesa University, Grand Junction, CO

Background/Purpose Measuring pedagogical content knowledge (PCK) is important because PCK is closely related to teacher effectiveness. A PCK measurement tool (Lee, 2011) has been introduced in the physical education (PE) area; however, the reliability of the measure has not yet been examined. Thus, the purpose of this study was to examine inter- and intra-rater reliability of the PCK measurement tool.

Method Ten PE classes were videotaped. Twenty trained raters scored the same classes twice, with a 3-day interval between scoring sessions, to examine reliability. The tool consists of ten items covering verbal and visual representations, task appropriateness, and maturity of representations. A generalizability (G) theory approach was applied to estimate the relative magnitudes of the error sources associated with raters. PE teacher (p), trial (t), and rater (r) were treated as random facets in a nested G-study design, p × (t:r), with trials nested within raters. Follow-up decision (D) studies were performed to estimate reliability coefficients (G-coefficients).
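For reference, the observed score in a p × (t:r) design is conventionally decomposed into the variance components shown below; this is the standard G-theory expression for this design, given here as background rather than quoted from the poster:

X_{pt:r} = \mu + \nu_p + \nu_r + \nu_{t:r} + \nu_{pr} + \nu_{pt:r,e}, \qquad \sigma^2(X_{pt:r}) = \sigma^2_p + \sigma^2_r + \sigma^2_{t:r} + \sigma^2_{pr} + \sigma^2_{pt:r,e}

Here \sigma^2_p reflects true differences among PE teachers, while the remaining components capture rater, trial-within-rater, teacher-by-rater, and residual sources of error.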

Analysis/Results The G-study showed that the PE teacher facet accounted for 23.4% of the total variance. The rater facet accounted for 28.2%, and the interaction between PE teachers and raters accounted for 10.2%. The variance component for trials nested within raters explained 10.8% of the variance. The D-studies indicated that at least seven raters are required to achieve the desired level of reliability (i.e., G = .80 or greater).
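For context, the relative G-coefficient estimated in such D-studies takes the standard form below, where n'_r and n'_t denote the numbers of raters and trials in the decision study; the formula is assumed from standard G-theory rather than taken from the poster:

G = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_{pr}/n'_r + \sigma^2_{pt:r,e}/(n'_r n'_t)}

Because the rater-related error terms shrink as n'_r increases, averaging scores over more raters raises G, which is why the D-studies point to seven or more raters to reach G \ge .80.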

Conclusions The results showed that a large portion of the error variance is attributable to raters. Adjusting the length and content of rater training is one potential way to improve the reliability, and thereby the practical feasibility, of the PCK measurement tool.