Measuring School Leaders' Effectiveness: Final Report from a Multiyear Pilot of Pennsylvania's Framework for Leadership

Publisher: Washington, DC: U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance, REL Mid-Atlantic
Jan 21, 2016
Moira McCullough, Stephen Lipscomb, Hanley Chiang, Brian Gill, and Irina Cheban

Key Findings:

  • Most school leaders received scores in the top two performance categories in each practice measured by the Framework for Leadership.
  • School leaders who received a higher score in one category of leadership practices tended to receive a higher score in the other categories.
  • School leaders’ scores in one year were moderately consistent (correlation coefficient of .54) with their scores in the next year.
  • Principals with larger estimated contributions to student achievement growth (value-added) scored higher overall and on multiple components and domains than principals with lower estimated contributions.
This series of reports examines the accuracy of performance ratings from the Framework for Leadership (FFL), Pennsylvania’s tool for evaluating the leadership practices of principals and assistant principals. Four key properties of the FFL were analyzed: score variation, internal consistency, year-to-year stability, and concurrent validity. Score variation was characterized by the percentages of school leaders earning scores in different portions of the rating scale. To measure the internal consistency of the FFL, Cronbach’s alpha was calculated for the full FFL and for each of its four categories of leadership practices. Analyses of score stability used the FFL scores of school leaders rated in two consecutive years to calculate Pearson’s correlation coefficient. Concurrent validity was assessed through a regression model for the relationship between school leaders’ estimated contributions to student achievement growth and their FFL scores. The first report examined data from the 2012/13 pilot year; the second report is based primarily on the 2013/14 pilot, in which 517 principals and 123 assistant principals were rated by their supervisors. As a whole, the results indicate that the FFL is a reliable measure, with good internal consistency and a moderate level of year-to-year stability in scores. There is also evidence of the FFL’s concurrent validity: principals with higher scores on the FFL, on average, make larger estimated contributions to student achievement growth. Higher total FFL scores and scores in two of the four FFL domains are significantly or marginally significantly associated with both value-added in all subjects combined and value-added in math specifically. This evidence of the validity of the FFL sets it apart from other principal evaluation tools: No other measures of principals’ professional practice have been shown to be related to principals’ effects on student achievement.
However, in both pilot years, variation in scores was limited, with most school leaders scoring in the upper third of the rating scale. As the FFL is implemented statewide, continued examination of evidence on its statistical properties, especially the variation in scores, is important.
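
For readers unfamiliar with the two reliability statistics named above, the following is a minimal Python sketch of how Cronbach’s alpha and Pearson’s correlation coefficient are conventionally computed. It is not the report’s code: the function names, the data layout (one row of component scores per school leader), and any inputs passed to it are illustrative assumptions, and the report’s actual procedures may differ in detail.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of rows, one row of k item scores per leader.

    Uses the standard formula:
        alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    The row-per-leader layout is an assumption made for illustration.
    """
    k = len(rows[0])
    # Sample variance of each item (column) across leaders.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each leader's summed score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson_r(x, y):
    """Pearson's correlation between two equal-length score lists,
    e.g. hypothetical year-1 and year-2 scores for the same leaders."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

In this sketch, alpha approaches 1 when the component scores move together across leaders, and a value of pearson_r near .54 would correspond to the moderate year-to-year consistency described in the key findings.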