A University of Southern California professor has collected dozens of academics’ signatures on a letter to U.S. Secretary of Education John King criticizing how the federal government proposes to measure student scores on standardized tests. California’s top state education officials agree with him and may express the same point of view in a letter they’re drafting.
Morgan Polikoff says that the proposed regulations would continue the same flawed methodology used under the federal No Child Left Behind law. An associate professor at USC’s Rossier School of Education, he suggests an alternative approach consistent with the direction the California State Board of Education is taking.
At issue is how to measure achievement in standardized tests in math and English language arts under the federal Every Student Succeeds Act, the successor law to NCLB that Congress passed in December.
The proposed standard would be the number of students who score proficient in math and English language arts. Under NCLB, schools were required to gradually approach the target of 100 percent proficiency, which nearly all schools failed to achieve, leading former Secretary of Education Arne Duncan to grant most states waivers from the law’s penalties.
ESSA would not impose a hard-and-fast improvement target; states would be required to intervene in the lowest-performing 5 percent of schools. But Polikoff and the others who signed the letter argue that using the percentage of students who score proficient each year remains problematic. They say it creates incentives for schools to focus primarily on students who are near proficiency rather than on all students in a school, and to devote less attention and fewer resources to students who are far below proficient or who could advance to well above that level. The proficiency target also penalizes schools serving large numbers of low-performing students, since they get no credit for significant improvements that still fall short of proficiency, Polikoff argues.
Polikoff suggests using the average student score per grade in a school, along with average scores for racial, ethnic and socioeconomic subgroups of students as the measure of performance. That way, all students’ scores would be included, not just those who score proficient.
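The difference between the two measures can be illustrated with a minimal sketch. The scores, the two-year comparison and the cut score of 2,500 below are all hypothetical values chosen for illustration; they are not actual Smarter Balanced thresholds or results, and the function names are not drawn from any official methodology.

```python
from statistics import mean

# Hypothetical cut score for "proficient"; NOT an actual Smarter Balanced threshold.
PROFICIENT_CUT = 2500


def percent_proficient(scores, cut=PROFICIENT_CUT):
    """Percentage of students scoring at or above the cut score."""
    return 100 * sum(s >= cut for s in scores) / len(scores)


def average_score(scores):
    """Mean scale score across all students."""
    return mean(scores)


# Made-up scores for the same five students in two consecutive years.
# The three lowest scorers each gain 80 points but stay below the cut.
year1 = [2300, 2350, 2400, 2520, 2600]
year2 = [2380, 2430, 2480, 2520, 2600]

print(percent_proficient(year1), percent_proficient(year2))
print(average_score(year1), average_score(year2))
```

In this made-up case the percentage of proficient students is identical in both years, while the average score rises because the gains of students below the cut are counted.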
The state board voted in May to combine both a school’s and district’s annual scores and its growth in scores over several years as the basis for a new indicator of academic achievement. The state wants to move away from a single point designating minimum proficiency as the sole standard, David Sapp, deputy policy director of the state board, said this week.
Under the Smarter Balanced assessments that students in California and 14 other states took this year, students receive a score ranging from 2,000 to 3,000. A student’s score falls within one of four achievement levels: standard not met, standard nearly met, standard met (which the federal government considers the equivalent of proficiency), and standard exceeded.
Last fall, the state Department of Education released the first-year Smarter Balanced test results by percentage of students in each achievement level for every school and district. Parents received a report with their child’s individual score on the scale. The Department of Education is waiting for the results of this year’s testing so that it can do a first-year report on growth in scores.
Polikoff said that an alternative to using students’ average scores would be to report results by achievement levels.
Sapp said that State Board of Education President Michael Kirst and State Superintendent of Public Instruction Tom Torlakson plan to submit a letter to King on the proposed regulations later this month, and that it may take the same view Polikoff expressed.
More than 40 researchers and others in the education field have signed Polikoff’s letter. They include Linda Darling-Hammond, president of the Learning Policy Institute and an emeritus education professor at Stanford University, and Heather Hough, the executive director of a research partnership between the California nonprofit PACE and the six California districts, known as CORE, which have developed their own school improvement index under a federal waiver from NCLB.
Doug McRae 7 years ago
The “percent proficient and above” calculation has been the routine aggregate data measure for standards-based test results across the country for the past 15 or so years, primarily because it is simple and user-friendly for parents and the public (and for the media) compared to the alternatives suggested in Polikoff’s letter. The negatives for the percent proficient choice have been well known to testing specialists for many years, but the negatives for Polikoff’s suggestions [using average or “mean” scale scores, or an index approach not unlike CA’s API] have been widely cited as not user-friendly. Polikoff makes a legitimate point in his advocacy letter, but there are pros and cons for his alternatives.
Overlooked by the media thus far has been the line in the post that the CA Dept Educ “is waiting for the results of this year’s testing (i.e., 2016 scores) so that it can do a first-year report on growth in scores.” The State Board agenda materials for their July meeting released July 1 (including a June 28 Info Memo also released July 1) and the CDE presentation at the July 13 board meeting [Item # 2, Powerpoints 30-32] indicate that CDE will not be posting 2-yr growth scores for aggregate Smarter Balanced results to be used for their proposed new CA accountability system until technical specialists review the technical characteristics of 2-yr change scores for Smarter Balanced results. The materials indicate it may be necessary to wait for 3-years of Smarter Balanced results before technically sound growth or change scores for aggregate data can be reported. Given the promises to date for use of Smarter Balanced metrics for growth or change scores, this techy information buried in the July SBE agenda and presentation materials is more newsworthy than the concerns expressed in the Polikoff letter.
navigio 7 years ago
Agree on the first paragraph, not on the second.
Whether we wait for a third year before we have some ‘meta-metric’ to talk about is probably less critical than whether the metric is of any value for the decade or more it will likely be used for. There will be plenty of ‘private’ analyses done after second year data is released anyway. And there is no guarantee what we would get now by not waiting would be all that great either.
Although an average value also has its problems, the proficiency threshold is one of the worst ways to make comparisons between schools, especially when the distribution of involved schools is centered to one or the other side of that cutoff. It can also lead to an exacerbation of the true ‘achievement gap’. Furthermore, the sbac score range is (ostensibly) explicitly designed to offer a much more granular level of measurement, even one that transcends grade levels (to which proficiency thresholds must obviously be tied). It would be nice if such a discussion could be put into the public domain.
Doug McRae 7 years ago
Navigio: Reasonable folks can disagree, especially when it comes to speculations on future implications for current policy options.
On the proposed federal proficiency metric vs Polikoff’s average measure, yup the distribution of scores for the school or district or subgroup is the important factor determining whether one metric is better than the other. But you can find distribution patterns that violate good interpretations for both metrics, so there isn’t a clear choice that satisfies all distribution patterns. The SBAC score range should yield a more granular level of measurement, due to its computer-adaptive feature, but the tech information on actual CA implementation of SBAC in 2015 shows a very uniform administration of 39-40 ELA CAT items and 34-35 Math CAT items to the vast majority of students at all grade levels. That result calls into question the effectiveness of the SBAC computer-adaptive algorithm; at the least, it did not perform in a typical computer-adaptive manner, and that calls into question the granularity of the 2015 SBAC measurement for CA. I’d certainly agree discussions like this should be in the public domain, given that the entire CA SBAC exercise involves taxpayer dollars.
On waiting for a 3rd year of data for a growth or change metric for aggregate data, I’ll agree there likely will be “private analyses” done after the 2nd year of data is released, but to have those analyses done before the professionals charged with recommending how they should be done will yield a pot full of varied interpretations in the trenches and that is not a good thing. At the least, there should be an informed recommendation for how to conduct change analyses based on the reality of the change data before the new data system has a chance of generating consistent interpretations across multiple schools and districts and subgroups. Data chaos is not a positive outcome. I’m on the side of getting it right rather than getting it fast; there is a lot of value in the old saw that a-stitch-in-time-saves-nine.