Newspaper report on "suspect" test scores elicits criticism

March 30, 2012

Officials in several California school districts contacted by EdSource have rejected allegations of “suspect” test scores in their districts contained in a national report by the Atlanta Journal-Constitution.

The newspaper said it had found “high concentrations of suspect math or reading scores in school systems from coast to coast,” including in 40 California school districts.

Cheating, the newspaper implied on the basis of a complex analysis of data, is the most likely cause. The paper argued that cheating is driven by the relentless pressures imposed on schools by the No Child Left Behind law, which each year raises the test score bar for schools and students and labels increasing numbers of schools as failing.

At the same time, the conclusions were based entirely on the newspaper’s statistical analysis, without any actual evidence that cheating has occurred in the districts it named. Moreover, the newspaper’s editors told EdSource Extra that none of the California school districts it named as having the most “suspect” scores (one-fifth of those identified nationwide) were actually contacted for possible explanations for the variations in test scores documented in the report. No reporting was done in the districts themselves.

The article began with this provocative sentence: “Suspicious test scores in roughly 200 school districts resemble those that entangled Atlanta in the biggest cheating scandal in American history.”

The third paragraph, however, said that the newspaper’s analysis “doesn’t prove cheating.” But it went on to say that “test scores in hundreds of cities followed a pattern that, in Atlanta, indicated cheating in multiple schools.”

It also said that in 196 out of the nation’s 3,125 largest districts with suspect test scores “the odds of the results occurring by chance alone were worse than one in 1,000.” In 33 of the districts, it said, the odds were “worse than one in a million.”

Despite the clear thrust of the report, linking test scores in the California districts to “the biggest cheating scandal in American history,” as well as the title of the series itself, “Cheating Our Children,” Kevin Riley, the newspaper’s editor, said in an email message:

We’ve not said that any school or any person has cheated. We’ve said the results show suspect scores that should be examined further. In the end, what happened is a question districts and states have to look into.

Officials in several districts interviewed by EdSource vehemently rejected any suggestion of cheating, describing the report’s findings variously as “outrageous,” “skewed and misleading” and “disappointing.”

Prompted by a similar Atlanta Journal-Constitution investigation in 2009, an independent state commission in Georgia last year found widespread cheating at 44 Atlanta public schools, alleging that nearly 200 teachers and principals simply erased incorrect answers and substituted correct ones.

While there have been incidents of cheating in California, nothing comparable to what occurred in Atlanta has been uncovered in the state to date.

As the newspaper explained, it used statistics to “identify unusual score jumps and drops on state math and reading tests by grade and schools.” Declines in scores in one year can indicate cheating in the previous year. (A full explanation of the paper’s methodology can be found here.)

Among those named is the Los Angeles Unified School District, by far the largest in the state. Last summer, the district shut down six charter schools after finding they had cheated on tests. Further incidents involving charter schools were reported last fall. LA Unified officials declined to comment on the Atlanta report.

Officials in other districts contacted by EdSource Extra were far less reticent.

Art Revueltas, deputy superintendent at Montebello Unified in the San Gabriel Valley, said the “methodology used … in the study led to skewed and misleading results.” He said that over 50 percent of his district’s students enter kindergarten as English learners and “when they learn English, we expect an increase in their test performance.”

The article, he said, pointed to a “culture of cheating in the Atlanta public schools which existed with administrative and teacher support. This in fact does not occur in the Montebello Unified School District.”

The newspaper looked at what percentage of students scored at a proficient level in one grade, say the 3rd grade, and what percentage did so in the 4th grade. If there was a big change, either upward or downward, those scores were flagged as suspect. “Districts which consistently have 10 percent or more of their classes flagged or which have an extremely high flag rate in a particular year certainly deserve further examination,” the article states.
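The flagging idea described above can be illustrated with a minimal sketch. It is not the newspaper’s actual model, which the paper describes as a statistical analysis: the fixed 20-point threshold, the `ClassResult` record and the function names are all assumptions made for illustration. Only the 10 percent district cutoff comes from the article.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical threshold: the article does not say how large a change
# must be before a class is flagged.
JUMP_THRESHOLD = 20.0
# From the article: districts with 10 percent or more of classes flagged.
DISTRICT_FLAG_RATE = 0.10


@dataclass
class ClassResult:
    """One class (a grade cohort at a school), followed across two years."""
    district: str
    school: str
    pct_proficient_prev: float  # e.g. percent proficient in 3rd grade
    pct_proficient_next: float  # same cohort, percent proficient in 4th grade


def is_flagged(result, threshold=JUMP_THRESHOLD):
    """Flag a class whose proficiency rate moved sharply up or down."""
    change = result.pct_proficient_next - result.pct_proficient_prev
    return abs(change) >= threshold


def district_flag_rates(results):
    """Return the share of flagged classes for each district."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for r in results:
        totals[r.district] += 1
        if is_flagged(r):
            flagged[r.district] += 1
    return {d: flagged[d] / totals[d] for d in totals}


def districts_for_review(results, min_rate=DISTRICT_FLAG_RATE):
    """List districts whose flag rate meets or exceeds the cutoff."""
    rates = district_flag_rates(results)
    return sorted(d for d, rate in rates.items() if rate >= min_rate)


if __name__ == "__main__":
    sample = [
        ClassResult("District A", "School 1", 40.0, 70.0),  # +30 points
        ClassResult("District A", "School 2", 55.0, 58.0),  # +3 points
        ClassResult("District B", "School 3", 60.0, 62.0),  # +2 points
    ]
    print(district_flag_rates(sample))
    print(districts_for_review(sample))
```

A rule like this only surfaces unusual changes. As the districts quoted here argue, a large change can have other explanations, so a flag is a starting point for questions and not evidence of cheating.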

But Revueltas said that the California Standards Tests, the standardized tests taken by millions of California students, are “not vertically aligned,” so comparing grade-to-grade scores is “irrelevant.” “Grade-to-grade scores can’t be compared,” he said.

As noted by the newspaper, the biggest gains and drops were recorded by big to medium-sized cities and rural districts. The reporters say that it is in districts like these that cheating is likely to be most endemic.

That’s because these districts have multiple subgroups, including poor students and those of different racial and ethnic backgrounds, whose performance on state tests determines whether a school is deemed to have made “adequate yearly progress” as defined by the No Child Left Behind law.

At San Rafael City Elementary Schools north of San Francisco, deputy superintendent Rebecca Rosales said that she was “aware of the allegations” in the Atlanta report and described them as “outrageous.” The Atlanta analysis found that 14 percent of classes in her district showed suspect results, the fifth highest figure in the state.

“Our districts, and probably many others, are doing exactly what we are expected to do. We’re getting our students to a proficient and advanced level, and we’re doing it the right way.

“Districts like ours are showing signs of improvement for all significant subgroups of students. When we show academic gains, and then face the allegation that cheating is involved, [that] is fundamentally unacceptable.”

“We follow all the rules,” she said. “We’ve passed all our audits with the state. It is discouraging for an out-of-state group running these statistical analyses to say we are cheating.”

At the San Ysidro Unified School District close to the U.S.-Mexico border, some 20 percent of classes were flagged in the newspaper analysis, the third highest rate of suspect scores this year in the state.

Gloria Madera, the district’s assistant superintendent for educational services, said that the district has been in “program improvement” under the No Child Left Behind law after failing to make Adequate Yearly Progress for two years in a row.

The district has been working with a District Assistance and Intervention Team (DAIT) as well as the San Diego County Office of Education to bring up its scores. Students who have scored “far below basic” and “below basic” have been targeted. “When you bring those students up, you get additional points,” she said. “The trend has been that the district has made significant gains in the past five years.”

She said that San Ysidro students are now at the same level as students in neighboring districts with the same demographics. “We just caught up,” she said.

We have worked very, very hard as a team: students, parents, teachers, administrators and support staff. Our focus has been laser-like to improve achievement. To cast a shadow on that, to say that there is anything inappropriate, it saddens me. It negates the hard work that everyone has done.

The Atlanta article cites James Wollack, an expert on test security and cheating at the University of Wisconsin at Madison, asserting that cheating is “one of the few plausible explanations for why scores would change so dramatically for so many students in a district.”

The article then goes on to state that its analysis “suggests a broad betrayal of school children across the nation” because “falsified test results deny struggling students access to extra help to which they are entitled and erode confidence in vital public institutions.”

In its analysis, the newspaper essentially ruled out “exemplary teaching” as the cause of steep test score gains.

Since publication of the story last Sunday, Gary Miron, a testing expert at Western Michigan University, has leveled major criticisms against the Atlanta report. In a post published in the Washington Post he said the fluctuations in test scores are much more likely to be the result of student mobility than cheating. “The resulting news story appears to be intended to be alarmist, implying that cheating is rampant in our schools,” Miron wrote. “The irregularities (in test scores) likely arose simply because there was a large change in the actual students taking the test from year to year.”

But a data analyst from the Atlanta Journal-Constitution responded that there is high mobility in most urban school districts. “If it were true that our methodology just flagged mobility instead of potential cheating, then you would expect all urban districts with high mobility to be flagged,” he said. That, he said, was not the case.

The dueling arguments underscore the limits of basing conclusions solely on data — and the need to examine what the data means in the context of what is happening in districts themselves.

What is also clear is that after California ended its practice in 2009 of flagging schools for excessive erasures of student responses on tests, there is no systematic oversight mechanism to detect possible cheating. An earlier practice of randomly auditing between 150 and 200 school districts per year has also fallen by the wayside due to budget constraints.

A 2004 report by the Los Angeles Times found that over a five-year period, some 200 teachers in school districts across the state had been investigated for possible cheating, and of those, 75 cases of cheating “had been proved,” according to the report. The cases it documented did not indicate anything resembling a district-wide practice such as the one uncovered in Atlanta. At the same time, pressures on school districts to meet ever higher federal “accountability” standards have increased considerably since then.

NOTE: This is an updated version of a post first published on March 30, 2012.

Comments (2)

Anne White 12 years ago
Sounds like “Stand and Deliver” all over again. A few inspiring, successful teachers who help their students meet their own personal challenges and attain their potential will certainly lead to exceptional statistical changes.
Doug McRae 12 years ago
Art Revueltas is absolutely correct that the methodology used by AJC is flawed for application to California test data. Comparing 3rd grade scale scores on CA’s STAR program from one school year to 4th grade scale scores from the next school year, as described in the AJC methodology link, assumes that the two sets of scale scores are on a common scale. For STAR, the two sets of scale scores are independently constructed, and thus to compare them as AJC has done is comparing apples and oranges. In technical terms, the STAR scale scores are not “vertically” scaled. The result is that the number of CA districts identified by the AJC analyses is overstated, perhaps substantially overstated (very hard to estimate how much). The most recent information I’ve seen is that only about half the states across the US have vertically scaled statewide testing systems. If this information is still accurate, then the AJC analyses overestimate the number of districts with “suspect scores” not only for California but half the states across the country.

Louis Freedberg