Quotes in this story have been updated.
More California students are meeting achievement targets in math and English language arts compared to last year, according to standardized test results released Wednesday.
In spring 2015, students took the new Smarter Balanced tests, which are aligned to more rigorous Common Core standards, for the first time. In spring 2016, the percentage of students who met the targets increased at every grade level and in every student subgroup, the new results show.
The much-anticipated results were met with some relief by education leaders, who were hoping that this year’s scores would be better than last year’s. Last year’s results were intended to set a baseline by which school and district performance could be measured going forward.
“The higher test scores show that the dedication, hard work and patience of California’s teachers, parents, school employees and administrators are paying off,” said Tom Torlakson, state superintendent of public instruction. “Together we are making progress towards upgrading our education system to prepare all students for careers and college in the 21st century.”
Scores fall within one of four achievement levels tied to the Common Core: “standard not met,” “standard nearly met,” “standard met” and “standard exceeded.”
Forty-nine percent of students in grades 3-8 and 11 met or exceeded standards in English language arts and literacy, up 5 percentage points from 2015. In math, 37 percent of students met or exceeded standards, up 4 percentage points from the year before. However, these results also showed that in both subject areas, more than half of students tested failed to meet those targets.
“Of course there’s more work to do, but our system has momentum,” Torlakson said. “I am confident that business, political and community leaders will join parents and educators to help continue supporting increased standards and resources for schools.”
The results also revealed wide disparities in achievement among student groups, with 62 percent of English language learners, 44 percent of African-Americans, 38 percent of low-income students and 36 percent of Hispanic students scoring in the lowest of the four achievement levels. This compared with 16 percent of white students and 11 percent of Asian students who scored in the lowest level. Still, these subgroups all saw gains of between 2 and 4 percentage points in meeting or exceeding standards, the results showed.
The tests make up the main component of the California Assessment of Student Performance and Progress system, known as CAASPP.
More than 3.2 million California students took the tests, but more than 22,300 opted out with parental consent, representing less than 1 percent of students.
Although many districts were pleased with their growth, California School Boards Association President Chris Ungar called the results “mixed” and issued a warning about “alarming” low overall results and persistent gaps in achievement.
“If we are to close the achievement gap and create a public school system that offers consistently high levels of education, we need to be focused much more intentionally on questions of equity and questions of adequacy,” he said in a prepared statement. “It goes beyond test scores – we must give districts and schools the level of resources, innovation and flexibility required to devise solutions that meet the needs of their specific student populations.”
The assessments, which are given online, require students to write short answers to questions, read passages and think critically about them, and use problem-solving skills.
The tests are adaptive, meaning they change depending on how a student answers a question. If a student answers a question correctly, the next one is more difficult. If a student answers incorrectly, the next question is easier.
In 11th grade, 59 percent of students met or exceeded the standard in English language arts, compared with 33 percent who did so in math. This represented growth of 3 percentage points in English language arts and 4 percentage points in math.
Most community colleges and the California State University system use the “standard exceeded” level to determine that students are ready for college and do not need to take remedial courses. The “standard met” level indicates that students are conditionally ready for college, but must take an approved yearlong math and/or English course their senior year and pass with a C or better. The Smarter Balanced 11th grade test replaces the state’s previous Early Assessment Program.
“These positive results are based on a new college and career readiness assessment that is online, and expects students to demonstrate critical thinking and problem solving skills unlike the old, multiple choice tests they replace,” said State Board of Education President Mike Kirst.
James Popham, a UCLA emeritus professor and a member of the National Assessment Governing Board, said the rise in scores is not surprising.
“In general, whenever you start a new test and people are getting used to what it measures,” he said, “you do see an increase in the second year.”
However, he said it was impossible to anticipate what would be regarded as a typical percentage point increase. He also said the fact that so many students failed to meet standards could be expected in the early years of the administration of a test, especially a more rigorous one like Smarter Balanced.
“This is a test measuring what many people think represents more challenging, difficult types of content,” he said, “so it would not be surprising.”
Regarding the achievement gap, Popham said it makes sense to spend money in ways that will help struggling students.
“You have a new test and you have hope that students will master what’s in the new test,” he said. “Clearly, what you need to do is put in sufficient resources, so if you do a good job on instruction, scores will in fact rise.”
The amount of time it will take for large percentages of students to meet achievement targets, he said, is directly related to the difficulty level of what they are learning.
“The more challenging and the more demanding the standards,” he said, “the longer it will take to really get them mastered.”
Torlakson suggested that several factors may have contributed to this year’s higher scores. These included another year of teaching the standards, students’ increased familiarity with the online tests, technology improvements, and the use of interim tests to gauge student progress and help prepare them for the end-of-year assessments.
However, he said many schools and districts are still transitioning to the new standards and continued persistence will contribute to ongoing improvement.
Still, some districts and charter organizations were pleased with their progress. Both the Fresno and Garden Grove districts saw students at every grade level make gains in both subject areas, with Fresno’s growing by 4 percentage points overall in English language arts and math, while Garden Grove’s scores increased by 6 percentage points in both areas.
Fresno Superintendent Michael Hanson acknowledged that his district still has a lot of work to do, with only 22 percent of students meeting or exceeding standards in math and 31 percent meeting or exceeding them in English language arts. But he was quick to celebrate the gains made by district students, especially in 3rd- and 4th-grade math, which grew by 8 and 6 percentage points, respectively.
“Our growth is good and strong,” he said. “We like it. I think people realize that these standards, if used as intended, give teachers great latitude to be creative and challenge the heck out of kids. And that comes with great pressure. So, we’re trying to come beside teachers and support them as best we know how.”
Garden Grove Superintendent Gabriela Mafi said that among districts where 65 percent or more of students receive free or reduced-price lunches, hers had the highest percentage of students meeting or exceeding standards: 55 percent in English language arts and 43 percent in math.
“Our teachers and administrators have demonstrated an unwavering commitment to implementing the state standards in a meaningful way that results in tremendous academic gains for our students,” she said. “We have placed a strategic emphasis on equipping every student with the academic and personal skills needed for lifelong success and have offered targeted support to students.”
Don 7 years ago
No reply option for Doug’s last comment below. Just want to say thanks to Doug for the explanation. Doug pretty much laid out the courses of action. I don’t know what else one can say at this point other than to note that we will have to wait and see whether the political climate can oblige the call for transparency with SBAC. It’s a good thing SBAC is not replacing the CAHSEE, as some have suggested; otherwise, a few years from now we might have to award diplomas retroactively – again.
Don 7 years ago
My understanding (and please correct me if I’m wrong, Doug?) is that due to variables associated with a new standard and new assessment, it takes approximately three years to discount these variables to arrive at a statistically valid baseline. And this era brings with it considerable technological changes that may lengthen the break-in period. I think we need to be careful when we make assumptions about changes in student achievement based only upon results from 2015 and 2016. Also of concern is the validity and reliability issue with SBAC, which calls into question the absolute results, particularly for certain subsets of data.
Doug McRae 7 years ago
Well, Don, it is feasible to transition to a new assessment system over 3 or 4 years in a way that manages many of the variables associated with new academic content standards and new technology-based test administration, and that provides validity, reliability and fairness during the transition. But that is easier said than done, and CA rejected that approach with AB 484 in 2013. So, we are where we are, and I would agree with your cautions about assumed changes in student achievement based only on Smarter Balanced results for not only 2015 and 2016 but also 2017 and perhaps 2018, until we have a stable assessment system that addresses the variables you mention.
That being said, we have the 2016 results and gain data from 2015 to 2016 released today, and we need to interpret it as best we can. My initial observations on today’s release may be found via this link, https://www.documentcloud.org/documents/3034470-DougMcRae-ca2016SmarterBalancedresults082416.html. In addition to five bullet point observations, this link includes context information for how other consortia states (PARCC as well as Smarter Balanced) have done with 2016 results and gain scores, for states that have already released their 2016 results. This context info frames how well CA has done, putting CA’s 2016 gain scores into a GPA-like metric (Total Average Gain) of 3.8 percentage points. I would characterize that as “very good,” similar to a GPA of 3.8 on a 4.0 scale (with AP credits allowing for more than 4.0). This translates to a letter grade of A minus. Further historical context is available from STAR gain scores from 2002 to 2013 using that same GPA-like metric, with results that ranged from less than 0 to a high of 4.8 and averaged about 2.25 [C+] over the 12 years of STAR gain scores. In 2003, the year STAR CST assessments first “counted” for local schools and districts, the GPA-like metric was 3.95, very much in the same ballpark as the 2016 California Smarter Balanced gain of 3.80.
There are many issues to be discussed relative to the California Smarter Balanced scores released today, including issues in the August 11 EdSource post on CA’s new statewide accountability system (aka, new school improvement system), including a link to my observations on CA and SBAC submissions to the feds for required “peer reviews.” These are more technical validity, reliability and fairness issues. But enough for today. I’d be happy to respond to any questions on the material in the link above, or other issues related to CA’s assessment and accountability systems, and provide my opinions on those issues.
Don 7 years ago
Doug, it is not typical for me to take issue with your opinions. In the case of your “very initial observations,” I am reminded of the John Scopes student who was purported to have said something to the effect of, “I agree with evolution, but not that monkey business.”
You start giving the 2016 results a grade of A-, but your subsequent points, 2-5, handicap those results by highlighting the differences in standardization between 2016 and 15, all of which differences are patently obvious despite some lack of data due to collection issues and policies. You close saying we shouldn’t assign meaning to differences between the Math and ELA scores. I’m hard pressed to assign meaning to 2016 and 2015 results either.
Regarding California’s relatively poor absolute showing versus other consortium states, albeit lacking any similar school-type comparison, I wonder to what extent the improvement is a byproduct of the statistical phenomenon whereby underperformers (and we have a lot) have more upside? I haven’t had time yet to do a comprehensive if amateurish analysis of SFUSD’s achievement gap for 2016, but some grades have seen an increase in that gap with a simultaneous rising tide across the board. This, of course, does not bode well for the Common Core, but then, I just got finished berating SBAC validity and I cannot then cite its results to prove a point.
Don 7 years ago
Regarding my previous comment, unrelated as the Scopes Monkey Trial is to this subject it was more intended as levity than anything else. Though I didn’t specifically say so, the point I wanted to make is that Doug has courageously raised concerns over the years of the shortcomings of SBAC. With the incremental (snail’s pace) changes in achievement that characterize the challenges in education, questions as to test reliability, validity and fairness might easily outpace those changes.
Doug McRae 7 years ago
Don -- Responding to your first reply 2nd/3rd paragraphs: The GPA-like grade I assigned to this year’s Smarter Balanced results (A minus) is based only on how much the results increased from 2015 to 2016 [3.8 points per my Tot Ave Gain statistic]. Overall increases are the first thing folks usually look at, and my grading scheme helps interpretation for how much of an increase is good or not-so-good since GPAs have interpretative juice for the public as well as educators and measurement specialists. The “A minus” is not intended to describe the overall quality of the large scale assessment system generating the data, nor its ability to quantify or track secondary analyses such as achievement gaps or college readiness for 11th graders. So my additional initial observations describe limitations in the data that should be taken into account when interpreting and reporting on those secondary analyses.
The bottom line is that even with the flaws in the current Smarter Balanced testing system, the data are what we have and we need to make the best interpretations we can, including acknowledging limitations. I don’t think it is productive to throw out the baby with the bath water, to ignore the entire data set because there are flaws or limitations for various interpretations.
Don 7 years ago
Doug, 4 and 5% achievement increases in math and ELA seem quite high by historical standards.
I noticed in Torlakson’s press release he didn’t mention any deficiencies with the assessments themselves. So the kind of warning you provide to readers here is not being provided by the State to the public at large, not that I’m surprised. And the average person would be hard pressed as to what to make of those deficiencies when analyzing current results. It seems that everyone is. How will we know when we have accurate results by professional assessment standards if the baseline is questioned?
Doug McRae 7 years ago
Don — To answer your question, fundamentally you have to fix the underlying problems and then recalibrate prior data in an attempt to retroactively maintain comparability of results over time back to the baseline year. Connecticut did this with their ELA test this year [see the footnote in my state-by-state comparisons document linked above] and I think succeeded for within-CT data but lost comparability across SBAC states. So, it can be done.
For small year-to-year changes in tests, CA did recalibration on a routine basis for the STAR program. As Ed Haertel used to say ad nauseam when he led the construction of a new accountability system for CA 15 years ago, “If you want to measure change, you cannot change the measure.” Adding new testing components or making smallish changes in the existing components would essentially change the measures, so recalibrations had to be done on an annual basis. However, the fixes needed to correct the larger flaws in the baseline Smarter Balanced tests, particularly the SBAC-acknowledged gaps in its initial adaptive item bank that compromised the accuracy of scores at the low end of the performance continuum (thus affecting accuracy of achievement gap analyses), and the flaws in SBAC’s achievement standards-setting (cut scores) in 2014 that particularly compromised scores at the HS level, are much larger changes that may or may not be amenable to a recalibration strategy. The flaws still have to be fixed, but it is possible that CA would lose trend lines back to 2015 with the required fixes.
The alternative options are not attractive. One is to live with the limitations due to the flaws for the next 10 or 12 years until the current versions of Smarter Balanced tests run their course and need replacement. Another is to replace Smarter Balanced altogether in the near future and go to a different set of tests. But the first step has to be to acknowledge that there are flaws in the initial SBAC system that require fixing in one fashion or the other.