Students will take computerized field tests aligned to Common Core standards in math and English next year, state officials announced. Credit: EdSource file photo

Faced with potentially tens of millions of dollars in fines, the state Department of Education has headed off a confrontation with the federal government over standardized testing.**

Superintendent of Public Instruction Tom Torlakson announced Thursday that he would require school districts to offer the Common Core practice tests, created by the Smarter Balanced states’ consortium, in both math and English language arts next spring. A new law changing the state’s standardized testing program, Assembly Bill 484, which Torlakson and Gov. Jerry Brown supported and which sparked a dispute with the federal government, required only that students be given one of the assessments, although it didn’t explicitly prevent Torlakson from offering both tests.

The state’s one-test policy was at odds with long-standing federal law, which requires that all students in grades 3 to 8 and grade 11 be tested annually in both subjects. It prompted an assistant secretary of the U.S. Department of Education to warn state officials last month that the feds might withhold $45 million from the state Department of Education, plus potentially larger amounts in federal Title I dollars for low-income and special education students.

Torlakson’s carefully worded news release makes no mention of the conflict with the federal government or of a concern over districts’ capacity to administer computer tests in both subjects next spring. Deputy State Superintendent Deb Sigman had repeatedly stated over the past month that districts would get as much benefit from offering one field test as from offering both. And she said that the state was worried about overloading districts as they move from state tests, using paper and pencil, to computer-administered Common Core assessments.

“This move to up-to-date new assessments marks a major step forward in California’s work to ensure that every student graduates equipped to succeed in college and careers,” Torlakson said. “These field tests simply make good sense, and expanding them to include both subjects for most students makes even better sense – in contrast to ‘double testing’ students, which makes little sense at all.”

There’s no guarantee that the state’s revised policy will satisfy the feds. U.S. Secretary of Education Arne Duncan said that he would consider allowing states to substitute the Common Core field test in spring 2014 for the annual tests of state standards for at least some students. But states would have to apply for a waiver from double testing students to do this. The deadline to apply is Friday – Nov. 22 – and State Board of Education President Michael Kirst and Torlakson submitted the waiver request and letter to U.S. Assistant Secretary Deborah Delisle on Thursday.

In an interview, Chief Deputy State Superintendent Richard Zeiger said that the state will offer a shorter form of both the math and English language arts field tests that together take 3½ hours – no longer than the full field test in either subject. As a result, it should impose no further burdens on school districts’ capacity or time. And it will be done within the existing budget, he said.

“We managed to strike the appropriate balance here,” he said.

Zeiger said that the state doesn’t know if the shorter tests will satisfy the federal government’s requirements, but he hopes that they will. The state did not negotiate its proposal with federal officials, he said.

A few states have indicated that they would seek a statewide waiver for all students. But most plan to give one of the Common Core field tests to 10 to 20 percent of their students and to continue giving some form of state tests to the rest.

Torlakson and the State Board of Education took the position that the transition to the Common Core standards required a clean break from tests under the old state standards, so that teachers and districts could concentrate on preparing for the new assessments and Common Core. Consistent with that, AB 484 terminated nearly all California Standards Tests, effective Jan. 1. Official Common Core tests will debut in the spring of 2015.

A field test is essentially a test of a test – a method to screen questions and determine their level of difficulty, a necessary step before rolling out the official assessment. Smarter Balanced will not release the results, because a field test cannot produce valid scores for measuring individual student or school achievement.

Organizations representing teachers, administrators and school boards supported AB 484 and Torlakson’s one-practice-test proposal. Torlakson and state officials were miffed when the eight districts that formed the California Office to Reform Education called on Duncan to demand that the state give districts both the math and English language arts tests.

In a statement on behalf of CORE on Thursday, Fresno Unified Superintendent Michael Hanson said, “We applaud Superintendent Torlakson and President Kirst’s announcement as we have consistently advocated throughout this process for ensuring all youth have access to the field test. More than anything else, schools and districts are hungry for information as we undertake this unprecedented implementation.”

**This story has been updated.

John Fensterwald covers state education policy. Contact him or follow him on Twitter @jfenster. Sign up here for a no-cost online subscription to EdSource Today for reports from the largest education reporting team in California.

Comments (15)

  1. Tara Kini 10 years ago

    On a related note, see this blog post below discussing the irony in Duncan’s threatened punishment of California over AB 484 given the Obama Administration’s total backtracking on teacher equity. “The harm to California’s low-income students of a short gap without a state standardized test score is dwarfed by the life-long effects that millions of low-income and minority students nationwide will experience as a result of the Department’s failure to monitor and enforce their right to equitable access to qualified, experienced, and effective teachers.”

    http://www.washingtonpost.com/blogs/answer-sheet/wp/2013/12/02/obama-administration-backtracks-again-on-teacher-equity/

  2. Doug McRae 10 years ago

    Navigio:

    Re the 2014 “field” tests, there is another explanation that is a bit more nefarious. Traditionally, a “field test” in a large-scale K-12 test development project has been an exercise using the final tests to generate the data necessary (for standards-based tests) to set the standards or cut scores used in the first year of “operational” tests. The item-tryout exercises have to be done the previous year to qualify items for the traditional field-test exercise. But neither Smarter Balanced nor PARCC generated a sufficient number of qualified test questions from their spring 2013 “pilot” tests to do traditional field tests in spring 2014; instead, both need to try out more items to build a pool of qualified items for a full field-test exercise, and they are doing that work in spring 2014.

    What SBAC and PARCC are labeling the first full operational testing year, spring 2015, will in fact be used for the same function that has traditionally been called “field testing,” with final or official standards-setting or cut scores based on 2015 data. Why this shuffle in test development terminology? SBAC’s and PARCC’s federal grants expire in September 2014, so they would like everyone to think the test development work is complete and 2015 is an operational testing year, when in fact it is really the last full year of test development needed to get interpretable or meaningful scores. So the first year (2015) that states pay full freight for the consortia tests will in fact be the final year of test development, but that reality conflicts with the promise that belonging to a consortium will save states large $$ on test development work.

    When one looks into the costs for test development, not only the initial work being done by SBAC and PARCC but also the ongoing work needed for replenishing item banks, etc., during a 5- or 10-year run of SBAC or PARCC tests, one discovers that the $$ savings are a lot less than advertised. In any event, 2015 for Smarter Balanced will really be the last year of test development, with the data used for standards-setting in 3rd Q 2015. Unlike the 2014 data, it will be possible to generate meaningful, interpretable scores, but those scores will likely not be available for local district use until 4th Q 2015. That little factoid has not been advertised by either consortium, nor by state-level advocates for using the consortium tests.

    Re your comments on integrated vs traditional course sequences, my only comment is there is a lot more going on than initially meets the eye. I’m kinda agnostic on traditional vs integrated math; my wife tells me that, as a testing guy, I shouldn’t take sides on curriculum wars . . . . she was on the state’s curriculum commission for four years, served as chair one year, and thus is a veteran of curriculum wars like integrated vs traditional sequences for math and science, so I value her experience and advice.

    Replies

    • navigio 10 years ago

      Thanks Doug. One thing I’ve learned about education policy: there is always a more nefarious answer.. 😉 Personally, I always like talking about that too because even when it does not come about via intention, it often is a result of ‘market’ forces, and thus is still meaningful. Anyway, your clarification makes sense.

      I also agree with you on the what-meets-the-eye front. The integrated transition is something that has really captured my attention. The amount of change seems tremendous, yet few if any people seem to know it’s happening or why.

  3. Doug McRae 10 years ago

    Navigio:

    Here are several observations on the questions you asked above —

    Re the 12-week window, several factors are that item-tryout studies are not strongly sensitive to time-of-instructional year like operational programs where you want all students taking the test within a relatively small window for comparisons across students or groups of students, nor are you generally concerned about test security since it is a low stakes research data collection. So, generally test developers can be more flexible with test administration windows. Most likely the need to spread out the test administration window for schools with fewer technology resources also played a role.

    The decision to cut the item-tryout design in half for each content area I think was purely to answer the call from selected districts/schools that wanted to test both content areas. From a student perspective, the original design called for students to take (say) 50 short answer questions and one performance task all in a single content area. The revised design calls for (say) 25 short answer questions in E/LA plus 25 short answer questions in Math plus one performance task in either E/LA or Math. From a research design perspective, the data is analyzed individually by item so what other items the student took makes no difference; the item data for an individual student are never aggregated to generate a total score of any sort, not even a total number of items correct. Each student gets a more or less random sample of items, some students will get a bunch of hard ones, some will get easy ones, in fact the test developer doesn’t know yet which items will be hard or which ones will be easy, and in fact the expectation is that roughly half the items in the study will not have the right statistical characteristics to survive and be included in the final operational tests anyway. I’m not sure there is any reason for the feds to have any view at all, pro or con, about the original design of only one content area per kid vs the new design with half the items for each of two content areas per kid.
    The key is that an item-tryout study is not designed to generate interpretable or meaningful results for students: no total scores by kid, no scores converted to anything like basic, proficient, advanced. (The test development studies to generate cut scores come later in the test development process, after the bank of items to be used is determined, items are put into collections to match test blueprints, and total scores can be calculated.)

    Finally, re your comment on MA’s SSPI considering the consortium assessments as de facto college readiness metrics, I think the grade 11 tests that are being developed will receive a lot of scrutiny once folks get to see what they are. The early indication, for example, is that the grade 11 math test will be an Algebra II level test — essentially the minimal math requirement for entrance purposes for a pretty competitive university. That’s gonna be a tough requirement for an across the board test for all 11th graders, and I think a lot of folks will rebel against an Algebra II proficiency expectation for 100 percent of our high school students. The thinking thus far clearly comes from the higher ed math professors on the consortia advisory panels. The whole notion of career readiness hasn’t come into considerations for consortia 11th grade tests as yet, nor has the notion that math college readiness probably needs to be differentiated to some degree for future poets, future journalists, future historians, future lawyers, as contrasted to future STEM majors. In other words, I think there is a lot of water to flow before the grade 11 tests are ready to take final shape, and one way to interpret the MA’s SSPI remarks is a view that the grade 11 tests may take longer to develop than the grades 3-8 consortia tests.

    Hope these observations help.

    Replies

    • navigio 10 years ago

      Thank you Doug. Those were two extremely useful observations. Much appreciated.

      Your characterization of the field tests makes it clear why we should not even want to expect results from them. IMHO, that needs to be marketed better. It does raise my curiosity about why AB 484 didn’t just outlaw the API for a year explicitly, rather than just giving the option. I guess the answer is that we still have other testing legacies remaining. Still curious though.

      And I agree with your concerns regarding college readiness. I have noticed something interesting in districts that are considering moving to the integrated math course sequence. Namely, that when it’s done as a transition, there may not even need to be a math III course for two or three years. Part of the implication is that even when we end up giving the ‘real’ tests, we will have students taking two different course sequences (even in the same school) and (obviously) taking different tests. There is going to have to be all sorts of cross-referencing if we want those to be meaningful. That, in addition to your previous comments on the nature of the delay built into the initial phase of test design.

      On a slight tangent, I have noticed that some high-performing districts appear to have decided against the integrated path. That is curious given my understanding of the value of the integrated path over the traditional one. I’d love to see EdSource follow up on the story they did recently related to integrated math (and now science).

      Thanks again.

      • John Fensterwald 10 years ago

        navigio: Yes, what I hear anecdotally confirms what you have been noticing. High-performing districts (also higher wealth districts with parents who are eyeing UC for their children) appear to be keeping the traditional sequence in math perhaps because it has worked for them. Hold me to it: I do plan to do the integrated math story, probably early next year.

        • navigio 10 years ago

          Have to admit, that is concerning. UC also accepts the integrated track, so there must be something else driving this.

          I’ll keep on you till at least next spring.. 😉

  4. navigio 10 years ago

    I find it noteworthy that the MA commissioner considers common core and its associated assessments as de facto college-readiness metrics (especially with the implication that the previous ones were not (?!) ). Without really deciding what the role of college is in our society, I think that has the real potential to explicitly alienate some young people (should it?). While I do think it’s right to have this as a goal for every student, I think it’s wrong to assume not measuring it can be justified by the fact that not everyone will make it, especially if one believes that measurement helps get people there (can of worms..).

  5. John Fensterwald 10 years ago

    Suz: there will not be a paper option next spring for the field test. So it’s a good question what districts without the capacity to administer a test by computer will do. The state has effectively promised the feds that all students will take the field test.

    Looking ahead to the operational or official test in spring 2015, there will be a paper and pencil option through spring 2017. It will be a big challenge to compare the results of students who take the paper and pencil test with those who take a computer-adaptive test. Smarter Balanced says it can be done; there are skeptics.

  6. Doug McRae 10 years ago

    OK, now that John’s post is complete, let me comment on the sidebar re MA’s strategy to slow down implementation of consortium computerized common core tests, and compare it to California’s strategy to speed it up via the two-content-area change for 2014. First, I’d note that Louisiana also announced plans this week to slow down its implementation of consortium tests, as reported in EdWeek’s State EdWatch blog yesterday.

    Re the MA plan, it follows good large-scale testing policy and practice: ensure instruction is implemented before statewide assessments measuring the common core are implemented, ensure the state has the technology capacity to implement computerized testing, ensure the new tests are indeed better than the old tests, and ensure continuity of data from old to new via robust comparability studies. These are all time-honored principles for transitioning to new statewide assessment systems. California’s actions thus far have put the cart before the horse (assessment implementation before instruction implementation), operate without good data on technology capacity at the local district/school level, and ignore long-standing good large-scale assessment practices for continuity of data.

    On SBE Pres Kirst’s comment, certainly I’d agree that CA has gone “all in” with its faith in the Smarter Balanced consortium’s ability to deliver on all of its promises. The Smarter Balanced track record thus far should cause policymakers to question that “all in” faith. And we have to recognize that CA’s previous CST system was built to measure CA’s 1997 standards, which objective analyses have said heavily overlap in both content and rigor with the new common core standards, similar to MA’s old-standards-to-new-common-core situation, and that likely 60 to 80 percent of the CST items also actually measure common core standards, though not with the desired types of item and test administration formats.

    Thus, from a content perspective, it would have been possible to design short-form, common-core-aligned versions of the STAR CSTs to bridge a transition period until instruction and technology in CA local districts are ready for full-blown implementation of computerized common core tests, including continuity of data from old to new via robust comparability studies similar to what Massachusetts is doing. This would have been a far more rational strategic approach to transitioning to a new statewide assessment system. Instead, what SSPI Torlakson and SBE Pres Kirst have supported is a political PR approach to the transition, ignoring the facts of how the old 1997 standards and the CSTs designed to measure them relate to the new common core standards and the desired tests to measure those standards, and thus flushing the old tests from the transition. SSPI Torlakson and SBE Pres Kirst have also used “clean break” rhetoric to support the all-in approach to quicker implementation of Smarter Balanced tests, despite the fact that old-to-new comparability studies were included in the SSPI’s original recommendation for a new statewide assessment system earlier this year. The bottom line is that while MA and LA are pursuing rational strategies for implementing new statewide assessments, CA is pursuing a heavy-foot-on-the-accelerator strategy and hoping that local districts can deal with the instructional-readiness and technology-readiness issues and hang on for the ride.

  7. Doug McRae 10 years ago

    As a PS on my comment above about the CA-specific 3-week test administration window for each school: interested readers can find more on this change in the State Board of Education agenda material for its Nov. 7 meeting, Item #9, Attachment 1, pg 10 of the scope of work (or pg 16 of 76 for the overall attachment). The reason for the CA-specific 3-week test administration window appears to be to spread out the peak processing load for Smarter Balanced’s test administration platform vendor (AIR), to ensure CA’s load doesn’t occur all at the same time and cause the system to crash. [Is a reference to the Obamacare IT problems appropriate here? (grin)] The problem is that solving the peak load problem on the consortium supplier end causes a technology capacity issue on the local district end . . . .

  8. Suz 10 years ago

    Must the field tests be taken online? Or is there still a paper and pencil option during the transition?

  9. Doug McRae 10 years ago

    To expand a bit on “a concern over districts’ capacity to administer computer tests in both subjects next spring” as cited in the post, the change to mandating two content areas for all students in grades 3-8 & 11 effectively doubles the technology required for each district to execute the plan. Also, while the Smarter Balanced field test plan allows for a 12-week test administration window, I suspect it is not widely known that CA-specific plans for the 2014 field test require shorter test administration windows for each school — 3-week windows as assigned by the CA-specific “Field Test Participation Plan” that was first made public as part of state board agenda materials on Nov 6. The requirement to test both E/LA and Math content areas and the move to shorter test administration windows combine to generate an 8-fold increase in technology requirements compared to the previous plan for one content area within a 12-week window. This substantial increase in required local district technology capability for spring 2014 testing raises the immediate question whether a substantial number of districts/schools in California will have the technology capacity to execute the new plan.

    Replies

    • navigio 10 years ago

      Was the 12 week window created because of a fear of insufficient technology? Or was it something else?

      Interesting that the ‘solution’ was to simply cut the duration of each test in half. I wonder how the feds will feel about that. It seems that might compromise its usefulness as a ‘field test’?

      That said, I am not surprised by this move. I did find it interesting that his statement explicitly pointed out that ‘no results will be produced or reported’. Although this was something the feds explicitly said was OK, it’s interesting that he’s reiterating that.

      • Doug McRae 10 years ago

        Oh my . . . my first comment above was caught by John’s evolving story. Neither the SSPI press release nor the initial version of the story talked about a shortened form for each field test . . . that info was added when the Zeiger interview information was added to the story after initial publication. If the shortened field tests only total 3 1/2 hours for both content areas, then that will not double the technology needed. But, the CA-specific 3-week test administration window for the field test will increase the technology requirements by itself . . . . and still raise questions about the number of districts/schools that will have the tech capacity to execute the new plan.