(Updated Sept 26 with deadline for applying.)

Teachers and others from California have until midnight Friday (Pacific time) to sign up for a crowd-sourcing exercise that will help determine how questions will be scored on the new Common Core tests students will take next spring.

The Smarter Balanced Assessment Consortium, the testing organization serving 20 member states, extended the deadline a week to encourage thousands of higher education faculty and K-12 teachers, as well as parents and anyone else interested, to enroll. About 1,400 Californians had signed up by last Friday.

The three-hour online session, which participants can complete over a two-day period next month, is the first of several levels of score-setting exercises that will determine how students in grades 3 to 8 and grade 11 will be graded on the tests in English language arts and math. Each person who signs up will focus on one grade and one subject.

“This will be the first time that a scoring process will be opened up to such a widespread, participatory process,” said Jacqueline King, the director of higher education collaboration for Smarter Balanced.

Teachers will gain a better understanding of how assessment questions are scored, as well as have a voice in the process, she said.

A more intensive effort will take place in Dallas next month, when 500 teachers, administrators and college faculty, nominated by the Smarter Balanced states, will go through several rounds of group scoring over two days in 35-person panels organized by grade and subject. The collective results from the online scoring exercises will provide guidance and a check for the panelists’ own judgment, King said.

The Smarter Balanced tests will have four levels of attainment, roughly comparable to the levels on various standardized tests: below basic, basic, proficient and advanced. The achievement level names, which have yet to be decided, will be different to discourage people from comparing the scores of the old and new tests. The Common Core tests will differ from previous state standardized tests, as do the academic standards on which they are based, Joe Willhoft, executive director of Smarter Balanced, emphasized in a recent webinar.

After an orientation to the standards, the online participants will be given a range of increasingly difficult multiple-choice and short-answer questions to examine. Their task will be to determine, based on a set of academic expectations, the point at which a correct answer satisfies Level 3, showing that a student is at grade level and on track to attain college and career readiness. They will also be given a more involved, multi-step performance task.

The federal government’s aim in funding Smarter Balanced was to create a common test so that it would be possible to compare student scores across states. (Federal officials also funded a second state consortium, the Partnership for Assessment of Readiness for College and Careers, whose dozen member states are mainly in the East.) In November, representatives of the Smarter Balanced member states will vote on whether to accept or adjust the achievement levels, or cut scores, that the experts will recommend to them.


Comments (15)


  1. David Patterson 9 years ago

    After a full week of hard work at school, my wife and I settled down last night (Friday) and caught up on some online reading. I am a school administrator and my wife is a high school math teacher. I found John’s article saying that sign-up for the Smarter Balanced score-setting process was open until midnight Friday, and we were both interested in participating. Unfortunately, when I tried to sign us up, the site told me that the sign-up process was closed. We were disappointed.

  2. Peggy 9 years ago

    FYI – I just tried to sign up for the “crowd sourcing” exercise but the message said: “Registration is closed for this event: Online Panel for Achievement Level Setting.” I thought it didn’t close until 11:59 PM tonight, Sept 26, 2014 PST. But it is 10:19 PM PST. It is very disappointing!

    Replies

    • John Fensterwald 9 years ago

      That’s unfortunate, Peggy, and contrary to what Smarter Balanced told me. I will pass on the names of you and anyone else who contacts me before midnight our time and request that you be enrolled.

  3. gina 9 years ago

    FYI I tried to sign up as a volunteer (9-25-14 Pacific Time) and received this message: “Registration is closed for this event: Online Panel for Achievement Level Setting.” Truly disappointing.

    Replies

    • John Fensterwald 9 years ago

      Gina: Try again. I double-checked, and Smarter Balanced reports that the system was down briefly this morning but is back up. You have through 11:59 pm Pacific Time today to sign up.

      • gina 9 years ago

        Thank you, John–I received your message and successfully registered!

  4. Doug McRae 9 years ago

    To put the Smarter Balanced “cut score” setting exercise this October in context, the achievement levels or cut scores Smarter Balanced states vote for this fall will be “preliminary” cut scores that will need to be validated by spring 2015 Smarter Balanced actual test scores, per the timeline displayed on the SBAC website. The validation process will have to be done late summer 2015 (likely Sept) which means valid 2015 SB scores for students and schools/districts will not be available until fall 2015.

    The primary reason the cut score setting exercise this October will yield “preliminary” cut scores is that the data from spring 2014 SB item-tryout tests were not collected under conditions necessary for generating valid cut scores, i.e., they were collected for practice purposes only and not under actual computer-adaptive conditions where scores will count. For this reason and several other reasons, the 2014 SB data that will be used for the upcoming October cut score setting exercise are not sufficient for development of valid cut scores until further analyses are done on spring 2015 data which will be collected under appropriate conditions for validating cut scores.

    This being said, Smarter Balanced should be commended for the widespread participation they are encouraging for the 2014 cut score setting exercise. The more people that are involved, the more judgmental perspectives included, the better the process. At the end of the day, however, the final cut scores to be used for 2015 SB tests will have to be validated before they are used for scoring millions of SB tests, and that won’t happen soon enough to return 2015 scores to SB states in a timely manner.

    Also, the SB webinar indicated that after SB members vote to accept SB preliminary achievement levels or cut scores this November, it will be up to each member state’s own governance body [for CA, that is the State Board of Education] to approve or modify the SB recommended cut scores for their own use. Thus, there will be (at least) one additional approval step needed before SB test scores are used by any given SB state.

    Replies

    • Manuel 9 years ago

      Doug, the description of the materials that will be made available for review to panel participants says nothing about the items having been used during the Spring 2014 try-out. SBAC claims that “participants will… offer input on achievement levels by reviewing actual test items, ordered by difficulty…” There is nothing in there that says that data from the Spring 2014 tryout will be used. Maybe the test items included in the tryout will be used but there seems to be an implication that only the responses of the panel will count, not what the tryout produced. If indeed SBAC intends to use this data, then you know something that the general public does not know.

      I am, incidentally, disturbed by the repetition of the idea that the panels will define “achievement level scores” (aren’t they the same as the “cutoff” points of old?) when what they are supposed to be doing is reviewing test items and deciding whether or not the items show “mastery at Level 3,” as John has reported. I would think that defining scores would require converting the raw scores into scaled scores, gathering them into distributions, etc. The information provided makes no mention of this unless, of course, the raw scores are the “new normal.” If so, things will be interesting.

      Also, the web page linked by John informs us that the items (aka Achievement Level Descriptors [ALDs]) have been “developed by K-12 teachers and administrators and higher education faculty from two- and four-year colleges and universities representing Smarter Balanced Governing States” and that “The ALDs are linked to an operational definition of college content-readiness, as well as a policy framework to guide score interpretation for high schools and colleges.” If so, why are panels that presumably will include “average Joes and Janes” being convened to judge the work of these “experts?” Aren’t their qualifications enough? Or does SBAC want a fig leaf for their eventual use?

      • John Fensterwald 9 years ago

        Manuel: The panels in Dallas will have the disaggregated results from the online exercises, and they can differentiate the cut score recommendations of teachers from parents and others with no classroom experience, like me. I signed up yesterday.

        The panels will have the results of the field tests when they do their cut score exercises. They won’t do their work in a vacuum. Doug’s view, as you know, is that it is improper to use these results, for a number of technical reasons, for setting cut scores.

        • Doug McRae 9 years ago

          Manuel — Spring 2014 SB data come into the cut score setting process in 2 ways: (1) as you say, the items the panels review will be ordered by difficulty and SB will use spring 2014 data to establish the order from easiest to hardest, and panelists will be asked to place dividers (or bookmarks) between the 2 items each panelist thinks is a dividing point between (say) Category 2 and Category 3 (eg, between Basic and Proficient); (2) later in the process, after several rounds of placing dividers [first before discussion with other panelists and reviewing the results of the on-line exercise described by John, then after discussion and seeing the results of the on-line exercise], the panelists will be given “impact” data with estimated percent of students falling into each category, with the impact data estimated from spring 2014 SB data.

          As John comments, for several reasons I do not think the spring 2014 SB data is up to the task of yielding valid cut scores — reasons including lack of widespread implementation of common core instruction before the spring 2014 data were collected, data collected on a “for practice only” condition, and data not based on the same test administration context as true computer-adaptive data that will come from the SB 2015 test administration. As a result, even SBAC on their website call the 2014 cut scores “preliminary” to be validated by spring 2015 data before they are final and ready for widespread high stakes use.

          The process of establishing “achievement level scores” [also referred to as “cut scores”] includes judgments involving content and data, and part of the data portion does involve using scale scores and distributions of scale scores. But, human judgments can and frequently do overrule data only considerations, so standards-based test cut scores are distinctly different from any forced distributions such as national norms.

          The Achievement Level Descriptors you mention are content only qualitative descriptions for each performance or achievement category, developed by experts before the panel cut score activity is started. This information is presented to panelists before their first round of bookmark judgments as guidance from experts for the panelist to use. The panelists will not judge the work of the experts who developed the ALD’s, rather the panelists will use the work of the ALD experts as guidance for their own independent judgments.

          John, to reply to your comment briefly, I don’t think it is totally improper to use the spring 2014 data for the exercise that SB is conducting this fall, rather that the exercise will yield only preliminary cut scores, not final valid cut scores. It is not unlike gathering polling data to give an advance peek at potential election results. What would be totally improper would be to misuse preliminary cut scores as final valid cut scores and release test results for millions of kids (and aggregates for groups) misrepresented as valid data.

          • Manuel 9 years ago

            Thank you, John and Doug, for those clarifications.

            I am, as a data-based researcher, disappointed, but not surprised, that the process is so subjective as well as an attempt to window-dress the scoring method. I am sure that those who have never participated in something like this will learn something, but I am doubtful of their actual impact on the process.

            I share Doug’s position that if SBAC successfully “sells” this exercise as a “we got valid cut scores” to the member states then this process is politically tainted and not a true effort to determine what kids have actually learned.

            Oh, well. Let’s hope that it doesn’t turn out that way because otherwise it is a GIGO exercise.

          • Doug McRae 9 years ago

            Manuel — How about FDIFRO for “flawed data in flawed results out.” Not as testy as GIGO, but . . . . Actually, the 2014 Smarter Balanced item-tryout data shouldn’t be flawed for its primary purpose, that is, to identify qualified items for inclusion in a final summative item pool for the Smarter computer-adaptive test administration algorithm. Unfortunately, if SB uses it for another purpose [that is, to rank order item difficulty and/or to estimate percent kids above certain “cut scores,” per the standards-setting exercise Smarter is conducting beginning next week], it becomes FDIFRO. However, even the specifics of what 2014 SB data are being used for the standards-setting exercise were not clearly addressed in a CDE meeting for local districts early this week. In response to a question on whether SB used 2014 data from all 4.2 million students who took the SB item-tryout tests last spring or just data from the 20 percent scientific sample identified in advance, the Smarter representative at the meeting had an ambiguous answer that SB just used “enough data for robust item calibrations.” No indication whether the data they actually used met any of the requirements needed for the data to be representative of important subpopulations for the entire consortium set of kids, the reason for scientific sampling in the first place. Doug

          • FloydThursby1941 9 years ago

            Don should take note of this. I think this is great that critics of the Common Core can be involved and help develop questions and have input to make sure the questions are fair and do a good job differentiating students into 100 percentiles accurately to encourage hard work, diligence, focus and excellence. There are kids who miss questions intentionally, and this must stop. I think allowing volunteers to comment is good so that we can be unified. I think Don should be involved in writing questions for the test. If we oppose certain aspects, we should get involved and make it better. This is the method we will try to catch up to the rest of the world, as a nation, as a people. It is a matter of patriotism that we get this right and don’t have to start over in a couple years with something new which will probably end up equally controversial anyways.

          • Manuel 9 years ago

            Doug, you are too kind because if those responses are an indication of where SB is going, the results will not be pretty.

            Why isn’t the legislature paying attention to this travesty? Is it because it is beyond their level of expertise? What do they have those consultants on the payroll for? They already dismissed the warnings given to them in the run-up to the CSTs and now this? Good grief.

            Therefore, I have no choice but to stick with the ol’ skool term: GIGO. 🙂

          • Don 9 years ago

            Manuel, there seem to be several important issues with the rollout, from the validity of preliminary data collected while Common Core instruction varies widely in implementation to several significant technical issues affecting cut score validity. Which travesty are you referring to? There seems to be a host of them.