‘Big data’ classes a big hit in California high schools

In data science classes, students write computer programs to help analyze large sets of data.
Credit: Alison Yin/EdSource

Data science — the study of computer-generated “big data” — is the hottest career in the U.S., according to Glassdoor. And now it’s the hottest math class at a growing number of California high schools.

About 30 high schools in California have started offering data science classes for juniors and seniors, in some cases as an alternative to Algebra 2. A hands-on blend of statistics and computer programming, data science meets the requirements of A-G coursework — the series of classes in English, math, science, foreign language, history and other core subjects necessary for admission to the University of California and California State University systems — and doesn’t require prior knowledge of computers or statistics.

Data science is the study of large sets of data, using computers to look for patterns and trends. In data science classes, students write computer programs that help sort through data and identify regularities — essentially “taking a big data set and dancing around it and getting it to tell you its secrets,” according to math education consultant Tim Erickson, who writes data science curriculum.

And it’s proven to be a popular addition to high school math departments.

“Data science taps into students’ natural reasoning abilities and helps them understand the world,” said Carole Sailer, a math teacher at North Hollywood High School in Los Angeles Unified who teaches two classes of data science and trains other teachers on the topic. “It doesn’t matter what they want to be — a nurse, a police officer — data science exposes students to state-of-the-art technology and helps them develop their powers of reasoning. It really does inspire kids.”

The term “big data” arose in about 2011 as new, inexpensive technology enabled companies to gather vast amounts of data about consumers — what they buy on Amazon, what they eat for breakfast, their favorite movies and other information collected through Internet use, location devices in mobile phones, social media, credit card purchases and other digital footprints.

Companies use the information to predict shopping habits and target advertising to individual consumers. But data science is useful in other fields, as well. Economists, lawyers, engineers and medical researchers use data science to study everything from workplace discrimination trends to safety in self-driving cars to genome research.

“We can now identify patterns and regularities in data of all sorts that allow us to advance scholarship, improve the human condition and create commercial and social value,” according to UC Berkeley’s data science website. “The rise of ‘big data’ has the potential to deepen our understanding of phenomena ranging from physical and biological systems to human social and economic behavior.”

Data scientists are in high demand in the workforce. Starting salaries for data scientists in California are around $100,000, according to Payscale.com, and the job search website Glassdoor ranked data science as the top job in the country for three years in a row, based on salary, job openings and data scientists’ self-reported job satisfaction.

To meet this demand, colleges have been scrambling to create data science programs. California Institute of Technology, Stanford, UC Berkeley, UC Irvine and Cal Poly San Luis Obispo are among the first schools to roll out programs, and more are on the way.

But data science is a natural fit for high schools, as well, said Suyen Machado, an instructional specialist with L.A. Unified who helped write the district’s data science curriculum. Most high schools already offer statistics and computer programming, and the practical, hands-on approach of data science fits in well with the Common Core standards, which emphasize critical thinking.

“The data science curriculum is geared toward helping students become civically engaged,” Machado said. “If you think about it, there’s been an explosion of big data, but not many people have the skills to analyze that data. We teach students how to understand it and think critically about it.”

Using a grant from the National Science Foundation, L.A. Unified and UCLA created a high school data science course that’s used in about 30 high schools in seven Southern California districts. In the course, students collect their own data — such as the number of billboards in their neighborhoods or how many salty snacks they consume — and then write a computer program to analyze the information, looking for quirks, trends and surprises.

They also write programs analyzing “big data” that’s available free online, such as the heights and weights of teenagers in the U.S.

The class has been popular with students, Machado said. It has a low attrition rate, and 82 percent of students said they’d recommend it to a friend.

Robert Gould, vice-chair of undergraduate studies in the UCLA statistics department, helped write the high school curriculum and is also pushing for a data science program for community colleges.

Because “big data” is so entwined with daily life, all students — regardless of whether they want to pursue it as a career — should learn what it is and how it works, Gould said. And because it’s a booming and still-evolving field, this is an optimal time for students and educators to jump in, he said.

“The field is so new, and there’s so much data that’s unexplained, students can make original discoveries,” he said. “It’s an exciting time. … I tell teachers, ‘Put yourself in a position to get your hands dirty with the data.’ There’s so much out there.”

Erickson, the math education consultant, said that perhaps the best reason to study data science is that it’s fun. One of the lessons he’s crafted has students look at Bay Area Rapid Transit train ridership figures and use them to guess the Giants home game schedule or the day of the Gay Pride parade.
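A rough idea of how such a ridership exercise might look in code is sketched below. The article does not say which language or method Erickson's lesson uses; the Python function `flag_spikes`, the 1.25 threshold and the ridership numbers are all made up for illustration. The sketch simply flags days whose ridership sits well above the typical (median) day, which is where a ballgame or a parade might show up.

```python
from statistics import median

def flag_spikes(ridership, threshold=1.25):
    """Return the days whose ridership exceeds threshold x the median day.

    `ridership` maps a day label to a rider count. The threshold is an
    arbitrary illustrative choice, not a value taken from any real lesson.
    """
    typical = median(ridership.values())
    return [day for day, riders in ridership.items()
            if riders > threshold * typical]

# Hypothetical, invented daily counts (not real BART figures).
sample = {
    "Day 1": 400_000,
    "Day 2": 405_000,
    "Day 3": 398_000,
    "Day 4": 540_000,  # an unusually busy day worth investigating
    "Day 5": 402_000,
}

print(flag_spikes(sample))
```

A flagged day is only a lead, not an answer: students would still need to check a calendar to see what actually happened that day.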

“These days we are swimming in data, but we don’t have information,” he said. “You see billions of pieces of data and you think, ‘OK, how can I deal with this?’ You want to be a data body surfer. … It can be enormously fun.”

At North Hollywood High, Sailer said teenagers are natural data collectors — looking for patterns among ‘cool’ kids or deciding what’s fashionable, for example — and data science gives them the tools to back up their arguments or reason through a dilemma.

In one lesson, she has students determine whether male or female characters die more often in horror movies. Students write computer code to look at outcomes and variables to see if the differences are significant or random.
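One common way to ask whether a difference is "significant or random" is a permutation test, sketched here in Python. This is an assumption about the general approach, not a description of Sailer's actual lesson or the language her class uses; the function name `permutation_test` and the character data are invented for illustration. The idea is to shuffle the male and female labels many times and count how often chance alone produces a gap in death rates at least as large as the one observed.

```python
import random

def death_rate(outcomes):
    """Fraction of characters who die (1 = dies, 0 = survives)."""
    return sum(outcomes) / len(outcomes)

def permutation_test(group_a, group_b, trials=10_000, seed=0):
    """Return (observed gap, share of shuffles with a gap at least as large)."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = abs(death_rate(group_a) - death_rate(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(trials):
        rng.shuffle(pooled)  # randomly reassign the labels
        gap = abs(death_rate(pooled[:n_a]) - death_rate(pooled[n_a:]))
        if gap >= observed - 1e-12:  # small tolerance for float rounding
            at_least_as_large += 1
    return observed, at_least_as_large / trials

# Hypothetical, made-up data: 50 male and 50 female characters.
male = [1] * 30 + [0] * 20    # 30 of 50 die
female = [1] * 22 + [0] * 28  # 22 of 50 die

gap, share = permutation_test(male, female)
print(f"Observed gap in death rates: {gap:.2f}")
print(f"Share of shuffles with a gap this large: {share:.3f}")
```

If only a small share of the shuffles match the observed gap, chance is an unlikely explanation; a large share suggests the difference could easily be random.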

“Students can be put off at first because it’s so different than any class they’ve taken,” she said. “It’s different than statistics because you can play around with the data, change the data, write your own code. We ask, what are these numbers telling us? What can we do with it? It helps students think about the world.”
