Carnegie Mellon University
October 14, 2020

Data Is A Journey, Not A Destination

New course teaches students to interpret how data is collected, depicted and used

By Stacy Kish

Data provides the hard facts to explain anything, from a cat's behavior on catnip to election dynamics to how climate change will affect a given region. While data offers a way to explore the natural environment, it can also be manipulated and misused. Emma Slayton and Hannah Gunderman created a course at Carnegie Mellon University to help students distinguish between the truth and the lies that lurk in the numbers.

"There was a need crying out for data-management-analysis skills training, and the library is the key place to offer assistance in those areas," said Slayton, the data curation, visualization, and Geographic Information System (GIS) specialist at CMU Libraries. "We created a course that engages students in the process of [questioning] how data is depicted so they have a sense of power and play as they interact with data throughout their lives."

"Discovering the Data Universe" is offered through the Department of Statistics and Data Science in the Dietrich College of Humanities and Social Sciences. The course was designed to challenge students with thought-provoking questions about not only data management but also how data is collected and used.

During the course, students evaluate the source of data and learn to interpret and explain data, while keeping an eye open for how numbers can be manipulated. They accomplish these objectives through hands-on projects, working with data and presenting their findings to their peers at the end of the course.

Novices Gain Expertise

"Discovering the Data Universe" was designed for students who do not work with data every day. While data concepts may feel intimidating, Slayton and Gunderman have created a safe space for students at all levels to learn and grow.

"As a humanities researcher, I was that person sitting in the introduction to data class and being terrified, wondering if everyone else was terrified too or if it was just me," said Gunderman, the research data management consultant at CMU Libraries. "In this course, we created an environment of inclusivity so students who do not see themselves as a STEM-ey person [feel] comfortable."

To ease students into the course, Gunderman folds pop culture into some of the early discussions. From her previous work, she has found that pop culture reduces the insecurity students feel as they begin to explore data management concepts. For example, an early assignment has students compare the attributes of various Pokémon characters (color, number of appendages, hair, stripes, etc.) to delve into data concepts.

Processing Information Through A Firehose

Data inundates our daily life. The course presents students with an opportunity to evaluate and think critically about data. According to Slayton, data is often portrayed as a neutral entity, but data is a tool, collected with a goal in mind. Gunderman and Slayton want to impress on their students that data was collected to address specific questions about the world. With this concept in mind, the students explore strategies to think critically about the data at hand.

"We want students to know that you are allowed to critique [the data] and to ask questions," said Gunderman. "[Data] is not objective fact but can be [something we can] dig into. It gives students agency."

Enter the Real World

After running through a series of data management exercises, the students are introduced to different types of data, from quantitative U.S. Census data and Pittsburgh police incident reports to qualitative data like images from the Margaret Morrison Carnegie College for women before it was incorporated into Carnegie Institute of Technology.

The course gives students the freedom to look closely at data to explore the problems facing their local communities and society at large. Tumult is roiling many communities. From police brutality to coronavirus, the ability to understand and digest data has never been more critical. Slayton points to the police incident reports in particular to explore these concepts.

"[People believe] police data must be accurate, but in reality, many institutions do not think critically about data management reports," said Slayton. "We want to ensure students know that a dataset is not in itself a non-biased, untouchable entity. How data is used can also be biased."

Slayton also uses Census data in the class to evaluate how and why data is collected. While Census data can appear to be an austere, unimpeachable collection of facts, Slayton points to the citizenship question proposed for this year's survey. The students are asked to ponder how this one question could affect how the data is collected, who responds to the survey, and how it affects the overall Census. According to Slayton, all data can be used for a purpose, whether positive or negative.

While students navigate the hot-button issues of the day, most are not on campus. Gunderman and Slayton have moved the course online to ensure it is available to the largest community of learners.

"We are building asynchronous activities so our students can feel engaged when they are not face-to-face with us in a Zoom room," said Slayton. "We want to ensure that our students feel like active participants even when they are not with us."

Using Data to Tell A Story

After the students immerse themselves in data, Gunderman and Slayton help them discover the stories that lie within. They explore how multiple valid stories can be drawn from the same dataset.

According to Slayton, storytelling allows students to understand how to put data into context. They can also apply these skills to evaluate common data narrative structures found in social media and other outlets.

"I was interested in learning a little something more, something new, something different," said Lucia Bevilacqua, a junior majoring in neuroscience in the Dietrich College who took the course during the spring 2020 semester. "I learned that working with data is a process. The course has emphasized ways of gathering and analyzing and displaying data; all the layers in this process influence the story that the data tells."

Ultimately, Gunderman and Slayton hope that as students leave the course and enter the real world, they can leverage these skills to navigate a rocky, uncertain data environment to find success.

"This course gives students confidence in their own skills," said Slayton. "They can use these skills to uncover new opportunities to ensure they are full citizens of the data universe."

Gunderman concludes, "Data management is beyond research; it is a life skill."