Data Science and Athletics Team To Tackle 2024 NFL Big Data Bowl
By Michael Henninger
Ryan Larsen has coached hundreds of players during his football career — and four data scientists.
When a team of students in the Master of Science in Applied Data Science Program (MADS) at Carnegie Mellon University started a months-long project to parse the massive dataset released by the NFL for its premier, annual sports analytics competition, they called in the coach. The team’s submission, a metric used to evaluate “setting the edge,” was one of two from Carnegie Mellon named a finalist in the 2024 NFL Big Data Bowl.
Formed by MADS student Shane Hauck and working under the guidance of alumnus and Assistant Teaching Professor Ron Yurko of the Department of Statistics & Data Science in the Dietrich College of Humanities and Social Sciences, the group of four entered the competition’s coaching-centric track, which encourages teams to pair with actual coaches to develop insightful uses of NFL data.
As finalists, the team was awarded $12,500 and an invitation to the NFL Scouting Combine in Indianapolis at the end of February to present their work.
Expert insight meets next-gen data
The students huddled up with Coach Larsen and his Defensive Coordinator Ben Gibboney to learn as much as they could about setting the edge, where a perimeter defender tries to contain the play by setting an edge that directs the ball carrier toward the center of the field.
Larsen’s simple doodle, sketching the various paths a running back could take and how the edge-setter can influence the runner’s direction, made it all the way to Indianapolis as a slide in the final presentation.
The team's presentation included this slide of Coach Larsen and the doodle at the origin of their project.
“We helped make sense of things from a football perspective, as intricate as the game can be,” said Gibboney. “We’re trying to simplify it for the students, who are absolutely brilliant! They see numbers and data completely differently than us.”
The respect was mutual.
“It’s crazy how much the coaches know about football! They can recognize several aspects of a play immediately after watching just seconds of film. They understand football on such an insane level, which made it so interesting to listen to them,” said Marion Haney, one of the MADS students. “Each time we met we left with a better understanding of the football concept of setting the edge and an idea of how to translate what the coaches said into code.”
Yurko said uniting with the coaches did more than enhance the students’ project. It gave them valuable experience in working with domain experts.
“That’s not just relevant in sports, but relevant across any industry or field where they’re working with an expert and have to use their skills as a data scientist to generate something relevant and insightful,” he said.
The students used the vast trove of NFL positional data from the first nine games of the 2022 NFL season to create a metric that can quantify a player’s ability to set the edge.
“What they’ve created is a tool that can really place value on something that’s not seen by the naked eye. It goes to show you what can happen when academics and athletics collaborate together,” Larsen said. “This gives actual data to something that doesn’t show up on the stat sheet.”
Gibboney continued, “Coaches can use this to provide a grade to players they should be valuing higher. They can look at players who might not make a ton of tackles, but are still impacting the game in a major way. As a defensive coach, I see the value in that.”
Practice makes perfect
Once they learned their submission had been chosen as a Big Data Bowl finalist, the student team reconvened with the entire group of Carnegie Mellon football coaches to rehearse and fine-tune the presentation they would ultimately deliver in front of a football audience in Indianapolis.
“You basically have the entire NFL world descend upon this event — every member that's an analytic staffer for a team is at the Big Data Bowl. And they're wanting more,” Yurko said. “They're wanting to know the next steps, and potentially, these students could be people that they ultimately hire.”
According to the NFL, more than 50 Big Data Bowl participants have been hired to date in roles pertaining to data and analytics in sports. Yurko said Carnegie Mellon is well represented in the field.
“Carnegie Mellon has been a leader in sports analytics research for a long time. We’ve become a hub for sports analytics research nationwide,” Yurko said. “And so when people think about what institutions they should go for sports analytics? Carnegie Mellon's at the top.”
Yurko said that the Big Data Bowl chose its five finalists, including the two teams from CMU, out of more than 300 submissions. If the edge-setting metric takes off, the students won’t be the only ones benefiting from the attention of potential employers.
“Our metric gives these players who previously wouldn’t have gotten proper credit for doing their job something that’s valuable and quantifiable,” said Hauck. “We can now measure their impact.”