Carnegie Mellon University
August 05, 2020

If the Shoe Fits: CSAFE Study Improves Shoe Print Forensics

A new model improves how uncertainty is calculated when matching shoe print evidence, strengthening the validity of this forensics technique

By Stacy Kish

Stacy Kish
  • Dietrich College of Humanities and Social Sciences
  • 412-268-9309

Detectives in a fictional TV series investigate a crime scene. They call over to the forensic tech to take shoe imprints. This will be the evidence that could crack the case.

You have seen this scenario play out in numerous television shows, films and books. The premise is rooted in forensic science, the application of scientific or technical practices to criminal, civil or regulatory matters. Unfortunately, most forensic techniques outside of DNA evidence lack scientific rigor.

Neil Spencer, a Ph.D. student in the joint Statistics and Machine Learning doctoral program at Carnegie Mellon University, sought to add clarity to one commonly used forensic technique, shoe print identification.

"People tend to look at [shoeprint impressions] like fingerprints, but existing models in the literature are simple and produced pretty strong conclusions without much investigation into the assumptions behind the model," said Spencer. "We decided to examine the assumptions in this data-scarce field to help us identify gaps in the existing models that could lead to incorrect or overconfident conclusions."

Spencer and his co-author Jared Murray, assistant professor of statistics at the University of Texas at Austin, published their study online in the journal Annals of Applied Statistics.

Examiners compare shoe print evidence collected at a crime scene to the suspect's shoes using class characteristics (shoe brand, model and size) and accidentals — the unique patterns produced during wear.

Unfortunately, latent crime scene prints are often low quality, which makes it harder to identify accidentals accurately. The technique is further complicated by print variability during the impression-taking process. For this reason, there will always be some uncertainty concerning whether a suspect's shoe truly matches the crime scene print, or if the match is simply a false positive. Experts pair shoe print evidence with a random match probability to communicate the level of uncertainty associated with the results.

The strength of shoe print evidence is based on probability — what are the chances that a random shoe would produce the same pattern of accidentals? This sounds reasonable, but the reliability of the match hinges on an accurate model for the spatial distribution of accidentals on the soles of shoes.

"Any quantitative measure of the strength of evidence of a shoeprint match will be highly sensitive to underlying probability models," said Murray, senior author on the paper. "Small changes in modeling assumptions can shift these measures by orders of magnitude, so it is vital to get them right."
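To see why the choice of model matters so much, consider a deliberately simplified illustration. It is not the model from the study: the grid of 100 cells, the 40 contact cells, the 95 percent weighting and the five observed accidentals are all hypothetical numbers chosen for the example, and accidentals are treated as independent, which real wear patterns may not be. The sketch compares the chance of a random shoe reproducing the same accidental locations under a uniform model and under one that concentrates accidentals where the sole touches the ground.

```python
import math


def random_match_probability(cell_probs, observed_cells):
    """Toy calculation: probability that independent accidentals
    fall in the observed cells, given per-cell probabilities."""
    return math.prod(cell_probs[c] for c in observed_cells)


# Hypothetical sole divided into 100 grid cells; the first 40 touch the ground.
n_cells = 100
contact_cells = set(range(40))

# Model A (assumption): an accidental is equally likely in any cell.
uniform = [1.0 / n_cells] * n_cells

# Model B (assumption): 95% of accidentals fall on the contact surface,
# spread evenly there; the remaining 5% is spread over the other cells.
contact_mass = 0.95
contact_aware = [
    contact_mass / len(contact_cells)
    if c in contact_cells
    else (1 - contact_mass) / (n_cells - len(contact_cells))
    for c in range(n_cells)
]

# Five hypothetical accidentals, all located on the contact surface.
observed = [3, 17, 22, 31, 38]

rmp_uniform = random_match_probability(uniform, observed)
rmp_contact = random_match_probability(contact_aware, observed)
ratio = rmp_contact / rmp_uniform

print(f"uniform model:       {rmp_uniform:.3e}")
print(f"contact-aware model: {rmp_contact:.3e}")
print(f"ratio:               {ratio:.1f}")
```

In this toy setup the contact-aware model makes a coincidental match roughly 75 times more likely than the uniform model does, a gap of nearly two orders of magnitude from a single change in assumptions.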

Current models use a hypothetical shoe without considering other contributing factors, like shoe shape and the presence of arches. Spencer and Murray developed several models that fold in multiple layers of shoe complexity, from shoe shape to contact surface, to understand the randomness of accidental occurrence. The team based their models on one of the largest shoe print databases in the world, an Israeli police database consisting of 400 samples.

The researchers pooled information across many different types of shoes to capture similarities in how contact surfaces affect the distribution of accidentals on the sole. Their models fit the locations of accidentals on shoes more successfully than current models do.

In this project, the authors sought to address a need to validate forensic techniques after a call by the National Research Council in 2009, which was echoed in 2016 by the President's Council of Advisors on Science and Technology.

While this model is not ready to roll out to crime labs around the country, Spencer believes it is important for this community to weigh the limitations of current methods when developing conclusions. He also believes that as data sources grow, more sophisticated models may capture the relationship between contact surfaces and accidentals more accurately.

This project was funded in part by the Center for Statistics and Applications in Forensic Evidence (CSAFE), a multi-institutional team of researchers that aims to build strong scientific foundations for forensic science and technology practices. CSAFE recently received renewal funding of $20 million over the next five years, through a cooperative agreement with the National Institute of Standards and Technology (NIST), to continue developing methods to validate different forensic techniques.

"The stakes are really high in forensic analyses, which are used in court cases," said Robin Mejia, Statistics and Human Rights Program Manager and CMU lead for CSAFE. "Over the first five years, CSAFE has built datasets, and started to develop new methods to analyze different kinds of data. Neil's work is an example of what's needed to create a replicable analysis pipeline."