Senior Folds Machine Learning into DNA Origami Research
By Jon Lee Andrade
Carnegie Mellon University senior Peter Sauer is offering a technological take on a biological problem.
Sauer, who is majoring in biological sciences and in statistics and machine learning, is helping researchers predict how complex structures made of DNA form at the smallest scales.
DNA origami is a technique of folding strands of nucleic acids to create nanostructures with applications from drug delivery systems and biosensing to nanorobotics.
"In biology, you think of DNA as these sets of instructions," Sauer said. "From the DNA origami standpoint, it's more of a structural component."
Rebecca Taylor, an associate professor in mechanical engineering, studies how to optimize and generate DNA origami nanostructures. Sauer began working in her lab during his junior year.
"We have a very long strand of single-stranded DNA with lots of short strands we call 'staples,'" Taylor said. "They'll bind to the scaffold and pinch, and when they pinch, they move the scaffold into the shape you want. If you get enough pinching events, you can basically form any arbitrary 2D or 3D shape, like a smiley face, a chair, tweezers or a sphere."
One challenge in the field is getting DNA origami designs to fold into predictable shapes.
Typically, researchers use simulations to predict how a designed DNA origami structure will form.
"But you make really complex super-assemblies," Taylor said. Simulation techniques have improved dramatically over the years, but even so, researchers can only simulate microseconds of a structure's movement. "What that means is that we can't actually simulate the formation of these things, which happens on the timescale of minutes to hours. It's completely outside the range of what's possible."
Through a Summer Undergraduate Research Fellowship grant provided by CMU's Office of Undergraduate Research and Scholar Development, Sauer applied what he knew about machine learning techniques to this biological problem under the mentorship of Weitao Wang, a Ph.D. candidate in Taylor's lab who is working to address the problem.
"Peter has lots of ideas," Wang said. "I can see a lot of innovation from him."
With the literature on DNA nanotechnology growing exponentially, Sauer suggested using published structures to train a machine learning algorithm to discern good from bad designs. He combed the literature and developed a workflow to pull relevant data for nearly 100 published origami structures and store it in a useable and efficient format.
"In class," Sauer said, "there's a lot of 'Here's this dataset. You can apply these methods to it.' And in this project, there's more 'You already know the methods, but how are you going to make the dataset?' It was a very interesting experience because that was a part that I would always take for granted."
One revelation from the work was that misfolded DNA origami structures are rarely reported in the literature, meaning machine learning algorithms would not have enough training examples to learn to identify them. Nonetheless, Sauer was able to perform preliminary analyses on his new data structures, providing an early demonstration of machine learning techniques applied to DNA origami datasets. His work provides the community with critical data and pushes the field closer to the application of machine learning.
"He revealed that there's a big challenge but also a big opportunity," Taylor said. "We'd like to call on the community to include detailed design and characterization data within journal articles but also to deposit this structure information in databases like Nanobase where people are starting to post their good structures. That's essential for the community to learn to make better DNA nanostructures."
Sauer said he enjoyed working with Taylor and Wang on his summer research.
"I learned a lot from interacting with them, like how to pivot in projects when things go awry," he said.
Sauer is currently working in the lab of Jose Lugo-Martinez, a Lane Fellow in the Computational Biology Department, and thinking about his future.
"I want to take a gap year first," Sauer said. "Then I want to work at the NIH through their Intramural Research Training Award (IRTA) program. After that, my goal ultimately is to do an MD-Ph.D."
In the meantime, Sauer has been active in other areas of interest. He's a certified EMT and volunteers with CMU EMS. He also plays multiple sports, including recreational volleyball, and plans to volunteer next semester coaching shot put and discus throwers at the high school level.