From Summer Research to Smarter AI
When Aidan Zhang arrived at Carnegie Mellon University, he brought with him a deep curiosity about artificial intelligence and a passion to explore its boundaries. That interest led him to tackle one of the more complex challenges in AI: improving how models generate and refine code. The goal is to boost accuracy, efficiency and adaptability — helping developers build reliable software faster.
Zhang’s work at CMU began in Lei Li’s lab in the Language Technologies Institute, where he was welcomed into weekly meetings and paired with a Ph.D. mentor. Now a second-year student majoring in AI within the School of Computer Science, Zhang is making strides in research through CMU’s Summer Undergraduate Research Fellowship (SURF) program.
“I’ve always been fascinated by how large language models work,” Zhang said. “It’s crazy that machines can communicate almost as well as humans. I wanted to understand how that’s even possible.”
A smarter way to teach machines to code
Zhang’s SURF project centers on code generation, a frontier in AI research where large language models (LLMs) are trained to write executable code based on natural language prompts. But instead of relying on traditional training methods, Zhang is experimenting with multiturn reinforcement learning — a technique that allows models to learn iteratively from feedback, much like a human debugging code.
“Coding is inherently a multistep process,” Zhang explained. “You write code, test it, get feedback and refine it. We’re trying to teach models to do the same — rewarding them not just for getting it right the first time, but for improving over multiple attempts.”
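The write, test, refine loop Zhang describes can be sketched as a reward scheme that credits each attempt for its improvement over earlier ones. Everything below is an illustrative assumption, not the lab's actual training code: the toy `add` task, its test cases, and the reward shaping are stand-ins.

```python
def run_tests(code: str) -> float:
    """Execute a candidate solution against a fixed test set; return pass rate."""
    namespace = {}
    try:
        exec(code, namespace)
        f = namespace["add"]
        cases = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]
        passed = sum(1 for args, want in cases if f(*args) == want)
        return passed / len(cases)
    except Exception:
        return 0.0  # code that crashes earns nothing

def multiturn_reward(attempts: list[str]) -> list[float]:
    """Reward each turn for improving on the best previous attempt,
    not only for being correct on the first try."""
    rewards, best = [], 0.0
    for code in attempts:
        score = run_tests(code)
        rewards.append(max(score - best, 0.0))  # credit only the gain
        best = max(best, score)
    return rewards

# Three "turns": a buggy draft, a partial fix, a correct solution.
attempts = [
    "def add(a, b):\n    return a - b",       # wrong operator
    "def add(a, b):\n    return abs(a) + b",  # closer, still buggy
    "def add(a, b):\n    return a + b",       # correct
]
print(multiturn_reward(attempts))
```

Crediting only the gain over the previous best gives the model a reason to keep revising rather than restate its first answer.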
The goal? To build a state-of-the-art code generation model that outperforms existing ones and demonstrates the effectiveness of this new training approach.
Challenges and surprises
Despite the promise of multiturn reinforcement learning, Zhang discovered that current models struggle to improve with feedback.
“They often get stuck in loops,” he said. “Even after several rounds of feedback, the improvement is minimal. It’s like they reach their best conclusion early and can’t go beyond it.”
This limitation, Zhang believes, stems from how models are trained — primarily to produce correct answers on the first try, not to integrate feedback and iterate.
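One way to see that mismatch: under a standard single-turn objective, only the first attempt earns reward, so later feedback carries no learning signal. A toy comparison, using hypothetical per-turn test pass rates rather than any real model's scores:

```python
# Hypothetical pass rates for one problem across three attempts.
scores = [0.25, 0.5, 1.0]  # the model improves each turn

# Single-turn objective: only the first attempt is rewarded,
# so the revisions contribute nothing to the training signal.
single_turn = [scores[0]] + [0.0] * (len(scores) - 1)

# Multiturn objective: each turn is credited for its improvement.
multiturn, best = [], 0.0
for s in scores:
    multiturn.append(max(s - best, 0.0))
    best = max(best, s)

print(single_turn)  # [0.25, 0.0, 0.0]
print(multiturn)    # [0.25, 0.25, 0.5]
```

A model trained only on the first row has no incentive to use feedback, which is consistent with the looping behavior Zhang observed.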
From research to real-world impact
Zhang’s interest in hallucinations — when models generate false or misleading information — remains a side passion. He’s even considered turning it into a startup idea.
“If we could build a product that detects and filters hallucinations in LLMs, that would be huge,” he said.
Whether he pursues entrepreneurship or continues in research, Zhang is grateful for the opportunities CMU has provided.
“None of this would’ve been possible without CMU,” he said. “The faculty support, the lab access, the inspiring peers — it’s all shaped what I want to do.”
As Zhang continues refining his model and preparing to submit a paper to academic conferences, he’s already thinking about the future.
“I want to make an impact — whether through a startup or research in industry,” he said. “And I think this project is a great first step.”