January 19, 2023
CMU alumnus Kiran Bhat conjures computer vision magic in Hollywood and beyond
By Susie Cribbs
Not long into "Rogue One: A Star Wars Story," the Death Star appears through the viewport of an Imperial Star Destroyer's bridge. Reflected in that viewport is the instantly recognizable face of Grand Moff Tarkin, an Imperial baddie made famous in 1977's "Star Wars: A New Hope" and portrayed with expert villainy by Peter Cushing.
The character slowly turns, and Cushing's face fills the screen with evil cunning.
A narrow chin and clenched jaw. Prominent cheekbones jutting below cold, hard eyes. His bearing rigid, brow furrowed. His face alone says that he will suffer no fools, let alone think twice before destroying his enemies.
But how did Cushing, who died in 1994, star in 2016’s "Rogue One"? What was this magic?
That magic was Carnegie Mellon University alumnus Kiran Bhat, who earned his Ph.D. in robotics from the School of Computer Science in 2004 and has spent his career using computer vision technologies to make animated characters look more lifelike, first at Lucasfilm's Industrial Light and Magic (ILM) and then through his own startup, Loom.ai.
It's tempting to assume that someone who began his career at ILM came to computer vision and graphics through a love of animation. But that's not the case with Kiran, whose first love was robots.
"Robotics started this whole journey," Kiran says. "What sparked my interest in this field is actually trying to figure out how humans and nature move and interact so gracefully in their environments, whereas any man-made systems — even sophisticated systems in the late 1990s — were clunky."
That desire to give robots more lifelike motion inspired Kiran to apply to CMU's graduate program in robotics after he completed his bachelor's degree in India. Once in Pittsburgh, he began studying locomotion in robotics with longtime faculty member Pradeep Khosla — now chancellor of the University of California San Diego.
But Kiran kept coming back to nature, wondering how to observe something in the world, capture the dynamics of its actions, and map those actions to a computer system. Because resources for studying the physical interactions of robots were still limited, Kiran switched to simulations. Computer vision became his passion.
"I realized the right thing to focus on was a controlled, yet rich, simulated environment, and to focus on how to capture an object's parameters and movements by observing the real world," he says.
Kiran gives the example of a juggler.
"If you can point your camera at a juggler, you see the complex dynamics of these objects floating around in the air," he says.
His challenge was automatically determining the parameters of that action through video and creating a realistic simulation.
The path from that research to ILM was simple. ILM had been making movies that relied on puppets (think: Yoda) interacting with humans on screen. But in the early 2000s, they made the decision to go fully digital.
"There was a huge challenge in the industry to figure out how to build systems for rendering and animating characters that felt lifelike, especially when you were putting them beside a human in those shots," Kiran says.
He took an internship at ILM, where he applied a computer vision technique known as structure from motion to solve for the motion and lens parameters from principal camera footage. (A technique, by the way, that CMU pioneered in the 1990s.)
During that internship, he interacted with a broad range of ILM engineers and artists and learned about the key challenges the industry faced in bringing these animations to life. Specifically, he observed the painstaking process of making Yoda's cape look lifelike, especially in scenes where the character interacted with the human actor playing Obi-Wan Kenobi. Engineers would set up simulations for the item in question (e.g., the cape), and artists would spend hours fine-tuning their parameters until the animation looked as realistic as possible. They then repeated the arduous process each time the director wanted a change.
Kiran envisioned a world where computer vision made simulations look realistic with less effort and spent his Ph.D. working to make it happen.
After graduation, Kiran returned to ILM full time. Alongside his colleagues, he iterated on and fine-tuned a computer vision technology in which a camera captured a real-world object and created a digital duplicate that artists could easily edit. Actors (Mark Ruffalo as the Hulk is an excellent example) would wear camera-outfitted helmets that recorded their facial expressions, mannerisms and movement. This recording became a digital photocopy that artists and animators could easily tweak to create whatever emotion or action the director wanted to portray, to "make the green guy look like the human," as Kiran says of the Hulk.
Which leads back to Peter Cushing.
"It's such a power. The Computer Graphics Lab had an amazing sense of camaraderie. You had people from different classes, schools and divisions working on different projects, but there was a collective understanding of the field and how to push it forward. It was incredible."
When Disney began work on "Rogue One," they had yet another conundrum. Grand Moff Tarkin played an iconic role as commander of the Death Star in the original "Star Wars," and the company didn't want to replace him with a new actor. Was Kiran's technology up to creating a completely digital character that could believably appear on the big screen with a big face beside human actors?
Luckily, a digital scan of Cushing's face existed from about 10 years after the original movie, so that helped. The team used a classically trained British actor to stand in for Cushing during the motion capture of his movements and expressions. But it remained a challenge to create a digital duplicate that artists could easily edit to mimic the late actor's mannerisms.
"That was really the innovation," Kiran says. "If you look at facial movements, your blinks, microexpressions and twitches are unique to who you are. We needed the ability for a skilled artist to change a little bit of that after the motion capture."
That facial performance-capture solving system, as it's called, earned Kiran and his ILM colleagues a 2017 Scientific and Technical Achievement Award from the Academy of Motion Picture Arts and Sciences and permeates much of the animation popular in today's blockbusters.
Similarly, much of CMU permeates Kiran's work.
"CMU felt like we actually do things. We build things, iterate. And that aspect of putting things out, iterating and solving problems is what matters in the real world," Kiran says. "In Hollywood, that's exactly the recipe. The more iterations you go through, the more likely that it's going to be a good product."
But it wasn't just CMU's culture of problem solving that set Kiran up for success. It was also the example his peers and advisers provided for him.
"At least for me personally, the integrity of what I saw the community do was important," Kiran says. "You knew that these were really amazing human beings. There was a lot of integrity and respect in how they did the work, how they presented the work. Those are things you take for granted when you're at CMU."
He praises the university's collaborative nature.
"It's such a power," Kiran says. "The Computer Graphics Lab had an amazing sense of camaraderie. You had people from different classes, schools and divisions working on different projects, but there was a collective understanding of the field and how to push it forward. It was incredible."
Kiran left ILM in 2015 to start Loom.ai, which used then-fledgling deep learning technology to create real-time avatars for 3D games and virtual reality. After almost five years and dozens of partnerships with companies like Qualcomm, VMware and Samsung (which licensed Loom.ai's technology for millions of mobile devices), Loom.ai was acquired by Roblox, a platform for building immersive worlds and communities.
And while he's happy leading the Loom.ai team as it integrates the latest avatar technology into Roblox, Kiran's also looking at novel approaches for enhancing the platform's creativity. He notes that the scale and variety of user-generated content on Roblox provide unique opportunities for innovations in 3D deep learning.
"There's a lot of exciting innovation going on in this space. AI, especially large language models, and deep learning technologies are things that you'll start experiencing more and more as people get access to 3D," says Kiran, who lives in the Bay Area with his wife who he met in Pittsburgh and his twin daughters. "I intend to pursue more creative avenues that merge 3D and AI."
That doesn't mean that he's forgotten about CMU and SCS. In fact, he still fondly recalls one particular event from his first week on campus.
"Limos came and took us from Smith Hall to the National Robotics Engineering Center, and we got to see these iconic robots I had been dreaming about," he says. "You had all these really old robots — things I had read about and really wanted to see. Maybe it reveals a certain aspect about me but just seeing these was fascinating."
Maybe even magical.