Unlocking Energy Efficient AI
By: Scott Andes
The national conversation on energy and AI centers on increasing our electricity capacity to fuel American computing power. Research at Carnegie Mellon shows that a parallel strategy for growing our AI capabilities is to dramatically reduce the energy needs of AI.
Why it matters: Policymakers are speeding ahead to meet AI energy needs, which is the right immediate step. But to win the AI race, the United States needs to be the first to innovate compute strategies that require less power. The stark reality is that energy-hungry AI applications could cause electricity demand to outstrip supply, potentially triggering an electricity crisis in the United States.
But at CMU we think there is another way to look at the problem. Rapid improvements to specialized large language models (LLMs), AI agents that automate learning-system development, localized edge computing, and other technologies leaving our labs hold the potential to significantly reduce the energy needs of AI models. Doing so would bend the AI cost curve to America’s economic advantage.
Key Insights: CMU research shows there are several opportunities on the technological horizon that could soon be deployed to reduce the energy needed to run AI models. These include:
- Specialized LLMs: Industry is investing heavily in creating larger, more powerful LLMs. But more specialized models can take on much of the workload of larger LLMs. Specialized models — which, for example, might be good at coding but not at composing a song — can be smaller and far more energy efficient. Deploying specialized LLMs at scale will require significant investment by industry, as well as more work to prove where, and for what problems, they are most effective.
- Edge Computing: Increasingly, AI models can run locally (on your phone, tablet or PC) without a data center. Localized “edge computing” is not only far more energy efficient — because queries don’t need to travel from your device to a data center, run on large server farms, and return to your device — but can also be more secure. There are early signs of these models taking off, yet more research is needed on co-designing system infrastructure and algorithms to enable portable, effective deployment of AI on the edge.
- AI-driven Automated Learning Systems: Industry releases new generations of GPUs almost every year, and each generation requires hundreds of engineers to build the efficient system frameworks that support it. There is great potential to use AI agents to automate this learning-system software development, shortening the product cycle, making the best use of the hardware, and, as a result, accelerating innovation in reducing energy consumption.
What we’re doing: At CMU, we are advancing these and many other AI strategies that can improve the energy efficiency of AI systems. The Catalyst Group at CMU is an interdisciplinary machine learning and systems research group exploring all aspects of the frontiers of AI and computing. In addition, CMU’s Scott Institute for Energy Innovation is supporting research to improve the performance and efficiency of AI infrastructure by a factor of 20.
The big picture: In advanced economies, data centers are projected to drive more than 20% of the growth in electricity use through 2030. And most AI applications can draw from servers globally. While it’s debatable whether the United States is the cheapest and easiest place to build large-scale infrastructure like data centers (China currently has twice the electricity-generation capacity of the U.S.), the U.S. is the unquestionable leader in AI innovation.
Reducing the energy needed to power AI through innovation can provide a durable competitive advantage. There is a real risk that the U.S. power grid will not be able to handle future AI demands — and industry can always build more data centers abroad. Reducing energy demands and improving LLM efficiency could also drive down domestic computing costs, unleashing new AI applications for startups, researchers, hospitals and society as a whole.
What’s next: In the pre-PC era, it was famously predicted that six large mainframes, hidden in research labs, would meet the country’s computational needs. The personal computer shattered that paradigm and unlocked the digital era. High-power LLMs represent a similar paradigm that, sooner or later, innovation will disrupt. The only question is how quickly that shift will happen and whether the U.S. leads or follows.