From streaming online video to online shopping to sharing those family photographs on Facebook, the systems that major internet services use to store all their data require more energy than ever before.
In datacenters, the cost of electricity now equals or surpasses the cost of the computing machines themselves over their typical lifespan. It's a real problem, both for the environment and for the bottom line. Enter Carnegie Mellon.
Researchers at Carnegie Mellon University and Intel Labs Pittsburgh (ILP) have combined low-power, embedded processors typically used in netbooks with flash memory to create a better server architecture. Not only is it fast, but it's far more energy efficient for data-intensive applications than the systems now used.
Datacenters being built today require their own electrical substations, and future datacenters may require as much as 200 megawatts of power. An experimental computing cluster based on the researchers' so-called Fast Array of Wimpy Nodes (FAWN) architecture was able to handle 10 to 100 times as many queries for the same amount of energy as a conventional, disk-based cluster. At peak utilization, the cluster operates on less energy than a 100-watt light bulb.
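The comparison above amounts to a queries-per-joule metric. The short sketch below illustrates how that ratio works; the absolute throughput and wattage figures are hypothetical placeholders, not measurements from the FAWN paper — only the 10-to-100-fold range comes from the article.

```python
# Illustrative queries-per-joule comparison. The numbers below are
# hypothetical placeholders; only the 10-100x ratio is from the article.

def queries_per_joule(queries_per_sec: float, watts: float) -> float:
    """Energy efficiency: queries/sec divided by joules/sec (watts)."""
    return queries_per_sec / watts

# Two clusters serving the same load at very different power draws.
conventional = queries_per_joule(queries_per_sec=1000, watts=400)
fawn_like = queries_per_joule(queries_per_sec=1000, watts=10)

print(fawn_like / conventional)  # 40.0 — within the 10-100x range
```

At equal throughput, the efficiency advantage is simply the ratio of power draws, which is why low-power nodes can win even without being individually fast.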
The research team, led by David Andersen, Carnegie Mellon assistant professor of computer science, and Michael Kaminsky, senior research scientist at ILP, recently received a best paper award from the Association for Computing Machinery for its report on FAWN.
"FAWN systems can't replace all of the servers in a datacenter, but they work really well for key-value storage systems, which need to access relatively small bits of information quickly," Andersen said. Key-value storage systems are growing in both size and importance, he added, as ever larger social networks and shopping websites keep track of customers' shopping carts, thumbnail photos of friends and a slew of message postings.
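The workload Andersen describes can be pictured as a simple get/put interface over small objects. The sketch below is a minimal in-memory stand-in for such a store; the class and method names are illustrative assumptions, not taken from the FAWN implementation.

```python
# Minimal sketch of a key-value storage interface of the kind FAWN targets.
# A plain dict stands in for a node's flash-backed store; all names here
# are hypothetical, not from the FAWN codebase.

class KeyValueStore:
    """Stores small values (cart entries, thumbnails, posts) under string keys."""

    def __init__(self):
        self._data = {}

    def put(self, key: str, value) -> None:
        self._data[key] = value

    def get(self, key: str, default=None):
        return self._data.get(key, default)

# Usage, mirroring the article's examples of small, quickly accessed items.
store = KeyValueStore()
store.put("cart:alice", ["book", "lamp"])
store.put("thumb:bob", b"...thumbnail bytes...")
print(store.get("cart:alice"))  # ['book', 'lamp']
```

Each request touches one small record, so per-query work is dominated by I/O rather than computation — exactly the balance that favors pairing slow, low-power processors with fast flash storage.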
Flash memory is significantly faster than hard disks and far cheaper than dynamic random access memory (DRAM) chips, while consuming less power than either. Though low-power processors aren't the fastest available, the FAWN architecture can use them efficiently by balancing their performance with input/output bandwidth.
In conventional systems, the gap between processor speed and memory bandwidth has widened steadily for decades, creating bottlenecks that keep fast processors from operating at full capacity even as those processors draw a disproportionate share of the power.
"FAWN will probably never be a good option for challenging real-time applications such as high-end gaming," Kaminsky said. "But we've shown it is a cost-effective, energy efficient approach to designing key-value storage systems and we are now working to extend the approach to applications such as large-scale data analysis."
The work was supported in part by gifts from Network Appliance, Google and Intel Corp., and by a grant from the National Science Foundation. In addition to Andersen and Kaminsky, the research team included Carnegie Mellon Ph.D. computer science students Jason Franklin, Amar Phanishayee and Vijay Vasudevan, and graduate student Lawrence Tan.