Three Universities Join SCS in Yahoo! Cloud Computing Research

Yahoo! Inc. is expanding its successful research partnership with Carnegie Mellon, with three more universities slated to begin using Yahoo!’s M45 computing cluster to advance cloud computing research.

The University of California at Berkeley, Cornell University and the University of Massachusetts at Amherst will join Carnegie Mellon in using the M45 cluster, which has approximately 4,000 processor cores and 1.5 petabytes of disk storage. Academic researchers have previously had limited access to computing resources of this scale. The Yahoo! research partnership thus enables large-scale systems software research and allows academic researchers to explore new applications that analyze Internet-scale data sets, ranging from voting records to online news sources.

“We have been using the Yahoo! cluster for more than a year now and have made significant progress in a number of key research areas, resulting in the publication of more than two dozen academic papers,” said Randal E. Bryant, dean of the School of Computer Science. “Our researchers were able to extract and process documents from the Web in a way that was not possible before, changing the way we think about research problems. We were also able to conduct research over a corpus of 200 million Web pages, processing two orders of magnitude more data. We conducted systems software research, comparing, for example, the performance of the Hadoop file system and other parallel file systems.”

Hadoop is an open source distributed file system and parallel execution environment that runs on the M45 cluster, enabling its users to process massive amounts of data.

“Hadoop powers many of our most broadly used and complex systems at Yahoo!, from Web search to optimizing content for the home page,” said Shelton Shugar, senior vice president of cloud computing at Yahoo!. “Continuing to invest in the open source community and in technologies like Hadoop is an important element in our efforts to drive breakthroughs in Internet-scale computing and ultimately to continually improve the quality of the consumer experience of Yahoo!. By partnering with these top educational institutions to share our M45 cluster and our technical expertise, we hope to further key insights into the next generation of systems software research and development.” 

“The simultaneous access to applications and systems software has been a real benefit,” Bryant said, “and we look forward to our continued partnership with Yahoo! and joint contributions to the cloud computing community.”

Byron Spice