CMU and IBM Collaborate on Open Computing System
For Advancing Research on Question Answering
“Watson” To Play “Jeopardy!” Champions, Feb. 14-16
Professor Eric Nyberg and his Ph.D. students, Nico Schlaefer and Hideki Shima, discuss the Watson project.
PITTSBURGH—Carnegie Mellon University today announced that it is one of eight universities collaborating with IBM to advance the Question Answering (QA) technology behind the IBM “Watson” computing system.
IBM and CMU have collaborated on the Open Advancement of Question Answering (OAQA) Initiative, which encouraged the creation of a computer architecture and methodologies that researchers could use to support the Watson system. Watson will compete against “Jeopardy!” champions Ken Jennings and Brad Rutter in a historic “man vs. machine” match that will air on “Jeopardy!” Feb. 14-16.
“IBM Watson is the first step in how computers will be designed and built differently and will be able to learn, and with the help of Carnegie Mellon we will continue to advance the QA technologies that are the backbone of this system,” said David Ferrucci, leader of the IBM Watson project team.
“The idea that a person could ask a computer a question in standard English and get a specific, accurate and authoritative answer has fired imaginations since the beginning of the computer age,” said Eric Nyberg, a professor in CMU’s Language Technologies Institute. “Despite years of work, major advances in transforming this from science fiction into reality have taken too long. That’s why Carnegie Mellon and IBM together developed this approach we call Open Advancement of Question Answering.”
The OAQA approach includes a modular software architecture and a common set of measurement standards. This enables researchers to test software components they have developed for specific QA tasks in a way that allows a direct comparison with the components of other researchers. This approach, which IBM used as it designed Watson, also allows individual components in the system to be easily swapped out or upgraded as necessary.
“With OAQA, researchers no longer have to reinvent the wheel every time they develop a new QA system,” Nyberg said. QA systems are complex, requiring software components for deciphering natural language, for searching through text documents and for formulating and evaluating potential answers. “In the past, it’s been all but impossible to determine the individual performance of these software components. And it’s been difficult for researchers to build on the success of others by simply plugging in the best available component into their own system. OAQA changes all of that,” he said.
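As a rough illustration of the modular idea described above, the following Python sketch wires three interchangeable components (question analysis, passage search and answer extraction) into one pipeline and scores two configurations with the same metric. It is a toy under stated assumptions, not the OAQA or UIMA code: every name in it (Pipeline, keyword_analyzer, overlap_retriever, first_word_extractor, last_word_extractor, accuracy, DOCS, GOLD) is hypothetical, and the two-sentence document collection is made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical component signatures; none of these names come from OAQA or UIMA.
Analyzer = Callable[[str], List[str]]               # question -> keywords
Retriever = Callable[[List[str]], List[str]]        # keywords -> ranked passages
Extractor = Callable[[List[str], List[str]], str]   # keywords, passages -> answer

# Toy document collection, assumed only for this illustration.
DOCS = [
    "Pittsburgh is home to Carnegie Mellon University",
    "Armonk is home to IBM",
]
STOPWORDS = {"which", "what", "city", "is", "home", "to", "the"}


def keyword_analyzer(question: str) -> List[str]:
    """Lowercase the question and drop the stopwords listed above."""
    words = question.lower().rstrip("?").split()
    return [w for w in words if w not in STOPWORDS]


def overlap_retriever(keywords: List[str]) -> List[str]:
    """Rank documents by how many keywords they contain; drop non-matches."""
    scored = [(sum(k in d.lower().split() for k in keywords), d) for d in DOCS]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [doc for score, doc in ranked if score > 0]


def first_word_extractor(keywords: List[str], passages: List[str]) -> str:
    """Return the first word of the top-ranked passage."""
    return passages[0].split()[0] if passages else ""


def last_word_extractor(keywords: List[str], passages: List[str]) -> str:
    """Return the last word of the top-ranked passage."""
    return passages[0].split()[-1] if passages else ""


@dataclass
class Pipeline:
    """A QA system assembled from three swappable components."""
    analyze: Analyzer
    retrieve: Retriever
    extract: Extractor

    def answer(self, question: str) -> str:
        keywords = self.analyze(question)
        return self.extract(keywords, self.retrieve(keywords))


def accuracy(pipeline: Pipeline, gold: List[Tuple[str, str]]) -> float:
    """Shared metric: fraction of questions answered exactly right."""
    return sum(pipeline.answer(q) == a for q, a in gold) / len(gold)


GOLD = [
    ("Which city is home to IBM?", "Armonk"),
    ("Which city is home to Carnegie Mellon University?", "Pittsburgh"),
]

# Two systems that differ only in the answer-extraction component.
baseline = Pipeline(keyword_analyzer, overlap_retriever, last_word_extractor)
swapped = Pipeline(keyword_analyzer, overlap_retriever, first_word_extractor)

for name, system in [("baseline", baseline), ("swapped", swapped)]:
    print(name, accuracy(system, GOLD))
```

Because the two pipelines share every component except the extractor and are scored against the same question set, any difference in accuracy can be attributed to that one component, which is the kind of direct comparison the common measurement standards are meant to allow.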
In addition to IBM and CMU, researchers at the Massachusetts Institute of Technology, the University of Texas at Austin, the University of Southern California, Rensselaer Polytechnic Institute, SUNY Albany, the University of Trento and the University of Massachusetts Amherst have participated in the OAQA Initiative. The elements of OAQA are open source and available for download from the OAQA website, http://mu.lti.cs.cmu.edu/trac/oaqa.
“We are glad to be collaborating with such distinguished universities and experts in their respective fields who can contribute to the advancement of QA technologies that help enable the Watson system,” IBM’s Ferrucci said. “The success of the Jeopardy! challenge will break barriers associated with computing technology’s ability to process and understand human language, and will have profound effects on science, technology and business.”
Computer scientists have been designing QA systems for 50 years, and IBM and Carnegie Mellon have long been collaborators. This relationship began in 2001 with a federally sponsored program called Advanced Question Answering for Intelligence (AQUAINT). AQUAINT included an investigation of software architectures and tools that could be of general benefit to QA researchers. That investigation sparked an ongoing IBM-CMU collaboration on the development and dissemination of the Unstructured Information Management Architecture (UIMA), an open-source software library that has become a fundamental part of QA systems developed at CMU and IBM.
In 2007, Nyberg and his Ph.D. students, Nico Schlaefer and Hideki Shima, began working with IBM on the Watson project. Realizing that success with Watson could spur new QA research and also support new business applications, IBM and Carnegie Mellon began pioneering the development of OAQA in 2008. Nyberg, Schlaefer and Shima discuss Watson and their contributions in this video: http://www.youtube.com/watch?v=ls2IgNiOftA.
Jeopardy! questions are just one type of QA task. In contrast to general keyword search engines such as Google or Bing, QA systems are designed to provide useful answers to specific questions posed by people using their everyday language. Jeopardy! probes general knowledge and places a premium on speed, but other systems can be configured to provide information about a specific range of products or to provide deep answers to questions on highly technical or legally complex subjects.
The OAQA approach is being used in a Machine Reading project sponsored by the Defense Advanced Research Projects Agency that began last year. IBM, the prime contractor, is working with Carnegie Mellon, the University of Texas at Austin, the University of Southern California and the University of Utah to develop a universal text engine that captures information from natural language texts and stores it in a knowledge base that can be readily accessed by QA and other computer systems.