The Search Engine - Corporate and Institutional Partnerships - Carnegie Mellon University

Revolutionizing Search


While the World Wide Web has changed and grown, one activity has remained constant: searching for information. Searching on the Web has simultaneously become easier, as improved search engines and services have been developed, and more complicated, as the amount of information on the Web grows every day.

Carnegie Mellon researchers have played an important role in developing search technology over the years. Some highlights include the creation of Lycos in 1994, the search engine with its huge directory of documents; the development of Vivisimo, a new software tool that helps make search results more manageable through categorization; and the birth of the ESP Game, an online game created to use human brainpower to label images on the Web for more efficient searching.

Lycos

In July 1994, Carnegie Mellon student Michael Mauldin created the Lycos search engine, which contained a directory of 54,000 documents. According to Mauldin, Lycos was named for Lycosidae lycosa, the scientific name for the wolf spider, which catches its prey by pursuing it rather than by trapping it in a web.

One month after its release, Lycos contained a directory of 394,000 documents, and by January of 1995 it contained 1.5 million. By November of 1996, Lycos had indexed more than 60 million documents, which was more than any other search engine on the Web at that time. In addition to its huge directory, Lycos was the first to offer short descriptions called "outlines" for each document as well as a match score and a keyword list.

Vivisimo

"Better search results than Google?" reads a recent CNN headline. Could it be possible? Most search engines return too much information, much of it irrelevant. Vivisimo, a new software system developed by Carnegie Mellon researchers, addresses that problem by quickly categorizing the results. The software queries search engines and groups the resulting documents based on their summaries in hierarchical categories. The technology is unique in that it is fully automated and requires no maintenance.

"Vivisimo's 'just-in-time' software technology helps to make sense of the many hundreds of citations returned by search engines exploring the Web," says Raul Valdes-Perez, Vivisimo president and co-founder and senior research computer scientist at Carnegie Mellon. "Just before a user receives a long, tedious list of search results, Vivisimo groups them into a PC-folder-style hierarchy that can be comfortably browsed."

Vivisimo works with Yahoo!, AltaVista, MSN, and Lycos, as well as some government sites (including FirstGov.gov and PubMed) and corporate, news, and university sites. Vivisimo's free public Web site is located at www.vivisimo.com.
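The grouping step described above can be illustrated with a small Python sketch. This is not Vivisimo's algorithm, which the article does not describe in detail; it is a simplified illustration that sorts results into flat folders by terms shared across their summaries, with no hierarchy. The function names, the stopword list, and the sample results are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical stopword list; a real system would use a much larger one.
STOPWORDS = {"a", "an", "and", "the", "of", "for", "in", "on", "to", "with", "is"}


def keywords(text):
    """Return the distinct lowercase terms in text, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def group_results(results, query, min_size=2):
    """Group (title, summary) pairs into folders keyed by a shared term.

    Terms from the query itself are skipped: nearly every result contains
    them, so they would not separate one group from another.
    """
    query_terms = keywords(query)
    folders = defaultdict(list)
    for title, summary in results:
        for term in keywords(summary) - query_terms:
            folders[term].append(title)
    # Keep only terms shared by at least min_size results.
    return {term: titles for term, titles in folders.items() if len(titles) >= min_size}


# Made-up search results for the query "jaguar".
results = [
    ("Jaguar Cars", "Official site for Jaguar luxury cars"),
    ("Jaguar dealers", "Find dealers selling Jaguar cars"),
    ("Big cats", "The jaguar is a wild cat of the Americas"),
    ("Jaguar facts", "Habitat and diet of the wild jaguar"),
]
folders = group_results(results, "jaguar")
for term, titles in sorted(folders.items()):
    print(term, "->", titles)
```

With this sample data, the car pages and the animal pages end up in separate folders, which is the kind of browsable grouping the quote describes. A hierarchical version would repeat the grouping inside each folder.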

ESP Game

The ESP Game was developed by Luis von Ahn, a graduate student in computer science at Carnegie Mellon. He developed the game to help solve the problems associated with searching for images on the Web.

Currently, search engines find images by using the names of image files and the words that surround them. Computers are not very good at distinguishing images from each other using visual cues. Humans, however, are great at this, and von Ahn's game puts that power to work. While people entertain themselves, they are also helping to label hundreds of thousands of images, or more, on the Web, which will in turn make searching for images more efficient.

To play, people log on at www.espgame.org, where they are matched randomly with an anonymous partner. Each partner sees the same image, and each one independently lists words that describe or label it. Neither partner can see the other's guesses, but once they've guessed the same word, they move on to the next image. The idea is to agree on as many images as possible in two and a half minutes, earning more points for more images.
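The matching rule can be sketched in a few lines of Python. This is only an illustration of the rules as described here, not von Ahn's implementation: the function names and the sample guesses are hypothetical, the two players' guesses are replayed in alternating order to stand in for simultaneous typing, and the two-and-a-half-minute time limit is not modeled.

```python
def first_agreement(guesses_a, guesses_b):
    """Return the first word that both players have typed, or None.

    Each player's guesses are kept in a private set, mirroring the rule
    that neither partner can see the other's guesses.
    """
    seen_a, seen_b = set(), set()
    for i in range(max(len(guesses_a), len(guesses_b))):
        if i < len(guesses_a):
            word = guesses_a[i].strip().lower()
            if word in seen_b:
                return word
            seen_a.add(word)
        if i < len(guesses_b):
            word = guesses_b[i].strip().lower()
            if word in seen_a:
                return word
            seen_b.add(word)
    return None


def label_images(session):
    """Map each image to its agreed label, skipping images with no match."""
    labels = {}
    for image, (guesses_a, guesses_b) in session.items():
        word = first_agreement(guesses_a, guesses_b)
        if word is not None:
            labels[image] = word
    return labels


# Made-up guesses from two partners for two images.
session = {
    "img1.jpg": (["dog", "grass", "ball"], ["puppy", "ball", "dog"]),
    "img2.jpg": (["car"], ["truck"]),
}
labels = label_images(session)
print(labels)       # agreed labels, one per matched image
print(len(labels))  # number of images the partners agreed on
```

In this sketch the agreed word becomes the image's label, and the number of matched images stands in for the score.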

While the game is no doubt fun, it also serves the larger need of labeling images on the Web. The idea behind this method is what von Ahn calls "Stealing Cycles from Humans," using people's brains collectively like a supercomputer.

The ESP Game grew out of von Ahn's work on CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart), a program that generates and grades tests, such as image recognition tasks, that humans can pass but computers cannot. The CAPTCHA work in turn came out of ALADDIN, the National Science Foundation-funded Algorithmic Adaptation, Dissemination, and Integration project in Carnegie Mellon's School of Computer Science.