What is DocuScope?

DocuScope is a text analysis environment with a suite of interactive visualization tools for corpus-based rhetorical analysis. The DocuScope Project began in 1998 as a collaboration between David Kaufer and Suguru Ishizaki at Carnegie Mellon University. David created what we call the generic (default) dictionary, which consists of over 40 million linguistic patterns of English classified into over 100 categories of rhetorical effects. Suguru designed and implemented the analysis and visualization software, which can annotate a corpus of text against any dictionary of regular strings classified into a hierarchy of rhetorical effects. Although we designed DocuScope as a tool for rhetorical analysis, we found that it is also very effective for developing the dictionary itself in a systematic fashion.
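
To make the idea of annotating text against a dictionary concrete, here is a minimal Python sketch. It is not the DocuScope implementation. The toy dictionary, its category names, the whitespace tokenizer, and the longest-match rule are all assumptions made for illustration.

```python
# Hypothetical toy dictionary: each pattern (a tuple of lowercase tokens)
# maps to a path in a category hierarchy (top-level category, subcategory).
# The entries and category names are invented, not taken from DocuScope.
TOY_DICTIONARY = {
    ("i", "think"): ("Stance", "Private Opinion"),
    ("i",): ("Stance", "Self Reference"),
    ("in", "the", "past"): ("Time", "Looking Back"),
    ("must",): ("Directive", "Obligation"),
}


def annotate(text, dictionary=TOY_DICTIONARY):
    """Tag a text with dictionary patterns, preferring the longest match.

    Assumption: when several patterns start at the same token, the longest
    one wins, and matched tokens are not reused by later patterns.
    """
    tokens = text.lower().split()  # naive whitespace tokenizer
    max_len = max(len(pattern) for pattern in dictionary)
    annotations = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = tuple(tokens[i:i + n])
            if candidate in dictionary:
                annotations.append((" ".join(candidate), dictionary[candidate]))
                i += n
                break
        else:
            i += 1  # no pattern starts at this token
    return annotations


if __name__ == "__main__":
    for matched, category_path in annotate("I think we must act"):
        print(matched, "->", " / ".join(category_path))
```

In this sketch "I think" is tagged as a single two-word pattern rather than as "I" alone, which is the kind of distinction a dictionary of multi-word strings makes possible.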

Theoretical Background

The default dictionary was developed based on David's and Brian Butler's earlier theoretical work in rhetoric (Kaufer & Butler 1996) and their applied work in representational theories of language (Kaufer & Butler 2000). Our latest theoretical framework, along with an overview of the generic dictionary, is presented in The Power of Words: Unveiling the Speaker and Writer's Hidden Craft (Kaufer, Ishizaki, Butler, & Collins 2004). For projects that used the generic dictionary, see Collins, Kaufer, Vlachos, Butler, & Ishizaki, 2004; Kaufer & Hariman, 2008; Kaufer & Al-Malki, 2009a; and Kaufer & Ishizaki, 2006.

Domain-Specific Custom Dictionaries

Analysts can also use DocuScope with a domain-specific (i.e., custom) dictionary. They can customize the generic dictionary by using only subsets of it, or they can create a completely new dictionary. DocuScope allows analysts to explore and build a new domain-specific dictionary systematically. For example projects that used custom dictionaries, see Al-Malki, Kaufer, Ishizaki, & Dreher, 2012 and Kaufer & Al-Malki, 2009b.
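
As a rough illustration of what using only a subset of a dictionary can mean, the Python sketch below keeps only the patterns whose top-level category is in a chosen set. The dictionary format, the category names, and the function `subset_dictionary` are hypothetical and do not describe DocuScope's actual file format or interface.

```python
# Hypothetical dictionary: pattern -> (top-level category, subcategory).
# Entries and category names are invented for illustration.
GENERIC_DICTIONARY = {
    ("i", "think"): ("Stance", "Private Opinion"),
    ("in", "the", "past"): ("Time", "Looking Back"),
    ("must",): ("Directive", "Obligation"),
}


def subset_dictionary(dictionary, keep_categories):
    """Return a new dictionary restricted to the given top-level categories."""
    return {
        pattern: path
        for pattern, path in dictionary.items()
        if path[0] in keep_categories
    }


# A custom dictionary that keeps only two of the three top-level categories.
custom = subset_dictionary(GENERIC_DICTIONARY, {"Stance", "Directive"})
```

The original dictionary is left unchanged, so several custom subsets can be derived from the same generic dictionary.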

Shakespeare Project

Michael Witmore, Director of the Folger Shakespeare Library, and Jonathan Hope of the University of Strathclyde have used DocuScope for years to analyze Shakespeare and other early modern texts. You can read more about this project in the Early Modern Literary Studies journal article "The Very Large Textual Object: A Prosthetic Reading of Shakespeare." See also the article in Forbes Magazine and the Digital History blog post.

What if I want to use DocuScope on my own textual corpus?

Unfortunately, we don't have the resources to support the use of DocuScope outside of our research group and our students. Fortunately, the Working Group for Digital Inquiry at the University of Wisconsin-Madison has received funds from the Mellon Foundation to construct an environment that will allow scholars to have their texts analyzed by a variety of methods, including the default dictionaries of the DocuScope environment. We will notify people on this site when that infrastructure is ready. If you have an interesting data set that you'd like to analyze with DocuScope, you can contact David Kaufer by email, and he will assess whether the default dictionaries can add value to your analysis.

References



Al-Malki, A., Kaufer, D., Ishizaki, S., & Dreher, K. (2012). Arab Women in Arab News: Old Stereotypes and New Media. Bloomsbury Academic.

Kaufer, D. & Butler, B. (2000). Designing Interactive Worlds with Words: Principles of Writing as Representational Composition. Routledge.

Kaufer, D. & Butler, B. (1996). Rhetoric and the Arts of Design. Routledge.

Kaufer, D., Ishizaki, S., Butler, B., & Collins, J. (2004). The Power of Words: Unveiling the Speaker and Writer's Hidden Craft. Routledge.


