Author: Michael Witmore

An Ecology of Critical Gestures: Point, Circle and Name

In this post I attempt to isolate what I take to be three basic critical gestures that are performed in both “traditional” and “iterative” literary criticism. These gestures are: pointing, circling, and naming.

Before giving examples of these three gestures, a few words about what makes them possible in digital work: establishing a corpus, defining the limits of a single text, saying what will count as a token, and dividing these tokens (usually words or strings of words) into types.

As an example, I take some of the work we’re doing on the Mellon grant, Visualizing English Print (VEP), which is currently focused on a corpus of 1080 texts. To prepare for this study, we have first to establish a corpus of texts, here 1080 items from the EEBO/TCP corpus, each over 500 words long, representing a selection of the full available contents in that corpus. The years represented in the sample range from 1530-1799.

The boundaries and contents of individual texts must also be established: we take each of the items in the corpus and treat the transcribed contents as stable, relying on the manual process of transcription to render the wayward glyph stream that is early modern print into an iterable, stable series of characters. The boundaries of the physical document are now the boundaries of a digital text that, among other things, can be addressed as a container of words.

Next we consider each text as a collection of tokens – words or strings of words – that can be aggregated into subgroups of types. This is a motivated process, since no text will tell you how to group its contents. The decision to group certain tokens into types is just that: a decision or motivated judgment based on a set of interpretive criteria whose result – words grouped into types – is a tagging scheme. (One can use each word as a type, but that is simply a limit case.) If the tagging scheme is that of Docuscope, we will be working with the familiar series of types that Jonathan Hope and I advert to in our analyses: FirstPerson, CommonAuthority, etc.

The number of possible tokenization schemes is potentially infinite. For example, we could type the tokens by grammatical function (noun, verb), linguistic origin (words entering the language after the Norman Conquest), or – to illustrate the point – “all of the words Jonathan Hope really likes” versus “all of the words Mike Witmore really likes.” The Oulipien Georges Perec, on the other hand, might isolate all words missing the letter “e.” All words used by Abraham Lincoln, as opposed to Hillary Clinton. You get the picture.

Based on this tagging scheme, we use mathematics to describe relationships among texts in our corpus, knowing full well that the complexity of the texts has been caricatured through a series of abstractions. Tokens reside – like elements in a logical set – within their types, all of which are present or absent to varying degrees in the individual containers (texts) of our corpus. Represented in this abstract form, we are dealing with extensions of sets and relationships among those sets, none of which are interpretable: these are simply things to be described.

I would call this initial procedure a diagrammatic reduction of the complexity of texts, diagrammatic because the encapsulation of types of words within different texts – when represented mathematically – highlights some things while hiding others. The point of a diagram is not to represent all aspects of the thing diagrammed: its power comes from its ability to illuminate a mode of relation among elements in the thing being represented. I draw a circle in a painting and it represents the sun: the circle is not the sun, and it oversimplifies the edge of the object that we call the sun, since that object has no stable “edge.” But the circle caricatures that object in an interested, useful, and sharable way. What I want you to grasp about the sun, and all other spherical things represented by a circle, is that we can describe its shape in a shorthand: it is one of those things whose edges all sit at the same distance from its center. The diagram is a tool for compression.

We could debate whether or not this particular relation is one of the things that is most important about the sun, debating the correspondence of the diagram with the object. That’s one conversation. But once we begin thinking about the diagram itself, which is a sketch of a set of relations, we might (as I do) feel comfortable talking about the reality of a set of relations. Mathematics is one language in which that reality can be described, and our analysis of texts – and the visualizations we use to explore and conduct such analysis – are just like this circle standing for the sun.

Now think about this snippet below of one of our very large dendrograms, which uses Ward’s clustering algorithm to show the linguistic distances (derived from Docuscope counts) between texts in our corpus. The texts have already been “reduced” to points in a multidimensional space, where each dimension represents a type that Docuscope counts and groups in the individual text-containers. (The types, of course, could have been anything else, and the results would be different.)

First interesting finding: most of the religious prose texts in this corpus, represented in purple, cluster together, but are distinct from another cluster of fictional prose texts. (See Pamela, Ko89369.003 in the olive green above: genre designations were derived from the Helsinki corpus and applied “by hand” to the 1080 items.) Once we begin working at this level of abstraction, we are well into the realm of interpretation, even though we are working with a diagram that is not itself an interpretation. Why? Because we are asking if certain types of words (the ones counted, in this case, by Docuscope) are present or absent in certain kinds of texts (texts identified as being fictional, subdivided into novels and plays).

What are the layers of interpretation, or choices, that could have been otherwise? The first is the selection of items for the corpus and stabilization of the texts, their boundaries, and contents. The second layer is distributed within in the tokenization scheme (Docuscope) that has been used to measure the abstract relationships among types of words, themselves contained to varying degrees within the texts represented here. Interpretive layer three is built into the colors used in the diagram. These colors represent genres or kinds of texts. Without this crucial layer of judgments – the circles draws around sets of texts that say, “these are different from others in a significant way” – we would be left with this:

Now consider doing this kind of work on a corpus of tens of thousands of texts. How would you interpret this very large diagram without any color coding to define groups? You couldn’t. The point is that one must first make a certain powerful critical gesture, drawing a circle around a group of things, if one wants to explore what makes various kinds of texts hang together at the level of the sentence – at the level of repeated or omitted bits of language.

And there is another critical gesture that occurs silently in the color coded diagram: the naming of a group. “These texts are religious prose, those prose fiction.” We have named two of the several kinds of creatures in this population. Do such names exhaust or fully describe the varieties of language present in these groups? Of course not. We employ names of this sort for much the same reason we are drawn to diagrams: because they have a certain compressive force, standing here for a suite of features, even a single feature, that somehow defines a group.

The final gesture in this critical ecology is pointing. I go to a particular text–say, Pamela, highlighted in the diagram below–and make a claim about its exemplarity based either on my own reading, its visible proximity to other texts, or both:

“Like its neighbor Clarissa, Pamela is representative of a kind of text called the novel.” As the statement suggests, I am interested in a certain sort of critical adjacency: what is next to the first exemplary or “classic” work, Clarissa, and why? Pamela, another famous novel, is a near neighbor. I agree they are of a similar kind and even know they are written by the same author. I know too that they are both epistolary. But what else can I say?

Having reached this point, I must now burrow down into exemplarity, making explicit what I mean when using this term (e.g., “prose fiction,” or “the novel”). I must, in other words, reverse the marvelous heuristic work of compression and identify the angle from which my diagrammatic reduction of their contents seemed natural or obvious.

This ability to compress a set of impressions using an example and then expand on what that example represents is utterly basic to our work in the humanities. “Show me your passages,” we say in class, or at a conference. The demand can also be made of an iterative criticism: “Show me the type of words that make this the kind of text it is. Let me see them acting ‘in the wild,’ in an exemplary passage.”

In later discussions, we will try to identify just such an exemplary passage within the circle of texts given the name of “prose fiction.” I pause before doing so to recognize that there is an astonishing amount of comparison underneath even the most simple critical claim such as, “Pamela is a representative work of prose fiction.” This one statement points, circles, and names: the three basic gestures in an ecology of criticism.

Such claims predate computers. Aristotle makes several in the Poetics, for example, when he identifies those features that make a given dramatic plot “necessarily finer” than others. We make the same sorts of claim today, although we can now do so in both discursive and diagrammatic form.

February 17, 2013
New Image from Original Post from Google Books

We had a request for a clearer version of the image we discussed in our post last year, which shows changes in the catalogued subject of Library of Congress books over the course of several hundred years. Jon Orwant from Google was kind enough to send an updated image, which we’re sharing here. I include below his description of the visualization.

“The visualization…was derived exclusively from the metadata feed of the Library of Congress. The LoC catalog doesn’t restrict itself to American or even English language books, but likely does have some sample bias (as every union catalog does).”

February 1, 2013
The Time Problem: Rigid Classifiers, Classifier Postmarks

Here is a thought experiment. Make the following assumptions about a historically diverse collection of texts:

1) I have classified them according to genre myself, and trust these classifications.

2) I have classified the items according to time of composition, and I trust these classifications.

So, my items are both historically and generically diverse, and I want to understand this diversity in a new way.

The metadata I have now allows me to partition the set. The partition, by decade, items, and genre class (A, B, C) looks like this:

Decade 1, 100 items: A, 25; B, 50; C, 25

Decade 2, 100 items: A, 30; B, 40; C, 30

Decade 3, 100 items: A, 30; B, 30; C, 40

Decade 4, 100 items: A, 40; B, 40; C, 20

Each decade is labeled (D1, D2 D3) and each contains 100 items. These items are classed by Genre (A, B, C) and the proportions of items belonging to each genre changes from one decade to the next. What could we do with this collection partitioned in this way, particularly with respect to changes in time?

I am interested in genre A, so I focus on that: how does A’ness change over time? Or how does what “counts as A” change over time? I derive a classifier (K) for A in the first Decade and use this distance metric to arrange all items in this decade with respect to A’ness. So my new description allows me to supply the following information about every item: Item 1 participates in A to this degree, and A’ness means “not being B or C in D1.” Let’s call this classifier D1Ka. I can now derive the set of all classifiers with respect to these metadata: D1Ka, D1Kb, D1Kc, D2Ka, D2Kb, etc. And let’s say I derive a classifier for A using the whole dataset. So we add DKa, DKb, DKc. What are these things I have produced and how can they be used to answer interesting questions?

I live in D1, and am confident I know what belongs to A having seen lots of examples. But I get access to a time travel machine and someone sends me a text written much later in time. It is a visitor from D4, and by my own lights, it looks like another example of A. So, I have projected D1Ka onto an item from D4 and made a judgment. Now we lift the curtain and find that for a person living in D4, the item is not an A but a B. Is my classifier wrong? Is this type of projection illegitimate? I don’t think so. We have learned that classifiers themselves have postmarks, and these postmarks are specific to the population in which they are derived. D1Ka is an *artifact* of the initial partitioning of my data: if there were different proportions of A, B, and C within D1, or different items in each of these categories, the classifier would change.

Experiment two. I live in D4 and I go to a used bookstore, where I find a beautifully preserved copy of an item produced in D1. The title page of the this book says, “The Merchant of Venice, a Comedy.” Nonsense, I say. There’s nothing funny about this repellent little play. So D1Ka fails to classify an A for someone in D4. Why? Because the classifier D4Ka is rigidly determined by the variety of the later population, and this variety is different from that found in D1. When classifiers are themselves rigidly aligned with their population of origin, they generalize in funny ways.

Wait, you say. I have another classifier, namely Ka produced over the entire population, which represents all of the time variation in the dataset of 400 items. Perhaps this is useful for describing how A’ness changes over time? Could I compare D1Ka, D2Ka, D3Kz and D4Ka to one another using DKa as my reference? Perhaps, but you have raised a new question: who, if anyone, ever occupies this long interval of time? What kind of abstraction or artifact is DKa, considering that most people really think 10 years ahead or behind when they classify a book? If we are dealing with 27 decades (as we do in the case of our latest big experiment), we have effectively created a classifier for a time interval that no one could ever occupy. Perhaps there is a very well-read person who has read something from each decade and so has an approximation of this longer perspective: that is the advantage of the durability of print, the capacity of memory, and perhaps the viability of reprinting, which in effect imports some of the variation from an earlier decade into a newer one. When we are working with DKa, everything is effectively written at the same time. Can we use this strange assumption — everything is written at once — to explore the real situation, which is that everything is written at a different time?

Another interesting feature of the analysis. This same type of “all written at the same time” reasoning is occurring in our single decade blocks, since when we create the metadata that allows us to treat a subpopulation of texts and belonging to *a* decade, we once again say they were written simultaneously. We use obvious untruths to get at underlying truths, like an astronomer using the inertial assumption to calculate forces, even though we’ve never seen a body travel in a straight line forever.

If classifiers are artifacts of an arbitrarily scalable partitioning of the population, and if these partitions can be compared, what is the ideal form of “classifier time travel” to use when thinking about how actual writing is influenced by other writing, and how a writer’s memory of texts produced in the past can be projected forward into new spaces? Is there anything to be learned about genre A by comparing the classifiers that can be produced to describe it over time? If so, whose perspective are we approximating, and what does that implied perspective say about our underlying model of authorship and literary history?

If classifiers have postmarks, when are they useful in generalizing over — or beyond — a lifetime’s worth of reading?

April 16, 2012
Google Books: Ratio of Inked Space to Blank Space

How could we create a proxy measure for the relative luxury of a book, and by extension the social prestige of its contents? One way of getting at this might be to measure the ratio of inked to non-inked space for a given work. While the measure is flawed — verse uses less page space, illustrations may sometimes apply more ink across the page — it is at least a starting point. What if Google Books were to publish the ratio of inked to non-inked space for all of the items it has scanned? We could then see how writing of different types, for example, plays or prose fiction, move into larger print formats such as the Folio.

April 14, 2012
What did Stanley Fish count, and when did he start counting it?

We have been observing the reaction to Stanley Fish’s critique of the Digital Humanities with great interest. Here is the full text of our comment, which could only be partially displayed on the New York Times comment window.

You know you’ve come up in the world if you’re being needled by Stanley Fish in The New York Times. Having done our share of work in the data mines, we believe Fish is right to insist that nothing in a text becomes evidence unless you have an interpretation which makes that evidence count. No amount of digital tabulation will substitute for a coherent, defensible reading.

As traditionally trained humanities scholars who use computers to study Shakespeare’s genres, we have pointed out repeatedly that nothing in literary studies will be settled by an algorithm or visualization, however seductively colorful. We have also argued that any pattern found through an iterative, computer-assisted analysis is meaningless without a larger interpretive framework in which to view it. It is the job of literary critics and historians to provide those interpretations, something they do by returning to the text and re-reading it with fresh eyes.

The job of digital tools is to draw our attention to evidence impossible or hard to see during normal reading, prompting us to ask new questions about our texts. This ability to redirect attention and pose new questions is the strong suit of certain kinds of digital humanities research. Indeed, we believe the addition of a digital prosthetic to our insistently human reading complements the skills of close textual analysis that are the staple of literary training. Not everyone in the so-called Digital Humanities community would agree with this position, but we believe the old and new techniques are entirely compatible.

What does it matter why Stanley Fish started minding his ps and bs in Milton? The point is that he has produced a plausible interpretation of Milton’s work based on evidence that fits his larger claim. The fact that an algorithm (“count ps and bs”) has directed his attention to something he hadn’t noticed doesn’t make the resulting pattern gibberish. You bet there are interesting patterns that show up in Milton when you mind his ps and bs. They existed before you counted them, and they exist after. However he found it, Fish has used that patterning to produce an interesting argument about the role of sound in Milton’s prose. And he has the evidence to back this argument up. In the end, he’s doing what most literary critics do in their work: create an interpretation that builds meaningfully on evidence in the text. Is there really any other way?

Yours sincerely,

Jonathan Hope, Strathclyde University

Michael Witmore, Folger Shakespeare Library

You can view a sample of our work at here.

January 27, 2012