Month: August 2009

Comic Twelfth Night, Tragic Othello (Part III)

One of the aims of this kind of work is to find new things to think about or appreciate in texts that have been analyzed with traditional methods of literary criticism. But one does not always need an outside prompt like statistics to begin exploring counterintuitive ideas about how literary or dramatic texts work. Among traditional literary critics, some very distinguished readers (or auditors) of Shakespeare’s plays have argued that he sometimes builds one type of play on the foundations of another. Susan Snyder, for example, argued in the late 1970s that there is a comic “matrix” underlying Shakespeare’s tragedies. Shakespeare, that is, built some of his tragedies, Othello in particular, on structures that would ordinarily be employed in comedy, and in doing so heightened the emotional effect of the downturn in the plays when things deteriorate. There is thus a certain, almost structural irony to Othello. Some of what you see happening on stage seems to evoke the expectations of comedy (and its happy conclusions), but what eventually transpires is the opposite. While this may sound emotionally perverse, I think it is exactly what Shakespeare was up to in Othello, and I’m not surprised that a reader as careful and informed as Snyder was able to figure this out. One of the most interesting consequences of this reading is that we begin to think of genre as something dynamic: a transaction between a spectator and a company that is full of false starts, head fakes, and allusive gestures. Perhaps rather than a recipe or essence, theatrical genre is really an oscillation between certain generic possibilities at a given moment in time.

However we choose to think about genre, I think it is safe to assume that we never encounter specimens that are “pure to type.” As in the case of illustrators of botanical species, the artist may have one or many individual specimens at hand, but the question is always whether or not to “idealize” or “mix” the specimens in order to depict the ideal type. Such types do not really occur in nature. Or if one settles on a particular example as the ideal, then it will be, strictly speaking, a class of one, since all other specimens will deviate slightly from the illustrated example.

When we turn to the population that is mapped by Docuscope, we see immediately that Othello is not “true to type.” Othello is placed, as perhaps Snyder would have predicted, in the same sector where many comedies gather, a sector that we have labelled comic in keeping with the classifications of Shakespeare’s editors. I repeat the diagram from the earlier post here:

Shakespeare Plays in Scatter Plot rated in Principal Components in R

So, is Docuscope “right” in calling Othello a comedy? Was Snyder “right” in saying that the play was built on a comic “matrix”? Is there anything to be learned from the fact that Docuscope and a particularly distinguished critic agree on where Othello belongs? We should begin thinking about these questions by looking at specific passages. Below is an exchange between Othello and Iago, a dialogue between two individuals that looks a lot like the comic exchanges we examined from Twelfth Night, particularly the exchange between Cesario and Olivia. This is the beginning of what some critics have called the seduction of Othello by Iago, a seduction that culminates in Othello’s kneeling before his former servant in a new misogynistic alliance:

Open Source Shakespeare, Othello 3.1

Docuscope Tagged Othello 3.1

The first thing to notice here is that this is yet another passage in which I/you interaction (blue and red strings) is occurring quickly, at the expense of concrete description. This is what, statistically speaking, is pushing the passage up and to the left in the scatter plot above. If there is a comic matrix here — and not just in the happy set-up of the early acts — it is, from a linguistic point of view, the continued stance that allows a “withholding speaker” (Iago) and an eager listener (Othello) to push back and forth on one another. Othello here is playing the role of Olivia in Twelfth Night, trying to delve further into the thoughts of his interlocutor (which is keeping the I/you, I/thee pronouns coming) while Iago is playing a sort of Cesario, refusing to give the speaker something he wants (and in doing so, goading the speaker on). The parallel is perverse, but it shows that a very different emotional trajectory can take shape on a similar linguistic footing, much as a dancer can perform different body movements on a similar footing or stance.
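To make the counting concrete, here is a minimal sketch in Python of the kind of tally that separates a quick I/you exchange from concrete description. It is not Docuscope: the two short pronoun lists are my own hypothetical stand-ins for the First Person and Interaction categories, which in the real tool match far more strings, and the two sample sentences are invented toy text rather than quotations from the plays.

```python
import re

# Hypothetical stand-ins for two Docuscope categories. The real
# dictionaries are much larger; these few pronouns are an assumption.
FIRST_PERSON = {"i", "me", "my", "mine"}
SECOND_PERSON = {"you", "your", "thou", "thee", "thy", "thine"}


def pronoun_rates(passage):
    """Return (first-person, second-person) tokens per 100 words."""
    words = re.findall(r"[a-z']+", passage.lower())
    if not words:
        return 0.0, 0.0
    first = sum(w in FIRST_PERSON for w in words)
    second = sum(w in SECOND_PERSON for w in words)
    return 100 * first / len(words), 100 * second / len(words)


# Invented toy sentences, not lines from Othello or Twelfth Night.
dialogue = "I ask thee what thou dost think, and thou wilt not tell me."
description = "The ship lay in the harbour under a grey sky near the old stone wall."

print(pronoun_rates(dialogue))
print(pronoun_rates(description))
```

The toy dialogue scores high on both rates and the toy description scores zero on both, which is all the count can say. It registers that the pronouns occur, not what the speakers are doing with them.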

The next passage deepens the analogy in disturbing ways. In this scene from the fourth act, we have close exchanges between Othello and Desdemona that are structurally similar to those of the recognition scene in Twelfth Night. Notice how Othello’s complaints echo the type of complaints one hears from a Petrarchan lover, although they emerge from a type of alienation and tragic emotional development that Docuscope can’t count in its perpetual “now.”

Open Source Shakespeare Othello 4.2

Docuscope Tagged Othello 4.2

“What art thou,” Othello asks. And Desdemona answers, “Your wife, my lord; your true / And loyal wife.” Like Viola declaring who she is to Sebastian in Twelfth Night, Desdemona here is reasserting who (not what) she is in the face of something like a disguise that has been forced upon her by the accusations of Iago. She is trying to puncture the veil of Othello’s illusion. Yet, instead of the gladness of recognition, we get a strange catalogue of personal suffering, a lover’s complaint over a loss he has never really suffered. This could, in other words, be a catalogue of suffering that has ended, but instead Shakespeare writes it as a kind of torment that has just begun. Linguistically, it contains all of the strings that Docuscope sees as key in clustering this play together with others we would call comedies. But comic it is not.

What fascinates me about passages that are anti-generic in type is that they show the deep flexibility of anything we might call a structure or matrix on the linguistic, statistical level. There is no “essential structure” of comedy here, since tragedies can exploit the same postures or stances that comedies use to comic effect. This is something a counting machine can “see,” but it is also something that a sensitive critic can see as well. But a critic might not describe that matrix in the way that I have here — as a collection of present and absent linguistic tokens classed by type — and this is where Docuscope begins to throw up new questions about the play, about genre and about reading. When Snyder said that Othello has deep affinities with comedies, was she reacting to the linguistic cues described above? Are these features “co-occurrent” with the more intensive features that she as a critic did read for? What is the nature of this co-occurrence or shared footing of particular linguistic patterns and generic types? And how much anti-typical language can there be in a play of a given type — for example, how much “comic” language can a tragedy like Othello tolerate? Finally, what does this type of linguistic borrowing say about the ways in which genre is staged, cued, and self-consciously manipulated by authors? Would it be self-defeating to say that Othello is a good tragedy because it uses comic linguistic features? This latter claim would, of course, be a matter of interpretation. But it is possible, by splitting up the plays into smaller bits or “chunks” to see how often they stray into other generic territories, and to quantify just how convergent they are with a given anti-type. Here, Othello shares quite a bit with the other comedies in its vicinity, and this high degree of linguistic similarity could be demonstrated quantitatively using something called a dendrogram.
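As a rough illustration of the chunking idea, the following Python sketch splits a play into fixed-size pieces and reports what share of those pieces sit closer to a “comedy” profile than to a “tragedy” profile. This is a simpler stand-in for a dendrogram, not the analysis behind the scatter plot. The function names, the three-number profiles (First Person, Interaction, Description per 100 words), and every figure in the example are invented for illustration.

```python
import math


def chunks(tokens, size):
    """Split a token list into consecutive chunks of `size` tokens.

    The final chunk may be shorter than `size`.
    """
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def distance(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def share_nearest(profiles, centroids, label):
    """Fraction of chunk profiles whose nearest centroid is `label`."""
    if not profiles:
        return 0.0
    hits = 0
    for profile in profiles:
        nearest = min(centroids, key=lambda name: distance(profile, centroids[name]))
        if nearest == label:
            hits += 1
    return hits / len(profiles)


# Invented toy numbers: (First Person, Interaction, Description) per 100 words.
centroids = {"comedy": (6.0, 5.0, 2.0), "tragedy": (4.0, 3.0, 4.0)}
chunk_profiles = [
    (6.5, 5.5, 1.5),
    (3.5, 2.5, 4.5),
    (5.8, 4.9, 2.2),
    (6.1, 4.0, 2.5),
]

print(share_nearest(chunk_profiles, centroids, "comedy"))
```

With these made-up profiles, three of the four chunks land nearer the comic profile. Whether a tragedy with that share of “comic” chunks is a better or worse tragedy remains, as above, a matter of interpretation.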

In future posts, we will look more at “outliers,” since this is perhaps an area where we can test what Docuscope sees against what critics would accept or have already asserted. As far as I know, no literary critic has suggested the similarity between Love’s Labour’s Lost and the histories (see below), so this might count as a “discovery” for Docuscope. In the meantime, I will begin posting on the status of these imaginary objects: the texts as coded by Docuscope and arrayed in the two-dimensional space of a diagram or map.

August 20, 2009
The Musical Mood of the Country

The New York Times published a story this morning about a group of mathematicians who are counting types of words in popular songs in order to get a handle on something like the mood of the country. In trying to data-mine mood, they do what all people who count things do: move from something that you can quantify empirically to something that you can’t. We do this as well when we move from “types of words” or Docuscope strings in Shakespeare plays to “genre.” The strings are empirically countable (they are either there in an established corpus or they aren’t), but one must argue for any connection between what is counted and what such counts represent (genre, mood, etc.). The point I have tried to make on this blog is that the connection is interpretive, and so relies on the hermeneutic skills of the one proposing the link.

In the abstract for the paper, recently published in the Journal of Happiness Studies, they write that: “Among a number of observations, we find that the happiness of song lyrics trends downward from the 1960s to the mid 1990s while remaining stable within genres, and that the happiness of blogs has steadily increased from 2005 to 2009, exhibiting a striking rise and fall with blogger age and distance from the Earth’s equator.” This is an interesting finding, particularly the part about blogger age and distance from the equator. One of the selling-points of their analysis is that the data they have obtained is voluntarily supplied, and so perhaps less subject to the social pressures that accompany surveying. I would want to know, on this score, whether a song-title (for example) is subject to other types of pressures. For example, the songwriter is not just “reporting” an inner state by naming a song in a particular way — take the Ramones song, “I Wanna Be Sedated” for example — but offering this title to an audience. Song-names are rhetorical, and so subject to a different set of pressures than “reporting.” There is another kind of self-interference here that doesn’t seem to be taken into account.

One of the lead researchers on the paper, Peter Sheridan Dodds, argues that data supplied voluntarily on the web can serve as a kind of “remote sensor of well-being.” (I remember hearing similar arguments made about baby names a while back; you don’t have to pay for them and they’re important: therefore they are a good measure of national feeling and trends.) For example, teenagers appear to be the least happy because they more frequently use words such as “sick,” “hate” and “stupid.” Wouldn’t it be more interesting to track how the use of these words (or absence of them) compares to groups of populations that teenagers themselves describe as “unhappy?” My inclination here would be to use data-mining techniques to assay and re-describe classifications made by a given social group in terms that they may not necessarily be aware of. Then the factual claim would be: when teenagers describe someone as happy, that person is x% less likely to use words like “sick,” “hate” and “stupid.”

I can imagine the authors of the Music-Mood study making the following set of claims:

Claim 1) Research on web-logs, lyrics and other sources of expression shows that words like “sick,” “hate” and “stupid” occur more frequently in a representative group of works by teenagers. This would be the empirical claim.

Claim 2) People who are experiencing a mood such as “well-being” are less likely to mention words like “sick,” “hate,” and “stupid” in unprompted work such as songwriting or blogging. This is an interpretive claim that must be argued for.

Claim 3) Teenagers are less likely than others to be experiencing a mood of well-being. This is logically true if you accept 1 and 2.

Now, what’s interesting about 2, the interpretive claim, is that it could be made without numbers. In a sense, you either believe this or you don’t. Which raises the question: what exactly are the numerical claims doing in this argument? What if claim 2 is “kind of true,” or “true only among certain people”? Would this mean that “kind of a lot” of teenagers are unhappy?

I would be more comfortable saying that teenagers use more of the following words (“hate,” “stupid”), and that a close look at the contexts in which they use them (which can never be comprehensive) suggests that their use is connected to mood in the following way (e.g., their use allows teenagers to gain social attention by citing negative emotions, their use indicates depression, their use indexes the presence of Goth subculture, etc.). But I would want to know how the words are used rather than simply making inferences from the fact that they occur. The counter-argument here is that the law of large numbers guarantees that even if there is a wide variation of uses of the words (granting, in effect, that not all occurrences are “reports” of mood), there is nevertheless a broad enough pattern to make a generalization. Fair enough, but what numbers are you going to use to make the generalization?

I’m all for the empirical investigation of abstract concepts like happiness, genre, authorial intent. These higher order concepts don’t come from outer space: we create them to capture some suite of characteristics we find in reality or in ourselves. But the Music-Mood analysis lacks a crucial ingredient: an explicit human judgment about the classes that are being measured by the tokens that are being counted. Unless you make that judgment explicit, saying something like “x% of people who experience what persons y and z would describe as ‘well-being’ also produce unprompted work containing these words,” you are really just saying that “a lot” of people who we think are happy do this.
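Here is a small Python sketch of what the explicit version of that claim might look like once a human judge has done the classifying. The three target words are the ones quoted in the coverage of the study. Everything else is an assumption for illustration: the function names, the invented toy posts, and the idea that each post has already been labelled “happy” or “unhappy” by a named judge.

```python
import re

# The three words quoted above; the study's full word list is not reproduced here.
TARGET_WORDS = {"sick", "hate", "stupid"}


def uses_target(text):
    """True if the text contains at least one target word."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return bool(words & TARGET_WORDS)


def usage_rate(texts):
    """Share of texts that contain at least one target word."""
    if not texts:
        return 0.0
    return sum(uses_target(t) for t in texts) / len(texts)


def percent_less_likely(judged_happy, judged_unhappy):
    """The x in 'x% less likely', or None if the baseline rate is zero."""
    base = usage_rate(judged_unhappy)
    if base == 0:
        return None
    return 100 * (base - usage_rate(judged_happy)) / base


# Invented toy posts. The grouping stands for labels supplied by a human judge.
happy = ["what a great day", "i hate mondays but love fridays",
         "sun is out", "new song finished"]
unhappy = ["i feel sick of this", "so stupid",
           "nothing to report", "i hate everything"]

print(percent_less_likely(happy, unhappy))
```

The arithmetic is trivial. The weight of the claim rests on who sorted the posts into the two lists and on what grounds, which is exactly the judgment that has to be made explicit.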

Naming something with a word is a way of creating a class of things (as long as that word is not a proper name), and it is classes of things that are correlated quantitatively using statistics: quantities of classes of words in classes of works, for example. In any such analysis, the classes themselves cannot be derived empirically. They have to be specified in advance by appealing to experience, common sense, expertise, or the like. What troubles me about the Musical Mood analysis here is that the rationale for membership in the class of words indicating “well-being” is not spelled out, and perhaps never could be. I would rather ask someone (an expert? a teenager?) to name people who experience well-being and then do one of Matt Jockers’ most-frequent-word analyses on their lyrics or blogs in order to get at the underlying pattern. It’s fine to begin with a set of words whose occurrence indicates (to you) a feeling of well-being, but without knowing quantitatively how indicative they are, the numbers are just another kind of adjective. You might as well read a bunch of web pages and decide for yourself.

My guess is that you would conclude that teenagers write like teenagers rather quickly.

August 6, 2009
Comic Twelfth Night, Tragic Othello (Part 2)

Here is a second comic exchange from Twelfth Night. Maria’s plan has worked wonderfully. Malvolio has arrived cross-gartered and is quoting to Olivia little bits of the love letter he believes she has written to him. The blue and red strings, First Person and Interaction, are again appearing fast and thick as the incomprehension builds. As in the previous passage, which dealt with Cesario’s resistance of Olivia, we have a resistant “you” here who keeps the game going. (Had she succumbed, dismissing Maria to go practice her penmanship, the dialogue would look very different: first and second person singular pronouns would most likely disappear.)

A few things worth noting about the coding in this passage. Docuscope is ignoring the single quotation marks from the Moby Shakespeare. It does not matter that these words are being “mentioned” rather than “used” in the Austinian sense: all “sightings” by Docuscope occur in a kind of weird citational indicative. There is no way for the machine to catch the fact that the speaker, Malvolio, is not really telling Olivia “Go to, thou art made.” This is a flat earth in the rhetorical sense: no ironic depth can be perceived when every item is tagged because it occurs, not because its use in a certain context means a certain thing. One should not be misled about Docuscope’s powers of interpretation here.

Switching analogies, we might say that – like a Spinozan deity – Docuscope contemplates words from the perspective of eternity: it does not itself follow events from the standpoint of a moving present against which it measures temporally marked events as they arrive and withdraw through time. (Docuscope does not engage in phenomenological protention or retention in the Husserlian sense.) Nor does it situate events in space in any perspectivally located way. The history of what happens in the world of the play, if we were to think of it that way, is a history of “mentioned happenings.” No one does anything; rather, words are mentioned, and Docuscope keeps track of which kinds of words are used (but never how).

Another interesting feature of the passage. Malvolio really doesn’t say anything directly to Olivia in this passage: he is talking past Maria, and is reciting to Olivia what he believes she actually wants to say to him. This sort of indirection, when it is not a group effort, also seems to be contributing to the proliferation of Interaction and First Person strings: the “how,” “what,” “what” paired with the “you” “thou” “thou.” We would expect to find a lot of passages like this in other plays that have disguise and supposition, most of all in Comedy of Errors. I suspect that in the future I will be able to put my finger on a number of passages which parallel this one in terms of their performance on the comedy factor that Docuscope found for the full plays.

A final observation. Here and elsewhere in the play, Malvolio is often the one who supplies the Description strings, which as I have mentioned below, this play lacks in comparison with other plays (just as it has more, on average, Interaction and First Person). Is there anything about this passage that shows us why one cannot put one’s weight on both sides of this equation – Description on the one hand, First Person/Interaction on the other – in a single play or passage? Is there something about the comic posture, linguistically, that prevents such combinations? Malvolio and Feste are the two characters in the play who use the most Description strings, and during the fabulous speech in which Malvolio fantasizes about being married to Olivia while Toby and Maria look on, the linguistic texture of the scene is that of a History play. But as principal component analysis tells us, such moments of “historical” writing – oversimplified as the definition is – may occur occasionally in Comedy, but they will not occur repeatedly. Malvolio can only give so many such monologues, and Feste can only produce his rich, descriptive banter for so long.

But isn’t it important that there is a “dash” of Description in the play, indeed, in this passage? One issue that we need to explore as we think about what it means to find “a lot” of something in a particular type of play is what it also means to find “a little” of something. Is there a sense in which things that occur in small amounts are important as well, and if so, how should we think about those “dashes” of a certain type of word?

August 2, 2009