Farah Karim-Cooper asked us to write something for the Globe to Globe Hamlet site. Here it is.
#MuchAdo #AboutData update 4
Emma Pallant writes:
Many intriguing things to respond to in recent postings by Jonathan and Heather, but I’ll begin with the observation that chimes most clearly with the work we’ve been doing in the rehearsal room this week. The strongest note of recognition comes from the increased usage of terms of address that Jonathan has observed.
To reiterate the point from Jonathan’s post, there is a fairly tight social group presented in Much Ado About Nothing: although we move through the social scale from a don, the Prince of Aragon, down to Hugh Oatcake of the watch, the characters we meet in the play seem fairly comfortable interacting with one another. By that I mean we don’t see any obvious rift between social groups, there are no real strangers in the midst, no-one truly flouting convention, no licensed fool (unless you count Beatrice and Benedick) and no magical creatures or gods to challenge human hubris. Cupid seems to get name-checked as often as any Christian God and in fact, while we are in that realm, it’s interesting to note that there seems to be very little sense of any higher power at all (save for a few Biblical references and the presence of a fairly worldly Friar). What we do have in our story is Don Pedro of Aragon and a small group of soldiers, who arrive at the home of Leonato and his family. They agree to stay a month and everyone seems reasonably comfortable with this arrangement. The ensuing marriages, friendships, confidences and liaisons (actual and possible) during their stay seem ‘easy’, in the sense that they cause no social ructions or conflicts when they are proposed, discussed or realised.
What we see, instead of a particularly broad social sweep, is that the narrowness of this social group seems to create a culture of anxiety, one of jealousy and rivalry, a place of comparison, ambition and social climbing. Claudio’s new honours from the wars are glories born of Don John’s overthrow (potentially even thwarting an attempted coup or challenge to the Prince’s authority), and it seems that this resentment fuels, if not causes, Don John’s actions against Claudio in the thwarting of his marriage. His ally in the plotting, Borachio, is branded as “deformed” (dressing in a way he should not, namely above his social standing) for “going up and down like a gentleman”. Then of course there are the many other disguises and tricks that are enacted during the play, where characters take on the guise of another person, which from the first creates the chief anxiety in the world of the play, that of being supplanted or more specifically cuckolded.
This for the most part is the great big nothing there’s such an ado about. Claudio, even though Don Pedro (his Prince, companion and friend) swears he will woo Hero ‘in his name’, is quick to believe that this friend has actually wooed for himself and is seen to be “of that jealous complexion”. It is therefore no surprise to me that in this world of demob-happy men where status is in a state of flux, people are keen to lock down identity where they can and name their roles to one another, to be known by their status or familial connection: Prince, Count, Cousin, Brother, Signor, Father, Master Constable, Friend.
Even in the town scenes where we meet the watch (or more specifically the Prince’s Watch), where the social scale is further compacted, there are tussles over who’s who. To write and read gives one some added status, but beyond this – and perhaps the respect that comes with age – there is no end of wrangling with plenty of intricacies amongst the sirs, sirrahs, friends, neighbours and the corrective “I am a gentleman” from Conrad. These can be played and received with varying degrees of sincerity or mockery of course, as we would today with the use of ‘mate’.
For Beatrice and Benedick and their dance toward marriage, they separately name and reiterate their roles as confirmed bachelors during the first half of the play and spend the second half working out how they might reinvent themselves if they volte face from those roles. Even so, with the fairly constant titles of Lady Beatrice and Signor Benedick many of their verbal parries involve name-calling or rechristening one another – Signor Mountanto, Lady Disdain, the Prince’s Jester – so that their playfulness parodies the society around them.
#MuchAdo #AboutData update 3
The cast are now well into rehearsals, and you can see photos on @PallantPallant’s twitter feed.
Meanwhile the data miners are about to get on planes to go to the Renaissance Society of America meeting in New York (#rsa14). Here’s a contribution from @heatherfro on the big pronoun question:
This is really interesting for me, though I can’t yet explain why – I suspect there’s something to be said about agency here.
After much I/me discussion between Benedick and Beatrice, we want her to end up with Benedick by saying we, but she doesn’t say we after Act Two, Scene 1 (2.i.45, 2.i.139; previously there’s one instance of we in 1.i.54). Benedick is the one to say it**, declaring:
Benedick: Come, come, we are friends: let’s have a dance ere we are married, that we may lighten our own hearts and our wives’ heels.
(5.iv.117-118) and this raises more agency questions for me … I wonder what Emma can say about Beatrice’s role, especially at the end of the play?
**Note by JH: my reading of these lines is that the we here is Benedick and Claudio, not Benedick and Beatrice – though the point stands, as Benedick uses a plural pronoun to refer to himself and Beatrice at 5.iv.91-2:
A miracle! Here’s our own hands against our hearts!
He immediately shifts back into the singular as the two re-establish their witty antagonism:
Come, I will have thee, but by this light I take thee for pity.
And Beatrice, perhaps true to character, never shifts:
Beatrice: I would not deny you, but by this good day I yield upon great persuasion – and partly to save your life, for I was told you were in a consumption.
#MuchAdo #AboutData update 2 (scroll down for the intro to these posts)
The previous finding was about the pronoun ‘she’. This one also uses Log-likelihood to identify words used far more frequently in Much Ado than they are in Shakespeare’s other work.
While ‘she’ is the highest scoring word on Log-likelihood, maybe more striking is a run of words that come next:
signor
prince
count
lady
don
cousin
brother
daughter
Aside from the last two, these are all *very* significantly raised: and they clearly share a function/meaning in that they are terms of address. Some (signor, don) might be said to be plot-related, in that they may reflect the particular setting of Much Ado – but that’s not true for most, and the finding is very robust (they are all strongly raised, even those not associated with this particular setting).
My initial explanation for this is that this tracks a relatively unusual format Much Ado has (unusual compared to Shakespeare’s other work): i.e., it depicts a relatively large group of relatively equal social status interacting relatively equally (note the relativelies!) – and does so in lots of prose. My impression is that most other plays focus on smaller groups, and feature interactions up and down the social scale more. There’s an unusually ‘flat’ social structure in Much Ado – and this is reflected in the profusion of address terms. I think there may be a comparison with City comedy to be made here, but that’s for later posts.
I wonder if Emma and the rest of the cast have noticed anything that chimes with this?
#MuchAdo #AboutData update 1
Word frequency findings: Loglikelihood (done with Wordhoard)
Method: this test looks at word frequency in the play, but not simple frequency (which is often not that interesting: words like ‘the’ and ‘and’ are the most frequent in every play).
What it looks for is words that are used more frequently or less frequently in the play than you would expect given Shakespeare’s usage in his other work. The program counts the totals for each word in the play (= ‘the analysis sample’) and compares them to the totals for the same words in all of Shakespeare (= ‘the reference sample’).
It looks for big rises or falls, and adjusts these against the overall frequency of the word to give a score for how unusual the result is, and how likely it is to be due to chance or not (significance). Results unlikely to be due to chance are given stars: highest rating is four stars.
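For anyone who wants to see the arithmetic behind the test, here is a minimal sketch of the standard two-sample log-likelihood (G2) calculation. It is not Wordhoard’s own code: the function names are mine, the formula treats the two samples as separate, and the star bands are an assumption (the conventional chi-square cut-offs for one degree of freedom), so Wordhoard’s exact scores and ratings may differ.

```python
import math


def log_likelihood(count_analysis, size_analysis, count_reference, size_reference):
    """Two-sample log-likelihood (G2) for a single word.

    Returns (g2, direction), where direction is '+' if the word is used
    more often in the analysis sample than expected, and '-' otherwise.
    """
    total_count = count_analysis + count_reference
    total_size = size_analysis + size_reference

    # Counts we would expect if the word were used at the same rate in both samples.
    expected_analysis = size_analysis * total_count / total_size
    expected_reference = size_reference * total_count / total_size

    g2 = 0.0
    for observed, expected in (
        (count_analysis, expected_analysis),
        (count_reference, expected_reference),
    ):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    g2 *= 2

    direction = "+" if count_analysis > expected_analysis else "-"
    return g2, direction


def stars(g2):
    """Hypothetical star rating: one star per conventional cut-off passed.

    Assumed bands: 3.84 (p < 0.05), 6.63 (p < 0.01),
    10.83 (p < 0.001), 15.13 (p < 0.0001).
    """
    return sum(g2 >= cutoff for cutoff in (3.84, 6.63, 10.83, 15.13))


# Illustrative numbers only: 262 uses in 20,000 words is the rounded figure
# from the 'she' finding; the reference sample size of 800,000 words (and so
# its count of 4,240, i.e. 53 per 10,000) is made up for the example.
score, direction = log_likelihood(262, 20_000, 4_240, 800_000)
print(direction, round(score, 1), "*" * stars(score))
```

The point of adjusting against overall frequency is visible in the formula: the same proportional rise counts for much more when it rests on hundreds of occurrences than when it rests on a handful.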
In this test I excluded names, since it isn’t very interesting to find that Shakespeare uses Beatrice more in Much Ado than he does in his other work.
Standout findings
Finding 1.1
The word* with the biggest shift in usage compared to Shakespeare’s norm is the pronoun ‘she/her’ – with a significance rating of four stars, it is raised in this play way over its frequency elsewhere. To give you a sense of how much it is raised, Shakespeare normally uses ‘she’ 53 times every 10,000 words. In Much Ado, he uses it 131 times every 10,000 words.
There are just over 21,000 words in Much Ado. Let’s call that 20,000 for simplicity. This means Shakespeare uses ‘she’ around 262 times in the play. If he was behaving normally, he’d use it 106 times.
This is a big shift – an already frequent word is used two and a half times as often.
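If you want to check the sums, here is the same back-of-envelope calculation as a few lines of Python. The rates and the rounded play length are the ones given above; the variable and function names are just mine.

```python
RATE_NORMAL = 53       # 'she' per 10,000 words across Shakespeare (figure from the post)
RATE_MUCH_ADO = 131    # 'she' per 10,000 words in Much Ado (figure from the post)
PLAY_LENGTH = 20_000   # "just over 21,000" rounded down, as in the post


def occurrences(rate_per_10k, n_words):
    """Convert a rate per 10,000 words into a raw count for a text of n_words."""
    return rate_per_10k * n_words / 10_000


observed = occurrences(RATE_MUCH_ADO, PLAY_LENGTH)  # uses of 'she' in the play
expected = occurrences(RATE_NORMAL, PLAY_LENGTH)    # uses if Shakespeare were "behaving normally"
ratio = RATE_MUCH_ADO / RATE_NORMAL                 # how many times more frequent

print(observed, expected, round(ratio, 2))
```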
Why?
It is tempting to wonder if female characters are more prominent in Much Ado: are there more of them? Do they speak more lines? (We’ll ask Heather Froehlich if she has suggestions re this.)
But this result isn’t necessarily telling us that. What it tells us is that women are referred to more frequently in this play. Maybe it’s just that men talk about women a lot – or maybe men and women talk about women.
Anyway: there’s the first finding. Now over to Emma in the rehearsal room…
*I did this search on ‘lemma’, which automatically includes different forms of the ‘same’ word – so ‘she’ and ‘her’ are counted together.
We’ve posted in the past about advising actors at Shakespeare’s Globe in London on the language of plays they are rehearsing. This is the first in an experimental series of short posts building on that process.
I’m going to be running some analyses on the language of Much Ado About Nothing and discussing the results with Emma Pallant (@PallantPallant), who will be playing Beatrice in a Globe touring production this year (2014).
Emma is an outstanding, and really thoughtful, actor (who crops up on the cover of this book on women making Shakespeare) – so whether I come up with anything interesting or not, the production is sure to be worth seeing. If you are in the UK, or Austria, there’s a chance the production will be close to you at some point in spring/summer (details of venues and ticket booking here).
We’ll also tweet about the data, using the hashtags #MuchAdo #AboutData
As a teaser/taster, click on the image at the top of this post for a quick overview of the distribution of the word ‘she’ across the play. Notice anything?
Jonathan (@wellsheisnt)
I just got back from a fun and very educative trip to Shakespeare’s Globe in London, hosted by Dr Farah Karim-Cooper, who is director of research there.
The Globe stages an annual production aimed at schools (45,000 free tickets have been distributed over the past five years), and this year’s play is A Midsummer Night’s Dream. I was invited down to discuss the language of the play with the cast and crew as they begin rehearsals.
This was a fascinating opportunity for me to test our visualisation tools and analysis on a non-academic audience – and the discussions I had with the actors opened my eyes to applications of the tools we haven’t considered before. They also came up with a series of sharp observations about the language of the play in response to the linguistic analysis.
I began with a tool developed by Martin Mueller’s team at Northwestern University: Wordhoard, as a way of getting a quick overview of the lexical patterns in the play, and introducing people to thinking statistically about language.
Here’s the wordcloud Wordhoard generates for a loglikelihood analysis of MSND compared with the whole Shakespeare corpus:
Loglikelihood takes the frequencies of words in one text (in this case MSND) and compares them with the frequencies of words in a comparison, or reference, sample (in this case, the whole Shakespeare corpus). It identifies the words that are used significantly more or less frequently in the analysis text than would be expected given the frequencies found in the comparison sample. In the wordcloud, the size of a word indicates how strongly its frequency departs from the expected. Words in black appear more frequently than we would expect, and words in grey appear less frequently.
As is generally the case with loglikelihood tests, the words showing the most powerful effects here are nouns associated with significant plot elements: ‘fairy’, ‘wall’, ‘moon’, ‘lion’ etc. If you’ve read the play, it is not hard to explain why these words are used in MSND more than in the rest of Shakespeare – and you really don’t need a computer, or complex statistics, to tell you that. To paraphrase Basil Fawlty, so far, so bleeding obvious.
Where loglikelihood results normally get more interesting – or puzzling – is in results for function words (pronouns, auxiliary verbs, prepositions, conjunctions) and in those words that are significantly less frequent than you’d expect.
Here we can see some surprising results: why does Shakespeare use ‘through’ far more frequently in this play than elsewhere? Why are the masculine pronouns ‘he’ and ‘his’ used less frequently? (And is this linked to the low use of ‘lord’?) Why is ‘it’ rare in the play? And ‘they’ and ‘who’ and ‘of’?
At this stage we started to look at our results from Docuscope for the play, visualised using Anupam Basu’s LATtice.
The heatmap shows all of the folio plays compared to each other: the darker a square is, the more similar the plays are linguistically. The diagonal of black squares running from bottom left to top right marks the points in the map where plays are ‘compared’ to themselves: the black indicates identity. Plays are arranged up the left hand side of the square in ascending chronological order from Comedy of Errors at the bottom to Henry VIII at the top – the sequence then repeats across the top from left to right – so the black square at the bottom left is Comedy of Errors compared to itself, while the black square at the top right is Henry VIII.
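To make the logic of the heatmap concrete, here is a rough sketch of how a grid like this can be built: each play is reduced to a list of LAT frequencies, and every play is compared with every other. I am assuming plain Euclidean distance purely for illustration (I haven’t spelled out LATtice’s actual measure here), and the play names and numbers are invented.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two equal-length frequency profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(profiles):
    """Compare every text with every other; returns (names, square matrix)."""
    names = list(profiles)
    matrix = [[euclidean(profiles[row], profiles[col]) for col in names] for row in names]
    return names, matrix


# Invented profiles: three hypothetical plays, three hypothetical LAT rates each.
profiles = {
    "Play A": [1.2, 0.4, 3.1],
    "Play B": [1.0, 0.5, 2.9],
    "Play C": [2.5, 1.9, 0.7],
}

names, matrix = distance_matrix(profiles)
for name, row in zip(names, matrix):
    print(name, [round(d, 2) for d in row])
```

A play compared with itself has a distance of zero, which is the black diagonal; small distances would be shaded dark (similar) and large ones light (different). The grid is also symmetrical, which is why an anomalous play shows up as both a light row and a light column.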
One of the first things we noticed when Anupam produced this heatmap was the two plays which stand out as being unlike almost all of the others, producing four distinct light lines which divide the square of the map almost into nine equal smaller squares:
These two anomalous plays are Merry Wives of Windsor (here outlined in blue) and A Midsummer Night’s Dream (yellow). It is not so surprising to find Wives standing out, given the frequent critical observation that this play is generically and linguistically unusual for Shakespeare: but A Midsummer Night’s Dream is a result we certainly would not have predicted.
This visualisation of difference certainly caught the actors’ attention, and they immediately focussed in on the very white square about 2/3 of the way along the MSND line (here picked out in yellow):
So which play is MSND even less like than all of the others? A tragedy? A history? Again, the answer is not one we’d have guessed: Measure for Measure.
This is a good example of how a visualisation can alert you to a surprising finding. We would never have intuited that MSND was anomalous linguistically without this heatmap. It is also a good example of how visualisations should send you back to the data: we now need to investigate the language of MSND to explain what it is that Shakespeare does, or does not do, in this play that makes it stand out so clearly. The visualisation is striking – and it allowed the cast members to identify an interesting problem very quickly – but the visualisation doesn’t give us an explanation for the result. For that we need to dig a bit deeper.
One of the most useful features of LATtice is the bottom right window, which identifies the LATs that account for the most distance between two texts:
This is a very quick way of finding out what is going on – and here the results point us to two LATs which are much more frequent in MSND than Measure for Measure: SenseObject and SenseProperty. SenseObject picks up concrete nouns, while SenseProperty codes for adjectives describing their properties. A quick trip to the LATtice box plot screen (on the left of these windows):
confirms that MSND (red dots) is right at the top end of the Shakespeare canon for these LATs (another surprise, since we’ve got used to thinking of these LATs as characteristic of History), while Measure for Measure (blue dots) has the lowest rates in Shakespeare for these LATs.
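The idea of asking which LATs “account for the most distance” can be sketched in a few lines. This is only an illustration under my own assumptions: it ranks LATs by their squared difference between two texts (which is how each feature contributes to a Euclidean distance), and the frequencies are invented, not Docuscope output. How LATtice itself ranks them may differ.

```python
def top_contributors(profile_a, profile_b, lat_names, n=2):
    """Rank LATs by how much each contributes to the distance between two texts.

    Returns a list of (lat_name, signed_difference) for the n largest
    contributors; a positive difference means text A has the higher rate.
    """
    differences = [(name, a - b) for name, a, b in zip(lat_names, profile_a, profile_b)]
    ranked = sorted(differences, key=lambda item: item[1] ** 2, reverse=True)
    return ranked[:n]


# Invented rates for illustration only.
lat_names = ["SenseObject", "SenseProperty", "DirectAddress", "Question"]
text_a = [4.0, 3.0, 1.0, 0.8]   # stands in for MSND
text_b = [1.5, 1.0, 1.6, 1.2]   # stands in for Measure for Measure

for name, diff in top_contributors(text_a, text_b, lat_names):
    print(name, "higher in A" if diff > 0 else "higher in B")
```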
So Docuscope findings suggest that MSND is a play concerned with concrete objects and their descriptions – another counter-intuitive finding given the associations most of us have with the supposedly ethereal, fairy, dream-like atmosphere of the play. Cast members were fascinated by this and its possible implications for how they should use props – and someone also pointed out that many of the names in the play are concrete nouns (Quince, Bottom, Flute, Snout, Peaseblossom, Cobweb, Mote and so on) – what is the effect on the audience of this constant linguistic wash of ‘things’?
Here is a screenshot from Docuscope with SenseObject and SenseProperty tokens underlined in yellow. Reading these tokens in context, you realise that many of these concrete objects and qualities, in this section at least, are fictional in the world of the play. A wall is evoked – but it is one in a play, represented by a man. Despite the frequency of SenseObject in this play, we should be wary of assuming that this implies the straightforward evocation of a concrete reality (try clicking if you need to enlarge):
Also raised in MSND are LATs to do with locating and describing space: Motions and SpaceRelations (as suggested by our loglikelihood finding for ‘through’?). So accompanying a focus on things is a focus on describing location and movement – perhaps, someone suggested, because the characters are often so unsure of their location? (In the following screenshot, Motions and SpatialRelation tokens are underlined in yellow.)
Moving on, we also looked at those LATs that are relatively absent from MSND – and here the findings were very interesting indeed. We have seen that MSND does not pattern like a comedy – and the main reason for this is that it lacks the highly interactive language we expect in Shakespearean comedy: DirectAddress and Question are lowered. So too are PersonPronoun (which picks up third person pronouns, and matches our loglikelihood finding for ‘he’ and ‘his’), and FirstPerson – indeed, all types of pronoun are less frequent in the play than is normal for Shakespeare. At this point one of the actors suggested that the lack of pronouns might be because full names are used constantly – she’d noticed in rehearsal how often she was using characters’ names – and we wondered if this was because the play’s characters are so frequently uncertain of their own, and others’ identity.
Also lowered in the play is PersonProperty, the LAT which picks up familial roles (‘father’, ‘mother’, ‘sister’ etc) and social ones (job titles) – if you add this to the lowered rate of pronouns, then a rather strange social world starts to emerge, one lacking the normal points of orientation (and the play is also low on CommonAuthority, which picks up appeals to external structures of social authority – the law, God, and so on).
The visualisation, and Docuscope screens, provoked a discussion I found fascinating: we agreed that the action of the play seems to exist in an eternal present. There seems to be little sense of future or past (appropriately for a dream) – and this ties in with the relative absence of LATs coding for past tense and looking back. As the LATtice heatmap first indicated, MSND is unlike any of the recognised Shakespearean genres – but digging into the data shows that it is unlike them in different ways:
Waiting for my train back to Glasgow (at the excellent Euston Tap bar near Euston Station), I tried to summarize our findings in four tweets (read them from the bottom, up!):
I’ll try to keep in touch with the actors as they rehearse the play – this was a lesson for me in using the tools to spark an investigation into Shakespeare’s language, and I can now see that we could adapt these tools to various educational settings (including schools and rehearsal rooms!).
Jonathan Hope February 2012