{"id":1178,"date":"2011-10-21T03:38:00","date_gmt":"2011-10-21T08:38:00","guid":{"rendered":"http:\/\/winedarksea.org\/?p=1178"},"modified":"2025-02-10T17:46:37","modified_gmt":"2025-02-10T22:46:37","slug":"why-the-difference-accounting-for-variation-between-the-folio-and-globe-editions-of-shakespeares-plays","status":"publish","type":"post","link":"https:\/\/winedarksea.org\/?p=1178","title":{"rendered":"Why the Difference? Accounting for Variation between the Folio and Globe Editions of Shakespeare&#8217;s Plays"},"content":{"rendered":"<p>To what extent is modern text analysis software capable of dealing with historical data? This is a perennial question asked by those working with digitized historical texts who wish to see how an analysis of such texts can be facilitated by cutting-edge technologies. No doubt the best way to answer the question is to test this software with two versions of the same text, where one version of the text can be considered an older and noticeably different version than the other version.<\/p>\n<p>Enter the Folio and Globe editions of Shakespeare\u2019s plays. The latter was published in 1867 and contains modernized spelling throughout, whereas the former was published in 1623 and maintains the original spelling of Shakespeare\u2019s Early Modern English. 
Using DocuScope for text analysis and JMP for statistical visualizations, the following dendrogram was created:<\/p>\n<p><a href=\"http:\/\/winedarksea.org\/?attachment_id=1179\" rel=\"attachment wp-att-1179\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-1179\" title=\"ClusterFolioVsGlobe\" src=\"http:\/\/winedarksea.org\/wp-content\/uploads\/2011\/10\/ClusterFolioVsGlobe-182x300.jpg\" alt=\"\" width=\"182\" height=\"300\" srcset=\"https:\/\/winedarksea.org\/wp-content\/uploads\/2011\/10\/ClusterFolioVsGlobe-182x300.jpg 182w, https:\/\/winedarksea.org\/wp-content\/uploads\/2011\/10\/ClusterFolioVsGlobe-622x1024.jpg 622w, https:\/\/winedarksea.org\/wp-content\/uploads\/2011\/10\/ClusterFolioVsGlobe.jpg 888w\" sizes=\"auto, (max-width: 182px) 100vw, 182px\" \/><\/a><\/p>\n<p>The texts highlighted in red are from the Folio edition, whereas the texts highlighted in blue come from the Globe edition. One would expect all of Shakespeare\u2019s Folio plays to cluster with their Globe complement here. <em>Much Ado About Nothing <\/em>is <em>Much Ado About Nothing<\/em>, after all, regardless of which edition it appears in. But for the most part, this neat pairing off is not what happens: instead, most of the Folio plays are grouping with other Folio plays, and the same is true for the Globe plays. Only a few plays are actually grouping with themselves at the top of the dendrogram. Methinks we have a problem.<\/p>\n<p>Upon closer inspection, I found that 13,667 items were tagged by DocuScope in the Globe edition of <em>Much Ado<\/em>, but only 11,382 items were tagged in the Folio edition of the same play: a 16.7% difference. An inspection of eleven other Shakespeare plays provides us with an overall mean difference of 17.8%: a difference that cannot be considered good when it comes to tagging accuracy.<\/p>\n<p>But why the disparity? 
Maybe a closer look at DocuScope can give us an idea.<\/p>\n<p>First the Folio version of the opening scene in <em>Much Ado About Nothing<\/em> (with the \u201cInterior Thought\u201d and \u201cPublic Values\u201d clusters turned on):<\/p>\n<p><a href=\"http:\/\/winedarksea.org\/?attachment_id=1180\" rel=\"attachment wp-att-1180\"><img decoding=\"async\" class=\"alignnone size-full wp-image-1180\" title=\"MuchAdoAboutNothingFOLIO\" src=\"http:\/\/winedarksea.org\/wp-content\/uploads\/2011\/10\/MuchAdoAboutNothingFOLIO.tiff\" alt=\"\" \/><\/a><\/p>\n<p>And the Globe version of the same scene with the same clusters turned on:<\/p>\n<p><a href=\"http:\/\/winedarksea.org\/?attachment_id=1181\" rel=\"attachment wp-att-1181\"><img decoding=\"async\" class=\"alignnone size-full wp-image-1181\" title=\"MuchAdoAboutNothingGLOBE\" src=\"http:\/\/winedarksea.org\/wp-content\/uploads\/2011\/10\/MuchAdoAboutNothingGLOBE.tiff\" alt=\"\" \/><\/a><\/p>\n<p>One need not read far to discover what\u2019s (not) being tagged in the older, Folio edition of <em>Much Ado<\/em>: <em>Learne<\/em> versus <em>learn<\/em>. It appears the orthographic rendering of the unstressed final <em>\u2013e<\/em> is causing DocuScope to overlook this word altogether. We find the same mistake later on with <em>indeede\/indeed, kindnesse\/kindness, helpe\/help,<\/em> and <em>kinde\/kind<\/em>. Another common problem is Early Modern use of <em>u<\/em>, which is rendered <em>v<\/em> in modern orthography: <em>deseru\u2019d<\/em> vs. <em>deserved, seruice<\/em> vs. <em>service, <\/em>\u00a0and <em>ouerflow<\/em> vs. <em>overflow<\/em>. There are also a few punctuation issues causing problems: the use of apostrophe (as we see in <em>deseru\u2019d<\/em>) and the use of | (<em>con | flict<\/em> vs. <em>conflict<\/em>), which probably results from some sort of scanning or other computer error. 
In other plays, the hyphen was also found to be a possible culprit of DocuScope overlooking certain items (<em>ouer-charg\u2019d<\/em> vs. <em>overcharged<\/em>).<\/p>\n<p>Although the overall number of DocuScope omissions on a Folio play is rather large, the actual number of error types is quite small. This gives us hope that, with a bit of modification, it may well be possible to train DocuScope to read non-modern(ized) texts.<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>To what extent is modern text analysis software capable of dealing with historical data? This is a perennial question asked by those working with digitized historical texts who wish to see how an analysis of such texts can be facilitated by cutting-edge technologies. No doubt the best way to answer the question is to test [&hellip;]<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-1178","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/winedarksea.org\/index.php?rest_route=\/wp\/v2\/posts\/1178","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/winedarksea.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/winedarksea.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/winedarksea.org\/index.php?rest_route=\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/winedarksea.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1178"}],"version-history":[{"count":10,"href":"https:\/\/winedarksea.org\/index.php?rest_route=\/wp\/v2\/posts\/1178\/revisions"}],"predecessor-version":[{"id":1191,"href":"https:\/\/winedarksea.org\/index.php?rest_route=\/wp\/v2\/posts\/1178\/revisions\/1191"}],"wp:attachment":[{"href":"https:\/\/winedarkse
a.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1178"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/winedarksea.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1178"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/winedarksea.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1178"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}