The data and texts found in this post serve as a companion to my article, “Latour, the Digital Humanities, and the Divided Kingdom of Knowledge,” which appears in a special issue of New Literary History, 2016, 47:353-375.
The analysis presented in the article is based on a set of texts that were tagged (that is, their features were counted) using a tool called Ubiqu+Ity, which counts either features specified by users or those captured by a default feature set known as Docuscope. The tool’s creation and associated research were funded by the Mellon Foundation under the “Visualizing English Print, 1530-1800” grant.
From this post, readers can access the source texts, tagging code, data, and “marked up” Shakespeare plays as HTML documents (documents that show where the features “if,” “and,” or “but” occur in each of the 38 plays). The source texts were taken from the API created at the Folger Shakespeare Library for the Folger Editions, which are now available online. Thirty-eight Shakespeare plays were extracted from these online editions, excluding speech prefixes and stage directions, and then lightly curated (replacement of the smart apostrophe with a regular one, emendation of é to e, insertion of spaces before and after em-dashes). Those texts were then uploaded in a zipped folder to Ubiqu+Ity, along with a custom rules .csv that specified the features to be counted in this corpus (if, and, but). Once the texts were tagged, Ubiqu+Ity returned a .csv file containing the percentage counts for all of the plays. (I have removed some of the extraneous columns that do not pertain to the analysis, and added the genre metadata discussed in the article.) Ubiqu+Ity also returned a set of dynamically annotated texts (HTML files of each individual play) that can be viewed in a browser; the three features can be turned on and off so that readers can see how and where they occur in the plays. Data from the counts were then visualized in three dimensions using the statistical software package JMP, which was also used to perform Student’s t-test. All of the figures from the article can be found here.
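For readers who want to approximate the curation and counting steps described above without Ubiqu+Ity, here is a minimal Python sketch. It is not the tool's own code: the function names `curate` and `feature_percentages` are hypothetical, and the tokenization rule (lowercased runs of letters and apostrophes) is my assumption, so the percentages it produces may differ slightly from those in the .csv file.

```python
import re

# The three features named in the post. Everything else here is illustrative.
FEATURES = ("if", "and", "but")


def curate(text):
    """Apply the three light curation steps listed in the post."""
    text = text.replace("\u2019", "'")        # smart apostrophe -> regular apostrophe
    text = text.replace("\u00e9", "e")        # e-acute -> plain e
    text = text.replace("\u2014", " \u2014 ")  # add a space before and after each em-dash
    return text


def feature_percentages(text):
    """Return each feature's share of all tokens, as a percentage.

    Assumption: a token is a lowercased run of letters and apostrophes.
    Ubiqu+Ity's actual tokenization may differ.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        return {feature: 0.0 for feature in FEATURES}
    return {feature: 100.0 * tokens.count(feature) / total for feature in FEATURES}


if __name__ == "__main__":
    sample = "If it be so\u2014and it is\u2014then, but no; I\u2019ll stay."
    print(feature_percentages(curate(sample)))
```

Running the two functions over each curated play file would yield one row of percentages per play, comparable in shape to the .csv described above.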