Category Archives: Counting Other Things
Lost Books, “Missing Matter,” and the Google 1-Gram Corpus
My colleague Mike Gleicher (UW-Madison Computer Science) has been working on a rough-and-ready visualization of popular words (1-grams) in the Google English Books corpus, which contains several million items. He produced this visualization after working with the dataset for a day. I find the visualization appealing because of what it shows us about […]
Google n-grams and Philosophy: Use Versus Mention
Well, the Google n-gram corpus is out, and the world has been introduced to a fabulous new intellectual parlor game. Here are a few searches I ran today that deal with philosophers and philosophical terms. A lot of people are going to be playing with this tool, and I think there are some genuine discoveries […]
Early and Late Plato II: The Apology and The Timaeus
In the previous post we were examining three-dimensional clusterings of the Platonic dialogues as rated on scaled Principal Components 1, 2, and 5, a technique that allowed us to see the early Platonic dialogues (as defined by Vlastos) standing apart from the middle and later ones. Vlastos’ claim, we remember, was that these early […]
Platonic Dialogues and the “Two Socrates”
I have been thinking for a while now that Docuscope preserves, in its tagging structure, what a translator preserves — that this is a good definition of what it is looking to classify. One way to test this hypothesis would be to try Docuscope on a set of translations, which is what I’ve tried to […]
Pre-Digital Iteration: The Lindstrand Comparator
I’ve just finished a terrific conference at Loyola organized by Suzanne Gossett on “The Future of Shakespeare’s Text(s).” This photo shows a device, used by one of the conference organizers, Peter Shillingsburg, to perform manual collation of printed editions of texts. There is a long tradition of using optical collators to find and identify differences in […]