The translator of these dialogues, Jowett, would have had to preserve at least some of the linguistic “footings” required for such a dialogical structure in the early dialogues, and it was my contention in the previous post that Docuscope would detect these footings because they are exactly what a translator must preserve. Perhaps a more provocative claim, which I would like to advance now, is that the irony which attends this elenctic method, while not itself visible to Docuscope, might also require certain reliable linguistic pivots. In keeping with our analogy of the body of a dancer, certain upper body moves, like the ironic twist in which Socrates seems to be asking a question for the sake of clarification but is actually pushing his interlocutor into deeper confusion, require a lower body stance that can support the weight of the move. If we could define this lower body stance, we would not be defining Socratic irony itself, but rather its linguistic correlates. (At some point, the analogy will break down, since language is not a “weight-bearing system”; but it does support gestures and turns, so let’s see how far we can go with it.)
What exactly is happening in these early dialogues that Docuscope and principal component analysis are able to see from afar? Here is a scree plot which ranks the principal components by their power as they are derived sequentially, from most powerful to least:
The first two principal components are shown here to be quite powerful: together they account for almost 54% of the variation in the entire corpus. When we rate all of the dialogues on just these first two components, we get the following bubble plot:
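For readers who want to see where a figure like “almost 54%” comes from, here is a minimal sketch of the arithmetic behind a scree plot. Each component's share of the variation is its eigenvalue divided by the sum of all eigenvalues. The eigenvalues below are invented for illustration and are not the values from this analysis; the function names are likewise my own.

```python
# Illustrative eigenvalues only: these are NOT the values from the Plato analysis.
eigenvalues = [4.0, 2.0, 1.5, 1.0, 0.8, 0.7]

def explained_variance(eigs):
    """Return each component's share of total variance, in the order given."""
    total = sum(eigs)
    return [e / total for e in eigs]

def cumulative(shares):
    """Running total of the shares, which is what a scree plot lets us eyeball."""
    out, running = [], 0.0
    for s in shares:
        running += s
        out.append(running)
    return out

shares = explained_variance(eigenvalues)
cum = cumulative(shares)

# Print one line per component: its own share and the cumulative share so far.
for i, (s, c) in enumerate(zip(shares, cum), start=1):
    print(f"PC{i}: {s:.1%} of variance, {c:.1%} cumulative")
```

The cumulative share after the second component is the number to compare with the claim that the first two components, taken together, account for a given percentage of the variation.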
I have highlighted the upper left quadrant, where almost all of the dialogues that Vlastos identified as “early” are clustering. Their presence in this quadrant means that they score low on PC1 and high on PC2. PC1 might be described as an anti-early component, because it powerfully discriminates against early dialogues. PC2, on the other hand, might be described as a pro-early component, since its highly loaded variables are more frequently used in early dialogues. We can literally see the sorting power of these two components here, but it can also be quantified by the Tukey test, which was applied to both principal components, the results being available here and here. Note that the Apology is one of the most strongly “early” dialogues by these measures, whereas the Timaeus is one of the least early. We will pay closer attention to these two dialogues as a way of exemplifying the differences that Docuscope sees between the two types of items.
Before making the comparison, let’s look at the variables that are most powerfully loaded on these components and so are most responsible for discriminating the early/non-early difference. We do this either by consulting the loadings of our variables on the two principal components or by looking at a biplot which arrays those variables in two dimensions, exactly the two that were used to produce the bubble plot above. First the loadings scores (reported as eigenvectors) and then the loadings biplot:
The loadings biplot (lower diagram) is a two dimensional image of the loadings scores (upper diagram), showing how these variables behave with respect to one another in the entire corpus. Clusters of words whose vectors point in opposite directions, 180 degrees apart (for example, [Public_Values] and [Special_Referencing]), tend not to co-occur with one another in the same text. Here we are interested in what makes a particular text cluster in the upper left-hand quadrant, so we are looking for vectors (red arrows) that extend furthest to the left and to the top of the diagram. Vectors extending to the left are: Reasoning, Interactivity, Directing Action, Interior Mind and First Person. (These are the clusters that have significant negative loadings in the first column of the top diagram: if an item scores high on words contained in these clusters, it will be “punished” for that abundance with a lower PC1 score and pushed to the left of the plot, like the red dots above.) Note that we can also use our 180-degree rule to say something about items that are far left in the bubble plot: they must lack items contained in the clusters that are positively loaded on PC1, which are Narrating, Description, and Time Orientation.
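The reading of opposed vectors can be made concrete by measuring the angle between two loading vectors. What follows is only a sketch: the (PC1, PC2) coordinates are hypothetical, chosen to mimic the directions described here, and the angle in a two dimensional biplot only approximates how the clusters relate in the full corpus.

```python
import math

# Hypothetical (PC1, PC2) loadings, invented for illustration only.
loadings = {
    "Public_Values": (-0.10, 0.45),
    "Special_Referencing": (0.12, -0.40),
    "Emotion": (-0.05, 0.42),
}

def angle_degrees(u, v):
    """Angle between two 2-D loading vectors, in degrees (0 to 180)."""
    dot = u[0] * v[0] + u[1] * v[1]
    norm = math.hypot(*u) * math.hypot(*v)
    # Clamp to guard against tiny floating-point overshoot before acos.
    cos = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos))

# Near 180 degrees: the clusters tend not to co-occur in the same text.
opposed = angle_degrees(loadings["Public_Values"], loadings["Special_Referencing"])

# Near 0 degrees: the clusters tend to rise and fall together.
aligned = angle_degrees(loadings["Public_Values"], loadings["Emotion"])

print(f"Public_Values vs Special_Referencing: {opposed:.1f} degrees")
print(f"Public_Values vs Emotion: {aligned:.1f} degrees")
```

An angle near 90 degrees would suggest that two clusters vary more or less independently of one another along these two components.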
Similarly, with PC2, we are looking for the tall vectors heading upward: Emotion, Public Values and Topical Flow. Having tokens that were counted under these clusters will push an item up in the diagram, as will lacking items from the negatively loaded clusters: Directing Readers, Elaborating, Special Referencing. Note that Topical Flow (which is often populated by third person pronoun use) is loaded positively for the second principal component, but also positively for the first, which makes it fork upward and to the right. This means that an item scoring high on Topical Flow tokens will probably lack some of the items to the far left and contain items to the far right, which may discourage that item’s appearance in our “early” quadrant unless there are differences in these other variables.
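The pushing and pulling described above amounts to a weighted sum: an item's score on a component is the sum of its standardized cluster frequencies, each multiplied by that cluster's loading. The sketch below assumes standardized (z-scored) frequencies and uses hypothetical loadings whose signs follow the text but whose magnitudes are invented; cross loadings not discussed here are set to zero for simplicity, and the two sample items are made up rather than measured from any dialogue.

```python
# Hypothetical loadings on (PC1, PC2). Signs follow the discussion above;
# magnitudes are invented, and undiscussed cross loadings are set to 0.0.
LOADINGS = {
    "Interactivity": (-0.40, 0.0),
    "First_Person": (-0.35, 0.0),
    "Narrating": (0.40, 0.0),
    "Description": (0.35, 0.0),
    "Public_Values": (0.0, 0.45),
    "Emotion": (0.0, 0.40),
    "Special_Referencing": (0.0, -0.40),
}

def pc_scores(z):
    """Project standardized cluster frequencies onto PC1 and PC2."""
    pc1 = sum(z[c] * LOADINGS[c][0] for c in LOADINGS)
    pc2 = sum(z[c] * LOADINGS[c][1] for c in LOADINGS)
    return pc1, pc2

def quadrant(pc1, pc2):
    """Name the quadrant of the bubble plot in which a score pair falls."""
    vertical = "upper" if pc2 > 0 else "lower"
    horizontal = "left" if pc1 < 0 else "right"
    return f"{vertical} {horizontal}"

# Made-up item rich in dialogue-like clusters and poor in narrative ones.
dialogue_like = {
    "Interactivity": 1.5, "First_Person": 1.2, "Narrating": -1.0,
    "Description": -0.8, "Public_Values": 1.0, "Emotion": 0.9,
    "Special_Referencing": -1.1,
}
# Its mirror image: every standardized frequency has the opposite sign.
narrative_like = {c: -v for c, v in dialogue_like.items()}

print(quadrant(*pc_scores(dialogue_like)))
print(quadrant(*pc_scores(narrative_like)))
```

Note that both abundance and absence do work here: a below-average frequency on a positively loaded cluster lowers the score just as an above-average frequency on a negatively loaded cluster does.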
I have discussed some of these clusters in earlier posts about Shakespeare, so my main focus here will not be on elaborating the contents of the clusters. Rather, I want to use these loadings to zero in on specific words in exemplary passages from the early and later dialogues to see what is captured and then leave it to readers to say what these particular tokens are doing. Looking at our bubble plot above, the two dialogues that exemplify these opposing linguistic trends — in translation — are the Apology and the Timaeus.
Here are two passages from the Apology that exemplify “earliness” in the Platonic corpus, if we agree that the clustering above seems compelling. Note that these are screenshots from Docuscope in which the clusters that are doing the work of pushing the texts up and to the left are turned on or color coded. I have not turned on the clusters that are absent, since these will be exemplified in the Timaeus:
I think these passages are certainly illustrative of the elenctic method described by Vlastos, although it ought to be said that the high amount of dialogical interaction here (one that was a hallmark of comedy in Shakespearean drama) is sometimes implied by Socrates rather than really enacted by both speakers. That is, Socrates sometimes simulates a dialogue that is not really happening (“to him I may fairly answer”), and this procedure actually multiplies the Interactivity strings (sky blue) beyond what might be the case in actual interaction. Note too that Docuscope is seeing lots of Public Values words, words that gesture toward communally sanctioned values, in this earlier style: demigods, heroes, fairly, mistaken, good for, doing right, disgrace. These values must be cited in elenctic exchange because they are the topic of conversation (people have opinions about them), but such implied communality may also coerce assent from an interlocutor for reasons that extend beyond mere shame at self-contradiction. We see, too, more emotionally charged words (in orange); the occasional Topical Flow token (their); and some Reasoning tokens (if he, thus, may, do not).
Now look at a passage from the Timaeus, which does the things that items in the early quadrant (on the whole) cannot do:
This is cosmogony, not dialogue, which is why we have a number of Narrative strings (the year when, then, the night, overtaken the, as they) and Description strings (orbit, the moon, stars, sun, wanderings, motion, swiftness). Special Referencing here is picking up a lot of abstract references (dark purple) such as animals, measure, relative, the whole, nature, variety and degrees. The slightly lighter purple, Reporting strings, are complementing the Narrative tokens: having, completion, After this, came into being, received, to the end that, created. This should not be surprising since the two vectors for these clusters were almost overlapping in the loadings biplot above.
Whereas the Apology is staging a dialogue (real or implied), the Timaeus is creating a world and pacing that act of creation (through narrative) with a set of abstract terms that can be referenced in conversation. Indeed, one of the burdens of this kind of world-making, I think, is that the abstractions must be folded in with the concrete descriptions in equal measure so that the passage is something more than a Georgic description of a natural scene or a praise poem to nature. Note too that there is absolutely no irony in this passage from the Timaeus. That is not because Docuscope has a category that allows it to discern irony in its local environs and so rule out such an effect in the Timaeus: only a human being can make such a discrimination, by virtue of being able to look beyond the simple mentioning of words to assess their use. (For Docuscope, all counted words are mentionings of words whose single use has been classed a priori in the categories assigned to them.)
And yet, even in translation, Docuscope may be identifying the linguistic footings of irony: a necessary but not sufficient condition for its use.