A year ago I had a conversation with Miron Livny (UW Computer Science, Morgridge Institute) about the work we’ve been doing with Docuscope, and he asked an interesting question. “Are there any knobs that can be twisted?” he asked. Livny was alluding to the fact that tagging is a static procedure: once you’ve decided which tokens will be classified as a particular type, you will always get the same results through counting. Findings are determined the instant you decide what to count, since you are only counting these things. But what about any incremental variables or procedures that might allow us to see what happens when there is more or less of something – a more or less that we, rather than the author, control?
One idea was to begin systematically perturbing the dictionaries used by Docuscope, migrating, say, every nth word from one LAT type to the next, and doing this sequentially until one began to find results that were “more” interpretable. This would be computationally quite demanding, and so would push our techniques further in the direction of high-throughput computing. But it might also raise basic questions about the nature of the dictionaries, their susceptibility to random or arbitrary re-disposition, and the sensitivity of our results to such dispositions. One might think of such an experiment as a variation on the Oulipian “N + 7” rule, and there is definitely some connection between the type of computational approach that Hope and I have been calling “iterative criticism” and the exploitation of the arbitrary one finds in Oulipo poetics, or even Burroughsian cut-up and collage.
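To make the perturbation idea concrete, here is a minimal sketch in Python. It is not Docuscope's actual dictionary format or matching procedure: the category names (TypeA, TypeB, TypeC), the single-word entries, and the functions perturb and count_categories are all hypothetical, invented for illustration. It assumes a dictionary is simply a mapping from a type name to a list of words, and that "the next" type means the next one in listing order, wrapping around at the end.

```python
from collections import Counter

def perturb(dictionary, n):
    """Move every nth entry of each type into the next type (wrapping around).

    `dictionary` maps a type name to a list of words. This toy layout is an
    assumption for illustration, not Docuscope's real dictionary format.
    """
    names = list(dictionary)
    perturbed = {name: [] for name in names}
    for i, name in enumerate(names):
        next_name = names[(i + 1) % len(names)]
        for position, word in enumerate(dictionary[name], start=1):
            # Entries at positions n, 2n, 3n, ... migrate; the rest stay put.
            target = next_name if position % n == 0 else name
            perturbed[target].append(word)
    return perturbed

def count_categories(tokens, dictionary):
    """Count how many tokens fall under each type (single-word matching only)."""
    lookup = {word: name for name, words in dictionary.items() for word in words}
    return Counter(lookup[token] for token in tokens if token in lookup)

# Hypothetical toy dictionary and text.
toy_dictionary = {
    "TypeA": ["i", "me", "my", "mine"],
    "TypeB": ["run", "walk", "leap", "fall"],
    "TypeC": ["no", "not", "never", "none"],
}
tokens = "i will not run and i will never fall".split()

baseline = count_categories(tokens, toy_dictionary)
shifted = count_categories(tokens, perturb(toy_dictionary, 2))

print(dict(baseline))  # {'TypeA': 2, 'TypeC': 2, 'TypeB': 2}
print(dict(shifted))   # {'TypeA': 3, 'TypeB': 1, 'TypeC': 2}
```

Here n is the knob: the text stays fixed while the counts move as n changes. Running the same comparison over a whole corpus, for many values of n and many orderings of the types, is the part that would call for high-throughput computing.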
Mike Stumpf has found a knob to turn – the amount of a particular character’s lines in a play – and has been turning it, with some interesting results. I’ve posted some questions on the post itself, but I think it is an interesting and provocative extension of some of the techniques we have been exploring.