{"id":2271,"date":"2015-07-06T15:44:29","date_gmt":"2015-07-06T20:44:29","guid":{"rendered":"http:\/\/winedarksea.org\/?p=2271"},"modified":"2025-02-10T17:29:14","modified_gmt":"2025-02-10T22:29:14","slug":"finding-distances-between-shakespeares-plays-2-projecting-distances-onto-new-bases-with-pca","status":"publish","type":"post","link":"https:\/\/winedarksea.org\/?p=2271","title":{"rendered":"Finding &#8220;Distances&#8221; Between Shakespeare&#8217;s Plays 2: Projecting Distances onto New Bases with PCA"},"content":{"rendered":"<p>It&#8217;s hard to conceive of distance measured in\u00a0anything other than a straight line. The biplot below, for example, shows the scores of Shakespeare&#8217;s plays on\u00a0the two Docuscope LATs discussed in the <a href=\"http:\/\/winedarksea.org\/?p=2225\">previous post<\/a>, FirstPerson and AbstractConcepts:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2308\" src=\"http:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/BScreen-Shot-2015-06-22-at-10.01.43-PM2-e1435899027618.png\" alt=\"BScreen-Shot-2015-06-22-at-10.01.43-PM2-e1435899027618\" width=\"795\" height=\"785\" srcset=\"https:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/BScreen-Shot-2015-06-22-at-10.01.43-PM2-e1435899027618.png 795w, https:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/BScreen-Shot-2015-06-22-at-10.01.43-PM2-e1435899027618-300x296.png 300w\" sizes=\"auto, (max-width: 795px) 100vw, 795px\" \/>Plotting\u00a0the items in\u00a0two\u00a0dimensions gives the viewer some general sense of the shape of the data. &#8220;There are more items here, less there.&#8221; But when it comes to thinking about distances between texts, we often measure straight across, favoring either a\u00a0simple line linking two\u00a0items or a line that links\u00a0the perceived centers of groups.<\/p>\n<p>The appeal of the line is strong, perhaps because it is one dimensional. And brutally so. 
We favor the simple line\u00a0because\u00a0we want to see less, not more. Even if we are looking at a biplot,\u00a0we can narrow\u00a0distances to\u00a0one dimension\u00a0by drawing lines athwart\u00a0the axes. The red lines linking points above \u2014 each the diagonal of a right triangle whose sides are parallel to our axes \u2014\u00a0will be straight and relatively easy to find. The line is simple, but its\u00a0meaning is somewhat abstract\u00a0because it spans\u00a0<em>two<\/em> distinct kinds of distance\u00a0at once.<\/p>\n<p>Distances between items become slightly\u00a0less abstract when things are represented\u00a0in\u00a0an ordered list. Scanning down the &#8220;text_name&#8221; column below, we know that items further down have less of the measured feature than those further up. There is a sequence here and, so, an order of sorts:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2303\" src=\"http:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/Screen-Shot-2015-07-03-at-9.49.01-AM.png\" alt=\"Screen Shot 2015-07-03 at 9.49.01 AM\" width=\"184\" height=\"718\" \/><\/p>\n<p>If we understand what is being measured, an ordered list can be quite suggestive. This one, for example, tells me that <em>The Comedy of Errors<\/em> has more FirstPerson tokens than <em>The Tempest<\/em>. But it also tells me, by virtue of the way it <em>arranges<\/em> the plays along a single axis, that the more FirstPerson Shakespeare uses in a play, the more likely it is that this play\u00a0is\u00a0a comedy. There are statistically precise ways of saying what &#8220;more&#8221; and &#8220;likely&#8221; mean in the previous sentence, but you don&#8217;t need those measures to appreciate the pattern.<\/p>\n<p>What if I prefer the simplicity of an ordered list, but want nevertheless to work with distances measured in more than one dimension? 
To get what I want,\u00a0I will have to find some meaningful way of associating the measurements on these two dimensions and, by virtue of that association, reducing them to\u00a0a single measurement on a new (invented) variable. I want distances on a line, but I want to derive those distances from\u00a0more than one type of measurement.<\/p>\n<p>My next\u00a0task, then, will be\u00a0to quantify the <em>joint participation<\/em>\u00a0of these two variables in patterns found across the\u00a0corpus. Instead of looking at both of the received\u00a0measurements (scores on FirstPerson and AbstractConcepts),\u00a0I want to\u00a0&#8220;project&#8221; the information from these\u00a0two axes onto a new, single axis, extracting\u00a0relevant information from both. This projection would be a reorientation of the data on a single new axis, a\u00a0change accomplished by Principal Components Analysis or PCA.<\/p>\n<p>To understand better how PCA works, let&#8217;s continue working with the two LATs\u00a0plotted above. Recall from the <a href=\"http:\/\/winedarksea.org\/?p=2225\">previous post<\/a> that these\u00a0are the Docuscope scores we obtained from <a href=\"http:\/\/vep-test.cs.wisc.edu\/ubiq\/\">Ubiqu+ity<\/a> and put into mean deviation form. A .csv file containing those scores can be found <a href=\"http:\/\/winedarksea.org\/?attachment_id=2274\" target=\"_blank\">here<\/a>. 
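For readers who would rather follow along in Python than in the Excel and R workflow used below, the two preliminary steps (reading the scores, then putting them in mean deviation form) can be sketched with numpy. The file contents and play names here are invented stand-ins, not the actual Ubiqu+ity output:

```python
import io
import numpy as np

# Invented stand-in for the downloadable .csv of Docuscope scores
csv = io.StringIO(
    "text_name,FirstPerson,AbstractConcepts\n"
    "Play A,3.1,1.2\n"
    "Play B,2.4,1.9\n"
    "Play C,4.0,0.8\n"
)
rows = np.genfromtxt(csv, delimiter=",", names=True, dtype=None, encoding="utf-8")
scores = np.column_stack([rows["FirstPerson"], rows["AbstractConcepts"]])

# Mean deviation form: subtract each column's mean, so that
# every column of X sums to (effectively) zero
X = scores - scores.mean(axis=0)
print(X.sum(axis=0))
```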
In what follows, we will be feeding those scores into an Excel <a href=\"http:\/\/winedarksea.org\/?attachment_id=2276\" target=\"_blank\">spreadsheet<\/a> and into the open source statistics package &#8220;R&#8221; using <a href=\"http:\/\/winedarksea.org\/?attachment_id=2290\" target=\"_blank\">code<\/a>\u00a0repurposed from a <a href=\"http:\/\/stats.stackexchange.com\/questions\/2691\/making-sense-of-principal-component-analysis-eigenvectors-eigenvalues\" target=\"_blank\">post<\/a> on PCA at Cross Validated by Antoni Parellada.<\/p>\n<p><span style=\"text-decoration: underline;\">A Humanist Learns PCA: The How and Why<\/span><\/p>\n<p>As Hope and I made greater use of unsupervised techniques such as PCA, I wanted a more concrete sense of how it worked. But to arrive at that sense, I had\u00a0to learn\u00a0things for which I had no visual intuition. Because I lack\u00a0formal training in\u00a0mathematics or statistics, I spent about two years (in all that spare time) learning\u00a0the ins and outs of\u00a0linear algebra, as well as some techniques from unsupervised learning. I did this with the help of a good\u00a0<a href=\"http:\/\/www.farinhansford.com\/books\/pla\/\">textbook<\/a>\u00a0and a <a href=\"https:\/\/www.khanacademy.org\/math\/linear-algebra\">course<\/a> on\u00a0linear algebra\u00a0at\u00a0Khan Academy.<\/p>\n<p>Having learned to do PCA &#8220;by hand,&#8221; I have decided here to\u00a0document that process for others wanting to try it for themselves. 
Over the course of this work,\u00a0I came to a\u00a0more intuitive understanding of the key move in PCA, which involves a\u00a0change of basis via\u00a0orthogonal\u00a0projection of the data onto a new axis.\u00a0I spent many months trying to understand what this means, and am now ready to try to explain or illustrate it to others.<\/p>\n<p>My starting point was an\u00a0excellent <a href=\"http:\/\/www.cs.cmu.edu\/~elaw\/papers\/pca.pdf\" target=\"_blank\">tutorial<\/a>\u00a0on PCA\u00a0by Jonathon\u00a0Shlens. Shlens shows why PCA is a good answer to a good question. If I believe that my measurements only incompletely capture the\u00a0underlying dynamics in my corpus, I should be asking\u00a0what new orthonormal bases I can find to\u00a0maximize the variance across those initial measurements and, so, provide better\u00a0grounds for interpretation.\u00a0If this post is successful,\u00a0you will finish it knowing\u00a0(a) why this type of variance-maximizing basis is a useful\u00a0thing to look for\u00a0and (b) what this very useful thing\u00a0looks like.<\/p>\n<p>On the matrix algebra side, PCA can be understood as the projection of the original data onto a new set of orthogonal axes or bases. As documented in the <a href=\"http:\/\/winedarksea.org\/?attachment_id=2276\" target=\"_blank\">Excel spreadsheet<\/a> and the tutorial, the procedure is performed on\u00a0our data matrix,\u00a0X,\u00a0where entries are in mean deviation form (spreadsheet item 1). Our task is then to create a 2&#215;2 covariance matrix S for\u00a0this original 38&#215;2 matrix X (item 2); find\u00a0the eigenvalues and eigenvectors for this covariance matrix S (item 3); then use this new matrix of orthonormal eigenvectors, P, to accomplish the rotation of\u00a0X (item 4). This rotation of X gives us our\u00a0new matrix\u00a0Y (item 5), which is the\u00a0linear transformation of X\u00a0according to\u00a0the new orthonormal bases contained in P. 
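Those five spreadsheet items can also be sketched in Python with numpy. This is an illustration only: the 38x2 matrix below is randomly generated, not the Folger data, and the variable names simply mirror the X, S, P, Y of the post:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the 38x2 matrix of LAT scores (two correlated columns)
raw = rng.normal(size=(38, 2)) @ np.array([[1.0, -0.6], [0.0, 0.8]])

# Item 1: put X in mean deviation form (subtract each column's mean)
X = raw - raw.mean(axis=0)

# Item 2: the 2x2 covariance matrix S of the 38x2 matrix X
S = (X.T @ X) / (X.shape[0] - 1)

# Item 3: eigenvalues and orthonormal eigenvectors of S
eigenvalues, P = np.linalg.eigh(S)       # eigh returns ascending order
order = np.argsort(eigenvalues)[::-1]    # re-sort so PC1 comes first
eigenvalues, P = eigenvalues[order], P[:, order]

# Items 4-5: rotate X onto the new orthonormal bases to get Y
Y = X @ P

# The rotated data is uncorrelated: cov(Y) is diagonal, with the
# eigenvalues (the variances along each new axis) on the diagonal
print(np.round(np.cov(Y.T), 6))
```

The diagonal covariance of Y is the whole point of the change of basis: all the correlation between the two original measurements has been absorbed into the new axes.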
The individual steps are described in Shlens and reproduced on this spreadsheet in terms that I hope summarize\u00a0his exposition. (I stand ready to make corrections.)<\/p>\n<p><span style=\"text-decoration: underline;\">The Spring Analogy<\/span><\/p>\n<p>In addition to exploring the assumptions and procedures involved in PCA, Shlens offers a suggestive concrete frame or &#8220;toy example&#8221; for thinking about it.\u00a0PCA can be helpful if you want to\u00a0identify\u00a0underlying dynamics that have been\u00a0both captured and obscured by initial measurements of a system. He stages\u00a0a physical analogy, proposing\u00a0the made-up situation in which the true axis of movement of a spring must\u00a0be inferred from haphazardly positioned cameras A, B and C. (That movement is along the X axis.)<\/p>\n<p><a href=\"http:\/\/winedarksea.org\/?attachment_id=2283\" rel=\"attachment wp-att-2283\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2283\" src=\"http:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/Screen-Shot-2015-07-02-at-6.50.53-AM.png\" alt=\"Screen Shot 2015-07-02 at 6.50.53 AM\" width=\"476\" height=\"380\" srcset=\"https:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/Screen-Shot-2015-07-02-at-6.50.53-AM.png 476w, https:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/Screen-Shot-2015-07-02-at-6.50.53-AM-300x239.png 300w\" sizes=\"auto, (max-width: 476px) 100vw, 476px\" \/><\/a><\/p>\n<div class=\"page\" title=\"Page 2\">\n<div class=\"layoutArea\">\n<div class=\"column\">\n<p>Shlens notes that\u00a0&#8220;we often do not know which measurements best reflect the dynamics of our system in question. 
Furthermore, we sometimes record more dimensions than we actually need!&#8221;\u00a0The idea that the\u00a0axis of greatest variance is also the axis that captures the\u00a0&#8220;underlying\u00a0dynamics&#8221; of the system is an important one, particularly in a situation\u00a0where measurements are correlated. This condition is called multicollinearity. We encounter it in text analysis all the time.<\/p>\n<p>If one is willing to entertain the thought\u00a0that (a) language behaves like a spring across a series of documents and (b) that LATs are like cameras that only imperfectly capture those underlying linguistic\u00a0&#8220;movements,&#8221; then PCA makes sense as a tool for dimension reduction. Shlens makes this point very clearly on page 7, where he notes that PCA works where it works because &#8220;large variances have important dynamics.&#8221; We need to spend more time thinking about what this linkage of variances and dynamics means when we&#8217;re talking about features of texts. We also need to think more about what it means to treat individual documents as <em>observations<\/em> within a larger system whose dynamics they are assumed to express.<\/p>\n<p><span style=\"text-decoration: underline;\">Getting to the Projections<\/span><\/p>\n<p>How might we go about picturing this mathematical process of\u00a0<a href=\"https:\/\/www.khanacademy.org\/math\/linear-algebra\/matrix_transformations\/lin_trans_examples\/v\/introduction-to-projections\" target=\"_blank\">orthogonal projection<\/a>?\u00a0Shlens&#8217;s tutorial focuses on matrix manipulation, which means that it does not help us\u00a0visualize\u00a0how the transformation matrix P assists in the projection of the original matrix onto the new bases. But we want to arrive at a\u00a0more geometrically explicit, and so perhaps intuitive, way of understanding the procedure. 
So let&#8217;s use the <a href=\"http:\/\/winedarksea.org\/?attachment_id=2290\" target=\"_blank\">code<\/a> I&#8217;ve provided for this post\u00a0to look at the same data we started with. These are\u00a0the\u00a0mean-subtracted values of the Docuscope LATs AbstractConcepts and FirstPerson in the Folger Editions of\u00a0Shakespeare&#8217;s plays. <img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2235 size-full\" src=\"http:\/\/winedarksea.org\/wp-content\/uploads\/2015\/06\/Screen-Shot-2015-06-22-at-9.43.18-PM.png\" alt=\"Screen Shot 2015-06-22 at 9.43.18 PM\" width=\"335\" height=\"726\" srcset=\"https:\/\/winedarksea.org\/wp-content\/uploads\/2015\/06\/Screen-Shot-2015-06-22-at-9.43.18-PM.png 335w, https:\/\/winedarksea.org\/wp-content\/uploads\/2015\/06\/Screen-Shot-2015-06-22-at-9.43.18-PM-138x300.png 138w\" sizes=\"auto, (max-width: 335px) 100vw, 335px\" \/>To get started, you must place the\u00a0<a href=\"http:\/\/winedarksea.org\/?attachment_id=2274\" target=\"_blank\">.csv file<\/a>\u00a0containing the data above into your R\u00a0working directory, a directory you can change using the\u00a0Misc. tab. Paste the entire text of the <a href=\"http:\/\/winedarksea.org\/?attachment_id=2290\" target=\"_blank\">code<\/a> into the R prompt window and press enter.\u00a0Within that window, you will now see several means of calculating the covariance matrix (S) from the initial matrix (X) and then deriving eigenvectors (P) and final scores (Y) using both the automated R functions and &#8220;longhand&#8221; matrix multiplication. If you&#8217;re checking, the results here match those derived from the manual operations documented in\u00a0the Excel spreadsheet, albeit with an occasional sign change in P. In the Quartz graphic device (a separate window), we will find five\u00a0different images corresponding to five\u00a0different views of this data. 
You can step through these images by pressing control and an arrow key at the same time.<\/p>\n<p>The first view is a centered scatterplot of the measurements above on our received or &#8220;naive bases,&#8221; which are\u00a0our two Docuscope LATs. These initial axes already give us important information about distances between texts. I repeat the biplot from the top of the post, which shows that according to these bases, <em>Macbeth<\/em> is the second &#8220;closest&#8221; play to <em>Henry V <\/em>(sitting down and\u00a0to the right of <em>Troilus and Cressida<\/em>, which is first):<\/p>\n<p><a href=\"http:\/\/winedarksea.org\/?attachment_id=2241\" rel=\"attachment wp-att-2241\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-2241 size-full\" src=\"http:\/\/winedarksea.org\/wp-content\/uploads\/2015\/06\/Screen-Shot-2015-06-22-at-10.01.43-PM2-e1435899027618.png\" alt=\"Screen Shot 2015-06-22 at 10.01.43 PM\" width=\"795\" height=\"785\" srcset=\"https:\/\/winedarksea.org\/wp-content\/uploads\/2015\/06\/Screen-Shot-2015-06-22-at-10.01.43-PM2-e1435899027618.png 795w, https:\/\/winedarksea.org\/wp-content\/uploads\/2015\/06\/Screen-Shot-2015-06-22-at-10.01.43-PM2-e1435899027618-300x296.png 300w\" sizes=\"auto, (max-width: 795px) 100vw, 795px\" \/><\/a><\/p>\n<p>Now we look at the second image, which adds to the plot above a line that\u00a0is the eigenvector corresponding to the highest eigenvalue for the covariance matrix S. 
This is the line that, by definition, maximizes the\u00a0variance in our two-dimensional\u00a0data:<img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-2291\" src=\"http:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/Screen-Shot-2015-07-02-at-10.52.32-PM.png\" alt=\"Screen Shot 2015-07-02 at 10.52.32 PM\" width=\"540\" height=\"538\" srcset=\"https:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/Screen-Shot-2015-07-02-at-10.52.32-PM.png 808w, https:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/Screen-Shot-2015-07-02-at-10.52.32-PM-150x150.png 150w, https:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/Screen-Shot-2015-07-02-at-10.52.32-PM-300x300.png 300w\" sizes=\"auto, (max-width: 540px) 100vw, 540px\" \/>You can see that each point is <em>projected<\/em> orthogonally onto this new line, which will become the new basis or first principal component once the rotation has occurred. This maximum is calculated by summing\u00a0the squared distances of each\u00a0perpendicular intersection point\u00a0(where gray meets red) from the mean value at the center of the graph. This red line is like the\u00a0single camera that would &#8220;replace,&#8221; as it were, the haphazardly placed cameras in Shlens&#8217;s toy example. If we agree with the assumptions made by PCA, we infer that this axis represents the\u00a0main dynamic in the system, a\u00a0key &#8220;angle&#8221; from which we can view\u00a0that dynamic at work.\u00a0<a href=\"http:\/\/winedarksea.org\/?attachment_id=2291\" rel=\"attachment wp-att-2291\"><br \/>\n<\/a><\/p>\n<p>The orthonormal assumption makes it easy to plot the next line (black), which is the\u00a0eigenvector corresponding to\u00a0our second, lesser eigenvalue. The measured distances along this\u00a0axis (where gray meets black) represent scores on the second basis or principal component, which by design eliminates correlation with the first. 
You might think of the\u00a0variance along this line as the uncorrelated &#8220;leftover&#8221; from that which was\u00a0captured along\u00a0the first new axis. As you can see, intersection points cluster more closely around the mean point in the center of this line than they did around the first:<\/p>\n<p><a href=\"http:\/\/winedarksea.org\/?attachment_id=2293\" rel=\"attachment wp-att-2293\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2293\" src=\"http:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/Screen-Shot-2015-07-02-at-11.09.16-PM.png\" alt=\"Screen Shot 2015-07-02 at 11.09.16 PM\" width=\"809\" height=\"808\" srcset=\"https:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/Screen-Shot-2015-07-02-at-11.09.16-PM.png 809w, https:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/Screen-Shot-2015-07-02-at-11.09.16-PM-150x150.png 150w, https:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/Screen-Shot-2015-07-02-at-11.09.16-PM-300x300.png 300w\" sizes=\"auto, (max-width: 809px) 100vw, 809px\" \/><\/a>Now we perform the change of basis, multiplying the initial matrix X by the transformation matrix P. This projection (using the gray guide lines above) onto the new axis is a <em>rotation<\/em> of the original data around the origin. 
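A quick numerical check of what this rotation does, again with invented points rather than the play data: because P is orthonormal, the change of basis Y = XP preserves every pairwise distance between items; it is only the later step of keeping a single component that changes them:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(10, 2))
X = X - X.mean(axis=0)           # mean deviation form

vals, P = np.linalg.eigh(np.cov(X.T))
P = P[:, np.argsort(vals)[::-1]]
Y = X @ P                        # rotation onto the principal axes

def pairwise(A):
    # Euclidean distance between every pair of rows of A
    return np.linalg.norm(A[:, None, :] - A[None, :, :], axis=-1)

# An orthonormal change of basis is rigid: all distances survive intact
print(np.allclose(pairwise(X), pairwise(Y)))          # True

# Keeping only PC1 (forcing every point onto one line) is what
# actually alters the distances between items
print(np.allclose(pairwise(X), pairwise(Y[:, :1])))   # False
```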
For the sake of explication, I highlight\u00a0the resulting projection along the\u00a0first component in red, the axis that (as we\u00a0remember) accounts for the largest amount of variance:<\/p>\n<p><a href=\"http:\/\/winedarksea.org\/?attachment_id=2294\" rel=\"attachment wp-att-2294\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2294\" src=\"http:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/Screen-Shot-2015-07-02-at-11.18.41-PM.png\" alt=\"Screen Shot 2015-07-02 at 11.18.41 PM\" width=\"803\" height=\"802\" srcset=\"https:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/Screen-Shot-2015-07-02-at-11.18.41-PM.png 803w, https:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/Screen-Shot-2015-07-02-at-11.18.41-PM-150x150.png 150w, https:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/Screen-Shot-2015-07-02-at-11.18.41-PM-300x300.png 300w\" sizes=\"auto, (max-width: 803px) 100vw, 803px\" \/><\/a>If we now<em> force all of our dots onto the red line<\/em>\u00a0along their perpendicular gray pathways, we eliminate the\u00a0second dimension (Y axis, or PC2), projecting the data onto a single line, which is the new basis represented by the first principal component.<\/p>\n<p><a href=\"http:\/\/winedarksea.org\/?attachment_id=2295\" rel=\"attachment wp-att-2295\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2295\" src=\"http:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/Screen-Shot-2015-07-02-at-11.44.42-PM.png\" alt=\"Screen Shot 2015-07-02 at 11.44.42 PM\" width=\"753\" height=\"752\" srcset=\"https:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/Screen-Shot-2015-07-02-at-11.44.42-PM.png 753w, https:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/Screen-Shot-2015-07-02-at-11.44.42-PM-150x150.png 150w, https:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/Screen-Shot-2015-07-02-at-11.44.42-PM-300x300.png 300w\" sizes=\"auto, (max-width: 753px) 100vw, 753px\" 
\/><\/a><\/p>\n<p>We can now create a\u00a0list of the plays ranked, in descending order, on this first\u00a0and most principal\u00a0component. This list of distances represents\u00a0the\u00a0reduction of the\u00a0two initial\u00a0dimensions to a single one, a reduction <em>motivated<\/em> by our desire to capture the most variance in a single direction.<\/p>\n<p>How does this projection change the distances among our items? The comparison below shows the measurements, in rank order, of the far ends\u00a0of our initial two variables (AbstractConcepts and FirstPerson) <em>and<\/em>\u00a0of our new variable (PC1). You can see that the plays have been re-ordered and the distances between them changed:<\/p>\n<p><a href=\"http:\/\/winedarksea.org\/?attachment_id=2296\" rel=\"attachment wp-att-2296\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2296\" src=\"http:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/Screen-Shot-2015-07-03-at-12.14.04-AM.png\" alt=\"Screen Shot 2015-07-03 at 12.14.04 AM\" width=\"825\" height=\"183\" srcset=\"https:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/Screen-Shot-2015-07-03-at-12.14.04-AM.png 825w, https:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/Screen-Shot-2015-07-03-at-12.14.04-AM-300x66.png 300w\" sizes=\"auto, (max-width: 825px) 100vw, 825px\" \/><\/a><\/p>\n<p>Our new basis, PC1, looks like it is capturing some dynamic that we might connect to what the creators of the First Folio (1623) labeled as\u00a0&#8220;comedy.&#8221; When we look at similar ranked lists for our initial two variables, we see that individually they too seemed to be connected with &#8220;comedy,&#8221; in the sense that a relative <em>lack<\/em> of one (AbstractConcepts) and an <em>abundance<\/em> of the other (FirstPerson) both seem to\u00a0contribute to a play&#8217;s being labelled a comedy. 
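The mechanics of such a ranked list can be sketched in a few lines of Python. The play names and scores below are hypothetical, not the Folger measurements: each item's PC1 score is a weighted sum of its two LAT scores, with the weights taken from the first eigenvector, and the list is simply those scores sorted:

```python
import numpy as np

# Hypothetical mean-deviation scores (not the Folger numbers):
# columns are (AbstractConcepts, FirstPerson)
plays = ["Play A", "Play B", "Play C", "Play D"]
X = np.array([[ 0.9, -1.1],
              [-0.7,  1.2],
              [ 0.2, -0.3],
              [-0.4,  0.2]])

# The first eigenvector supplies the weights; each play's PC1 score
# is the weighted sum w1*AbstractConcepts + w2*FirstPerson
vals, vecs = np.linalg.eigh(np.cov(X.T))
w = vecs[:, np.argmax(vals)]
pc1 = X @ w

# Rank the plays, in descending order, on the new single axis
for name, score in sorted(zip(plays, pc1), key=lambda t: -t[1]):
    print(f"{name}\t{score:+.3f}")
```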
Recall\u00a0that these two variables showed a negative covariance in the initial analysis, so this finding is unsurprising.<\/p>\n<p>But what PCA has done is combine these two variables into a new one, which is a linear combination of the scores according to weighted\u00a0coefficients (found in\u00a0the first eigenvector). If you are low on this new variable, you are likely to be a comedy. We might want to come up with a name for PC1, which represents the combined, re-weighted power of the first two variables. If we call it the &#8220;anti-comedy&#8221; axis \u2014\u00a0you can&#8217;t be comic if you have a lot of it! \u2014\u00a0then we&#8217;d be <em>aligning<\/em> the sorting power of this new projection with what literary critics and theorists call &#8220;genre.&#8221; Remember that aligning these two things\u00a0is not the same as\u00a0saying\u00a0one is the cause of the other.<\/p>\n<p>With a sufficient sample size, this\u00a0procedure for reducing dimensions could be\u00a0performed on a\u00a0dozen measurements or variables, transforming that naive set of bases into principal components that (a) maximize the variance in the data and, one hopes, (b) call attention to the dynamics expressed in texts conceived as\u00a0a &#8220;system.&#8221; If you see PCA performed on three\u00a0variables rather than\u00a0two, you should imagine the variance-maximizing projection above repeated with a plane in three-dimensional space:<\/p>\n<p><a href=\"http:\/\/winedarksea.org\/?attachment_id=2317\" rel=\"attachment wp-att-2317\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-2317\" src=\"http:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/orthoregdemo_02.png\" alt=\"orthoregdemo_02\" width=\"560\" height=\"420\" srcset=\"https:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/orthoregdemo_02.png 560w, https:\/\/winedarksea.org\/wp-content\/uploads\/2015\/07\/orthoregdemo_02-300x225.png 300w\" sizes=\"auto, (max-width: 
560px) 100vw, 560px\" \/><\/a><\/p>\n<p>Add yet another dimension, and\u00a0you can still find the &#8220;hyperplane&#8221; which\u00a0will maximize the variance\u00a0along a\u00a0new basis in that multidimensional space. But you will not be able to imagine it.<\/p>\n<p>Because principal components are mathematical artifacts\u00a0\u2014\u00a0no one <em>begins<\/em> by measuring an imaginary\u00a0combination of variables \u2014\u00a0they\u00a0must be interpreted. In this\u00a0admittedly contrived example from Shakespeare, the\u00a0imaginary projection of our existing data onto\u00a0the first principal component, PC1, happens to connect meaningfully with one of the sources of variation\u00a0we already look for in\u00a0cultural systems: genre. A corpus of many more\u00a0plays, covering a longer period of time and more authors, could become\u00a0the basis for still\u00a0more\u00a0projections\u00a0that would call attention to other dynamics we want to\u00a0study, for example, authorship, period style, social coterie or inter-company theatrical rivalry.<\/p>\n<p>I\u00a0end by emphasizing the interpretability of principal components because we humanists\u00a0may be tempted to see them as something other than mathematical artifacts, which is to say, something other than principled creations of the imagination. Given the data and the goal of maximizing variance through projection, many people could come up with the same results that I have produced\u00a0here. But there will always be a question about what to call\u00a0the &#8220;underlying dynamic&#8221; a given\u00a0principal component is supposed to capture, or even about whether a component corresponds to something meaningful in the data. 
The ongoing work of interpretation, beginning with the task of\u00a0naming what a principal component is capturing, is not going to disappear just because we have learned to work with mathematical \u2014 as opposed to literary critical \u2014 tools and terms.<\/p>\n<div id=\"Axes\">\n<p><span style=\"text-decoration: underline;\">Axes,\u00a0Critical\u00a0Terms, and\u00a0Motivated Fictions<\/span><\/p>\n<p>Let us return to the idea that a mathematical change of basis might call our attention to an underlying dynamic in a &#8220;system&#8221; of texts. If, per Shlens&#8217;s analogy,\u00a0PCA works by finding the ideal angle from which to view\u00a0the oscillations of the spring, it does so by finding a better <em>proxy<\/em> for the underlying phenomenon. PCA doesn&#8217;t give you the spring; it gives you a better angle from which to view the\u00a0spring. There is nothing about the spring analogy or about PCA that contradicts the possibility that the system being analyzed could be <em>much<\/em> more complicated \u2014\u00a0could contain many more dynamics.\u00a0Indeed, there is nothing to stop a dimension reduction technique like PCA from finding dynamics that we\u00a0will never be able to observe or name.<\/p>\n<p>Part of what the humanities do is cultivate empathy\u00a0and a lively situational imagination, encouraging us to ask, &#8220;What would it be like\u00a0to be this kind of person in this kind of situation?&#8221; That&#8217;s often how we find our way into plays, how we discover &#8220;where the system&#8217;s energy is.&#8221; But the humanities is also a field of inquiry. 
The\u00a0enterprise\u00a0advances\u00a0every time someone refines one of\u00a0our explanatory concepts and critical terms, terms such as &#8220;genre,&#8221; &#8220;period,&#8221; &#8220;style,&#8221; &#8220;reception,&#8221; or &#8220;mode of production.&#8221;<\/p>\n<p>We\u00a0might think of these critical terms as <em>the humanities equivalent of<\/em> <em>a<\/em>\u00a0<em>mathematical basis<\/em> on which multidimensional data are projected.\u00a0Saying that Shakespeare wrote &#8220;tragedies&#8221;\u00a0reorients the data and projects a host of small observations\u00a0on a new &#8220;axis,&#8221; as it were, an axis that\u00a0somehow summarizes and so clarifies a much more complex set of comparisons and variances than we could ever state economically. Like geometric axes, critical terms such as\u00a0&#8220;tragedy&#8221;\u00a0<em>bind<\/em> observations\u00a0and offer new ways of assessing similarity and difference. They also force us to leave things behind.<\/p>\n<p>The analogy between a mathematical change of basis and the application of critical\u00a0terms might even help explain what we do to\u00a0our colleagues in the\u00a0natural and data sciences.\u00a0Like someone using a\u00a0transformation matrix to\u00a0re-project data, the humanist\u00a0introduces powerful critical terms in order to\u00a0shift observation, drawing\u00a0some of the things\u00a0we study closer together while\u00a0pushing others further apart. Such a\u00a0transformation or change of basis can be accomplished in\u00a0natural language with the aid of field-structuring\u00a0analogies or critical examples. Think of the perspective opened up\u00a0by Clifford Geertz&#8217;s notion of &#8220;deep play,&#8221; or his example of the Balinese cock fight, for example. We are also adept at making\u00a0comparisons that turn examples\u00a0into the bases of new critical taxonomies. 
Consider how the\u00a0following sentence reorients a humanist\u00a0problem space: &#8220;<em>Hamlet<\/em>\u00a0refines certain\u00a0tragic elements in\u00a0<em>The Spanish Tragedy<\/em>\u00a0and thus becomes a representative example of the genre.&#8221;<\/p>\n<p>For centuries, humanists have done these things\u00a0<em>without<\/em> the aid of linear algebra, even if\u00a0matrix multiplication and orthogonal projection now produce parallel results. In each case, the researcher seeks to replace\u00a0what Shlens calls a &#8220;naive basis&#8221; with a <em>motivated<\/em> one, a\u00a0projection that maps distances in\u00a0a new and powerful way.<\/p>\n<p>Consider, as a final case study in\u00a0projection,\u00a0the famous speech\u00a0of Shakespeare&#8217;s Jacques, who begins his\u00a0<a href=\"http:\/\/www.folgerdigitaltexts.org\/?chapter=5&amp;play=AYL&amp;loc=line-2.7.146\" target=\"_blank\">Seven Ages of Man speech<\/a> with the following\u00a0orienting move:\u00a0&#8220;All the world&#8217;s a stage, \/ And all the men and women merely players.&#8221; With this analogy, Jacques\u00a0calls attention to a\u00a0key dynamic of the\u00a0social system that makes\u00a0Shakespeare&#8217;s profession possible \u2014 the fact of pervasive play. Once he has provided that\u00a0frame, the\u00a0ordered list of life\u00a0roles falls neatly into place.<\/p>\n<p>This ability to frame an analogy or find an orienting concept \u2014the world is\u00a0a\u00a0stage, comedy is a pastoral retreat, tragedy is a fall from a great height, nature is a book\u00a0\u2014 is\u00a0something fundamental to\u00a0humanities thinking, yet it\u00a0is necessary for all\u00a0kinds of inquiry. Improvising on a theme from Giambattista Vico, the intellectual historian Hans Blumenberg made this point in his work on foundational analogies\u00a0that inspire\u00a0conceptual\u00a0systems, for example\u00a0the Stoic theater of the universe or the serene Lucretian spectator looking out on a\u00a0disaster at sea. 
In a number of powerful\u00a0studies \u2014\u00a0<a href=\"https:\/\/mitpress.mit.edu\/books\/shipwreck-spectator\">Shipwreck with Spectator<\/a>,\u00a0<a href=\"http:\/\/www.amazon.com\/Paradigms-Metaphorology-Signale-Letters-Cultures\/dp\/0801449251\">Paradigms for a Metaphorology<\/a>, <a href=\"http:\/\/www.sup.org\/books\/title\/?id=935\" target=\"_blank\">Care Crosses the River<\/a>\u00a0\u2014 Blumenberg shows how analogies such as these\u00a0come to\u00a0define entire intellectual systems; they even open\u00a0those systems to sudden reorientation.<\/p>\n<p>We certainly need to think more about why mathematics might allow us to appreciate unseen dynamics in social systems, and how critical terms in the humanities allow us to communicate more deliberately about our experiences. How startling\u00a0that two very different\u00a0kinds of fiction \u2014 a formal conceit\u00a0of calculation and the enabling, partial slant of\u00a0analogy\u00a0\u2014 help us find our way among the things we study. Perhaps this should not be surprising. As artifacts, texts and other cultural forms are\u00a0<a href=\"http:\/\/winedarksea.org\/?p=926\">staggeringly complex<\/a>.<\/p>\n<p>I am\u00a0confident that humanists will continue to seek\u00a0alternative\u00a0views on\u00a0the complexity of what we\u00a0study. I am equally confident that our encounters with\u00a0that complexity will remain partial. 
By nature, analogies and computational artifacts obscure some things in order to\u00a0reveal other things:\u00a0the motivation of each is\u00a0expressed in\u00a0such tradeoffs.\u00a0And if\u00a0there\u00a0is no unmotivated view on the data, the true dynamics of the cultural systems we study will always withdraw, somewhat,\u00a0from\u00a0the lamplight\u00a0of our\u00a0descriptive\u00a0fictions.<\/p><\/div>\n<p>&nbsp;<\/p>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>It&#8217;s hard to conceive of distance measured in\u00a0anything other than a straight line. The biplot below, for example, shows the scores of Shakespeare&#8217;s plays on\u00a0the two Docuscope LATs discussed in the previous post, FirstPerson and AbstractConcepts: Plotting\u00a0the items in\u00a0two\u00a0dimensions gives the viewer some general sense of the shape of the data. &#8220;There are more items [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[24,8],"tags":[194,197,149,192,195,184,196,193],"class_list":["post-2271","post","type-post","status-publish","format-standard","hentry","category-quant-theory","category-shakespeare","tag-change-of-basis","tag-hans-blumenberg","tag-humanities","tag-jonathon-shlens","tag-literary-concepts","tag-pca","tag-principal-component-analysis","tag-projection"],"_links":{"self":[{"href":"https:\/\/winedarksea.org\/index.php?rest_route=\/wp\/v2\/posts\/2271","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/winedarksea.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/winedarksea.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/winedarksea.org\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/winedarksea.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=
2271"}],"version-history":[{"count":88,"href":"https:\/\/winedarksea.org\/index.php?rest_route=\/wp\/v2\/posts\/2271\/revisions"}],"predecessor-version":[{"id":2617,"href":"https:\/\/winedarksea.org\/index.php?rest_route=\/wp\/v2\/posts\/2271\/revisions\/2617"}],"wp:attachment":[{"href":"https:\/\/winedarksea.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=2271"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/winedarksea.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=2271"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/winedarksea.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=2271"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}