Tag: genre

  • Penalty Kicks and Distributed Movement

    Gabriel Dias, graduate student at RPI, has recently modeled the way in which penalty kickers move their bodies as they prepare for a shot. His findings suggest that there are several “tells” – for example, the angle of the hips, or the position of the planted foot – which predict the ultimate direction of the shot. In the PBS interview that I’ve linked to above, he alludes to the existence of “distributed” movements which show the physical commitment of the kicker to one outcome or another. I hear the word distributed and I immediately think, “integrated physical system,” like a body that is constrained to do certain things because of the way its different parts interact. We see this integration in the competitive world of athletics and the expressive realm of dance. (Perhaps the adjectives here should be reversed?)

    In our analysis of texts, we have also found distributed movements of a sort. We find, that is, that certain types of words tend to move with each other in some genres, and others move away from one another. Does this mean that genre is a physical system like penalty kicking, and that our explanations of these distributed movements – of words rather than points on a body – are themselves grounded in a physical reality? I have myself offered analogies to describe this “commitment of weight” in the process of using words to do certain things: if you want to write a Shakespearean comedy, there are certain things you are likely to do: you will tend to use more first and second person singular pronouns and less description than you would in, say, a history play. If Docuscope is the goalie/keeper, it may need only 30 or 40 lines to decide that the ball is going to go toward comedy rather than history. Other things will be ignored as incidental. If I say that tagging a play and watching how its “points” move in a mathematical space is like biotagging a kicker and studying his or her movements, I am proposing an analogy. Like kicking, writing is a behavior. In certain situations (penalty kicking, writing for the stage), some aspects of this behavior are signal or cardinal – position of hips, use of pronouns – while others are inessential, like the curve of the kicker’s index finger. (Actually, given the dynamism of the human body, I would be surprised to find out that there is not, on some level, a connection between finger position and kick.)

    So, does this mean I am advocating an essentially structuralist account of genre? Am I saying that, because language use is a behavior, then writing in a particular genre is also a behavior with certain “tells” that are, in a sense, built into the physical system of writing? I think people who are doing iterative criticism need to have an intelligent answer to this question, complete with an analysis of its underlying analogy. My answer would be that writing fiction in a historically bound literary field does, like penalty kicking, count as a behavior and that such behaviors will exhibit coordination. There is as much connective tissue in language, grammar, plot and audience expectation as there is in the fabric of the human body. But this is not the same thing as saying that there is an essential structure to particular types of writing – that the existence of a tell implies an underlying recipe, essence or structure that is genetically dictating the behavior of the writer.

    Why doesn’t structuralism follow from linguistic integration? First, writing is not like penalty kicking. Dias chose penalty kicking because it has a binary physical outcome. With respect to the standing keeper, the ball goes left or right. Language, on the other hand, is like a flock of birds: it can break any way, 360 degrees, and is doing so dynamically at all times. “Yet the flock shows direction,” you say. “Individual birds may be wobbling left and right, up or down, but there is a recognizable trajectory within the group.” Perhaps there are deterministic ways of saying where this group is going to go next, but I doubt it. The total behavior is distributed, immanent: it has massive integrity as an aggregate, but the existence of that integrity does not imply some non-negotiable locus of control. Another way of saying this, and now I am channeling Whitehead, is to say that the direction of the flock is a continuously unfolding event or “society” of actual occasions. Thus, the penalty kicking example is good for showing entailment and distributed connection in the elements of literary linguistic analysis, but bad as a model for the errant and multiple trajectories of writing.

    The existence of the tell essentially pushes back the timeline of intelligibility of the direction of the ball. A good keeper or student of physiology – like a good literary critic – will know earlier than most what kind of behavior is being exhibited. But unlike a keeper in a football game, the critic is not looking for a binary outcome. Rather, the critic or spectator is comparing the unfolding action onstage to any number of possible theatrical “types” of entertainment and generic conventions. Shakespeare takes five penalty shots at a time, all the time. If you are interested in this aspect of the play – its participation in comic conventions – yes, there will be “signal” or orienting linguistic events at the level of the line which you could consult to predict what he is about to do. But you don’t have to consult the tells and this is not a penalty kick: you already know what is going on and, indeed, are a better judge of the texture and generic tonalities of the play as it unfolds than a keeper who has to wait for the ball to be kicked. (Docuscope really is a keeper; it knows nothing until the event happens.) As we have seen in our research, human beings are massively sensitive to variations and distributed cues in linguistic behavior. We make an astonishing number of connections between the kinds of variation we see among the plays and texts we have encountered. Finding out that there is a linguistic “tell” for comedy doesn’t then mean that comedy essentially or structurally “is” the series of tells we reliably find for it. The “tells” here are a parallel description — and this, after the fact — of a perceptual reality that we render qualitatively and immediately, in our feel for certain types of writing or stories.

    I have used the words “signal,” “cardinal” and “orienting” to describe the types of tokens that serve as good landmarks for genre in this alternative descriptive universe. I do not use “essential.” As we work further through this analogy between physical and linguistic behaviors, I think we should adopt Spinoza’s metaphysical position from the Ethics, that there is a parallelism between the twin domains of thought and extended physical beings. Neither has priority. When understood as a species of behavior, theatrical writing or literary production must obviously exhibit certain empirical regularities: it takes place on the fleshy platform of human consciousness and is constrained by the physical limits of our bodies, environment and history. As a critic, I would want to insist that no material factor – the practices and limitations of stagecraft, the documented or remembered history of past performances, the politically charged distribution of resources and cultural actors – can be a priori excluded as inexpressible in the behavior that is writing. All constraints are summed and expressed, but in different amounts. But I would also want to insist that – whatever the behavior is that we are tracking – there has to be in place a certain set of agreements to make sense of the “movements” in this system as such. I have to want to count “these types” of words and not those. I have to search for significant coordination of these counted things with respect to “this type of outcome” and not another. Someone has to have the desire to study penalty kicks, for example, or authorship, or genre: behaviors don’t simply want to study themselves.

    The tell is a “sign” that speaks for the kicker, and speaks early. It is a signal event worth attending to if you are a keeper. It is simultaneously an element in a causal sequence, constrained by events prior to it, and a negotiable sign or expression of an intention to do something. It is a physical way of saying, “I mean to kick the ball this way.” The point of the parallelism is that you never get to dump one half of the phenomenon. Leaning to the left, we acknowledge: all physical tells may be redescribed as expressions of an intention, and so tokens of meaning. But inclining to the right, we say: all tokens of meaning are, on some level, also indexes of empirical constraints. The keeper has to dive both ways.

  • Presentation at London Forum for Authorship Studies/Digital Text and Scholarship Seminar

    Jonathan Hope and I presented here in London on a trip arranged by Brian Vickers and Willard McCarty. It was a lovely occasion held in Senate House, attended by some we knew and others we got to know. We began by rolling out paper copies – six-foot-long scrolls! – of the very large diagram that you saw in the last post. One of the things we have begun to discuss is the ways in which different forces seem to be expressed on various twigs of this dendrogram illustrating relationships among 318 early modern plays. On some twigs, everything that is being grouped together has a common author. On others, the situation is not so clear. Why, for example, aren’t there large groupings of texts written at the same time? (There are some smaller clusters of these.) The principle at work here, when texts are matched in terms of their distance scores on all of Docuscope’s available features (LATs), is that every type of difference present in the population being studied will be expressed in the result. The difficulty is disentangling which type of difference – generational, authorial, generic, company, etc. – is at work in a given grouping.

    One thing we spent some time discussing yesterday was three clusters in which Jonson’s plays appear. Here they are below:

    All of Jonson’s masques are clustered at the bottom of the diagram (except Cynthia’s Revels, which is clustered in the middle). These are possibly the most distinct items in the entire corpus we are currently working with. Notice how far right the cluster extends before joining with the rest of the diagram: this indicates its dissimilarity with other clusters. But notice too that, within this cluster (as Jonathan pointed out yesterday), there is also a lot of variation. Not only are Jonson’s masques very different from the rest of Renaissance drama (including several interludes), but they are quite different from one another. It’s like a galaxy that is far away from all of the others, but whose stars are themselves quite spread out.

    So, what about the other two clusters? We decided to profile all three and came up with some interesting findings. First, the masques. After performing PCA and then rating the clusters on the different components, we found several that were quite good at isolating the items on particular twigs. (This is not a scientific procedure, but it is our first attempt.) With the masques, we found that the language is high in StandardsPositive, StandardsNegative, and ReportingStates. Here’s an exemplary passage, with both StandardsPositive and StandardsNegative in green, and ReportingStates in purple:

    Masques describe what you are seeing or have just seen in a comparatively static fashion, hence the reporting states. As Brian Vickers pointed out in the question period, the genre of encomium deals with praise and blame, which are the words that are being picked up in the positive and negative standards.

    Compare this, now, to the profile of some of Jonson’s other comedies: Poetaster, Volpone, and the other items in the top group. These items are characterized by OralElement (yellow), Question (blue), Intensity (orange), and PersonProperty (purple):

    Here we see a pattern we also saw in Shakespearean comedy: a lot of items associated with one to one interaction. The OralElement here marks the bustle of persons whose social function is marked (PersonProperty) and who are mixing in a state where contact must be established or maintained. Some of the satirical force of the scene is bundled into the intensity strings, which show the emphatic nature of certain social performances that are mannered and so open to mockery. We noticed these intensity strings in Middleton as well, which makes us suspect that a combination of PersonProperty strings and intensity might be a feature of City Comedy. Something to check out in the future.

    What makes this top cluster different from the second? Different author? No. Different genre? Not really, at least, not according to the ones we recognize critically. And note too that there are multiple authors on this middle cluster: Chapman, Jonson and Fletcher. Perhaps we should be thinking in terms of modes instead of genres: is there a different mode of storytelling, dramaturgy, or conducting comic business here? When we use PCA to characterize this cluster and compare the results with those that characterize the one at top, we find similarity and difference. What’s similar is the OralElement (yellow), Question (blue), and PersonProperty strings (purple):

    But we now see strings associated with TimeShift (scarlet), which indicate that a person is marking the difference between two temporal frames (then/now, now/future), and here seems to be associated with figuring out what someone might do or bring about in the present or near future. Here they are anticipatory, looking at what is to come from the standpoint of the present. (In Shakespeare’s late plays, by contrast, we found that action from the past is frequently narrated from the standpoint of the present.) The other thing that is different in this cluster is something that we would never see, because it is not there. The plays in this cluster lack something:

    What they lack are these purple strings, which are classed as ReportingStates. They are tokens that occur frequently in this text – look at how many of them are in this play, which is from the second cluster – but as a whole the plays in this group lack these strings with respect to the larger population of early modern drama (whereas the top group did not). This kind of relative difference between generally quite frequent items is one that you could probably only grasp with the aid of statistics. We hypothesize that these strings allow the actors to report action that has taken place offstage in the past, and that their relative scarcity here keeps attention focused on the present, which is hurtling forward in time. Should this be its own subgenre of Jonson that includes Fletcher and Chapman? Would it be worth naming a grouping like this? Another question for further study.
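
    The kind of relative difference described above can be sketched in a few lines. This is not the PCA procedure we used; it is a simpler, per-feature comparison, and every number and play name in it is invented for illustration. It asks how far a cluster’s average rate for one category sits from the population average, measured in population standard deviations.

```python
import statistics

# Hypothetical ReportingStates rates per play (made-up numbers, not Docuscope output).
population = {
    "play_1": 4.1, "play_2": 3.8, "play_3": 4.4, "play_4": 4.0,
    "play_5": 3.1, "play_6": 3.0, "play_7": 4.3, "play_8": 3.9,
}
cluster = ["play_5", "play_6"]  # hypothetical members of one twig


def cluster_offset(rates, members):
    """Distance of the cluster mean from the population mean, in population SDs."""
    mu = statistics.mean(rates.values())
    sd = statistics.pstdev(rates.values())
    cluster_mean = statistics.mean(rates[m] for m in members)
    return (cluster_mean - mu) / sd


offset = cluster_offset(population, cluster)
print(f"cluster sits {offset:+.2f} SDs from the population mean")
```

    A negative offset means the cluster uses the category less than the population as a whole, even though every play in it may still contain many such strings in absolute terms.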

    We received some terrific comments and questions. To our comment that the first Principal Component for this population does seem to track a broad and evolving temporal shift (plays score lower on the component as time goes on), Richard Proudfoot asked if there was more variation in the very early plays in our collection. This is indeed the case, and he followed with the point that we have an uncertain grip on this earlier population because little of it survives. Other explanations for wider variation in the pre-1590 items: English as a language is more fluid prior to 1600, as Jonathan pointed out. It may also be the case that the genre system itself has not stabilized because the professional theater is still gaining its footing in London.

    Erica Fudge asked another interesting question: some of the comic strings associated with interaction and comedy (we showed our Shakespeare comedy results) reminded her of the writing in Montaigne. What, she asked, is the relationship between skepticism and comedy, and would we be interested in tracing the presence of something like a skeptical inclination across prose writing and drama? This is a very good question. I would hope that we could study, with these techniques, something like the “sentence level intellectual culture” of the period, one that extends across genres like drama and the essay. As with most of our presentations, we left with more questions and ideas about future experiments. This work seems to us to be provisional in a way that other humanities research is not. You get an idea, talk about it with others, try it, and then decide to try something else. Academic papers at humanities conferences, on the other hand, usually present findings with an air of categorical certainty. And yet, we know that when human beings are involved, all findings are provisional. Odd.

  • Docuscope Goes Live on Shakespeare Quarterly Open Peer Review

    Jonathan Hope and I have written a new piece that we submitted to the special issue of Shakespeare Quarterly on “Shakespeare and New Media.” The essay cleared the first stage of editorial review, and is now posted at MediaCommons for general comment and critique prior to final editorial evaluation. Please visit the essay here and make your views known. The abstract and title are as follows:

    “The Hundredth Psalm to the Tune of ‘Green Sleeves’”: Digital Approaches to Shakespeare’s Language of Genre
    In this essay, we explore the underlying linguistic matrix of Shakespeare’s dramatic genres using multivariate statistics and a text tagging device known as Docuscope, a hand-curated corpus of several million English words (and strings of words) that have been sorted into grammatical, semantic and rhetorical categories. Taking Heminges and Condell’s designations of the Folio plays as comedies, histories and tragedies as our starting point, we offer a portrait of Shakespearean genre at the level of the sentence, showing how an identification of frequently iterated combinations of words (either in their presence or absence) can allow us to appreciate the integrity and fluidity of Shakespeare’s genres in new ways. Calling this approach “iterative criticism,” we situate our critical practice in the context of both Shakespearean criticism and more general protocols of reading in the humanities, concluding with a genre map of Shakespeare’s plays in the context of 282 other early modern plays.

    As the last line suggests, we have now managed – with the help of Martin Mueller at Northwestern – to produce an analysis of plays written between 1519 and 1659: 282 from the TCP database alongside the Moby Shakespeare. I think this is the first visualization of its kind purporting to treat 150 years of Renaissance drama, which itself feels like something of a hurdle overcome. Here it is:

    Dendrogram Produced using Ward’s clustering method on scaled data using 99 LATs to profile 318 plays written between 1519-1659, color coded by genre and separating out the works of Shakespeare as a category of their own: Red=Comedy, Blue=Interlude, Green=History, Cyan=Tragedy, Purple=Tragicomedy, Orange=Masque, Gold=Shakespeare. The item names follow the protocol: (genre)-(date)-(author)-(title).

    Two points to make here, although there could be many more. First, this diagram was constructed using scaled data, which means that the “mile away” linguistic markers of similarity and dissimilarity are being balanced with markers whose variation is less visible from a distance. Variables with large standard deviations are not dominating with respect to those with smaller ones. Note then that most of Shakespeare’s works cluster together here, comedies, tragedies and late plays all on the same twig. When I tried this analysis using non-scaled data, these genres split up and Shakespeare’s comedies clustered together with Jonson’s, suggesting that Ward’s clustering procedure on unscaled data is better for picking up genre differences, while the same procedure conducted on scaled data (as is the case here) is more sensitive to authorship. (For an earlier analysis of Shakespeare’s plays only using scaled data with Ward’s clustering technique, see this.) This finding should be tested in other contexts and with other data sets, but it is interesting, since it suggests that authorship becomes legible when fluctuations in variables that contain lots of tokens (say, Description) are coordinated with those that have many fewer tokens. It may be this “adding a dash of something” that pulls the author as such to the fore in an analysis.
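
    To see why scaling matters, here is a minimal sketch using invented numbers rather than real LAT counts. It does not run Ward’s clustering; it only computes the Euclidean distances that a procedure of that kind starts from, once on raw values and once after each variable has been rescaled to mean 0 and standard deviation 1. The play names, the two features and their values are all hypothetical.

```python
import math
import statistics

# Hypothetical rates for two made-up features (not real Docuscope output).
# Column 0 behaves like a frequent, widely varying category (say, Description);
# column 1 behaves like a category with far fewer tokens and little variation.
plays = {
    "play_A": [50.0, 1.0],
    "play_B": [53.0, 3.0],
    "play_C": [58.0, 1.1],
    "play_D": [80.0, 2.0],
    "play_E": [20.0, 2.0],
}


def zscore_columns(rows):
    """Rescale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def nearest(name, table):
    """Return the other play closest to `name` by Euclidean distance."""
    return min(
        (other for other in table if other != name),
        key=lambda other: math.dist(table[name], table[other]),
    )


scaled = dict(zip(plays, zscore_columns(list(plays.values()))))

print(nearest("play_A", plays))   # raw data: the high-variance column decides
print(nearest("play_A", scaled))  # scaled data: the small column now has a say
```

    On the raw numbers play_A is closest to play_B, because only the large column contributes much to the distance. After scaling it is closest to play_C, which matches it on the small column. That is the sense in which variables with large standard deviations stop dominating.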

    I’d like also to offer another observation here about the fact that so many Shakespeare plays are hanging together (as are Shirley’s and Middleton’s), remaining agnostic for the time being about whether it is authorship or genre that is producing these clusterings. The majority of Shakespeare’s plays are clustering on a twig that contains mostly comedies. So when compared with 282 other items written between 1519-1659, Shakespeare’s plays look for the most part like plays that Harbage (in the Annals of English Drama) classed as comedies as opposed to some other genre. (Martin tells me that he followed Harbage for the most part, but made some guesses himself about genre designations based on title page information and common sense.) The thing to remember here is that an individual genre may cluster in different ways depending upon the larger population in which it is situated. That is, a fuller collection of texts from the period–not just the ones that Martin was able to modernize so that we could run a test on them–might show new subdivisions that end up splitting the Shakespeare block into a number of smaller splinters. (Or it may not: this may be a stabilized portrait, more or less.) The best way to understand more about the groupings themselves is to begin looking at them with the help of PCA and other techniques we’ve been using already. That’s where we’re headed next.

  • Rhythm Quants: Burial, Click Tracks, Genre Tempo

    Graham has posted a new video by one of my favorite artists over at Object Oriented Philosophy. Burial is a London DJ whose work often gets filed under the label “dubstep,” a variety of post-house electronica that appeared several years ago. I like dubstep a lot, and this video actually captures something of its unsteady, city-worn appeal:

    One of the greatest things about Burial is that his beats are asymmetrical. That is, in a world where you can loop beats in such a way that the “ictus” (ideal musical point where the beat falls) is evenly distributed across the entire snippet, Burial’s beats sway a bit from tempo and then rejoin when the loop starts over again. I tend to hear this because I am a drummer, and was trained to play in the 1980s, just when drum machines were becoming more common in live performance and studio recording. Those of us who learned to play in this period were forced to synch our bodies (and eventually, minds) to a mathematically precise representation of the ictus – one that is produced by a machine – so that our own playing would match up with that of others who were similarly keyed into this “reference beat.” Most often, that reference beat would be calling the changes in synthesizer parts (which were electronically triggered by that reference): so the whole band, or the band in the recording studio, would ideally be vibrating to the same periodic oscillation, one that never changed unless the beat frequency was altered by the programmer or producer.

    But of course, drumming is more fluid than this kind of matching to the mathematical ictus. Most dance music — music that people actually dance to — has subtle movements ahead of and behind the beat. This occurs in part to create musical tension, but also to whip dancers around in the right way. (Our bodies may exhibit symmetry, but our dance steps do not.) The most extreme versions of this kind of dance-wobble that I have witnessed, although not directly related to drumming, occur in European music. Hearing an orchestra play Strauss in Vienna, I was initiated into something that the Viennese take for granted: Strauss rushes the 1-2 in the 1-2-3 of waltz tempo, which means that you get a one-two…three, one-two….three in which the second beat does not evenly divide the first from the third. Hungarian and Romanian folk music has some of this as well. I remember being at a dancehouse in Budapest in the eighties and hearing a Roma folk band play, and was amazed at the quick surges and retards in the tempo, occurring at every measure. This variation, I was told, helped the dancers whip each other around so that their bodies could lean at the appropriate moment: a really beautiful idea, since it suggests that the music itself was conforming to the movements and weightings of the dance — even at the level of tempo.

    If you look at the beginning of the Burial video, you can see the idea of symmetry taken apart on the screen, as the diagonals display action in a kind of dance-box. Movements and pans in and out of the paired boxes do not occur at the same speed, which means that you get the same kind of staggered synchrony that often occurs in Burial’s musical beats, but here it occurs visually.

    I suspect that a good studio engineer could actually quantify the ways in which Burial’s beats redistribute the ictus on a measure by measure basis, something that was once done by drummers who were not playing to a “click” or mechanically measured metronome, but perhaps more intuitively and communally. That’s not to say that Burial has recaptured the “fluid” nature of the beat or that the electronic metronome killed the beat (and that Burial is bringing it back). It’s not that simple. Rather, drummers have always had a good sense of what the “ictus” is and have manipulated it implicitly by speeding up and slowing down before the beginnings and endings of measures. In a pre-click track world – listen, for example, to some of the beats by The Meters – you wouldn’t necessarily notice the manipulations, because the world had not yet learned to “hear” the absent click, which happens once music everywhere is keyed to an inaudible metrical yardstick. I would say that this was the case by the early nineties. But once this implicit beat becomes part of the music – part of the bodies and ears of drummers and listeners alike – the tempo pushes and pulls are audible as deliberate. The drummers Manu Katché and Omar Hakim have made an art form out of this over the last two decades. I’m sure both of them can play to a click track (or not) in their sleep.

    The point here is that human beings are exquisitely sensitive to quantitative phenomena like rhythm, and they can also have their background perceptions of what “proper rhythm” is shaped by the music they encounter. There is a backbeat or hidden track to music that is cultural, but that is confirmed or shifted with each performance. I suspect genre works in the same way — as a set of constantly shaped expectations — and that in some cases tempo has been keyed to certain arbitrary or regular standards in order to create particular effects. Serialization might be one version of this (something my colleague Susan Bernstein and I are working on), or the partitioning of plot around commercials.

  • The Musical Mood of the Country

    This morning the New York Times published a story about a group of mathematicians who are counting types of words in popular songs in order to get a handle on something like the mood of the country. In trying to data-mine mood, they do what all people who count things do: move from something that you can quantify empirically to something that you can’t. We do this as well when we move from “types of words” or Docuscope strings in Shakespeare plays to “genre.” The strings are empirically countable – they are either there in an established corpus or they aren’t – but one must argue for any connection between what is counted and what such counts represent (genre, mood, etc.). The point I have tried to make on this blog is that the connection is interpretive, and so relies on the hermeneutic skills of the one proposing the link.

    In the abstract for the paper, recently published in the Journal of Happiness Studies, they write that: “Among a number of observations, we find that the happiness of song lyrics trends downward from the 1960s to the mid 1990s while remaining stable within genres, and that the happiness of blogs has steadily increased from 2005 to 2009, exhibiting a striking rise and fall with blogger age and distance from the Earth’s equator.” This is an interesting finding, particularly the part about blogger age and distance from the equator. One of the selling-points of their analysis is that the data they have obtained is voluntarily supplied, and so perhaps less subject to the social pressures that accompany surveying. I would want to know, on this score, whether a song-title (for example) is subject to other types of pressures. For example, the songwriter is not just “reporting” an inner state by naming a song in a particular way — take the Ramones song, “I Wanna Be Sedated” for example — but offering this title to an audience. Song-names are rhetorical, and so subject to a different set of pressures than “reporting.” There is another kind of self-interference here that doesn’t seem to be taken into account.

    One of the lead researchers on the paper, Peter Sheridan Dodds, argues that data supplied voluntarily on the web can serve as a kind of “remote sensor of well-being.” (I remember hearing similar arguments made about baby names a while back; you don’t have to pay for them and they’re important: therefore they are a good measure of national feeling and trends.) For example, teenagers appear to be the least happy because they more frequently use words such as “sick,” “hate” and “stupid.” Wouldn’t it be more interesting to track how the use of these words (or absence of them) compares to groups of populations that teenagers themselves describe as “unhappy?” My inclination here would be to use data-mining techniques to assay and re-describe classifications made by a given social group in terms that they may not necessarily be aware of. Then the factual claim would be: when teenagers describe someone as happy, that person is x% less likely to use words like “sick,” “hate” and “stupid.”
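
    A factual claim of that form is easy to compute once someone has supplied the labels. The sketch below uses invented sample texts and a deliberately crude tokenizer; the only words taken from the article are the three markers. Which texts count as “happy” is assumed to come from a human judgment, not from the counting.

```python
import re

MARKERS = {"sick", "hate", "stupid"}  # the words quoted in the article


def marker_rate(texts):
    """Marker words per 1,000 tokens across a group of texts."""
    tokens = [t for text in texts for t in re.findall(r"[a-z']+", text.lower())]
    hits = sum(1 for t in tokens if t in MARKERS)
    return 1000 * hits / len(tokens)


# Invented examples; the labels stand in for a judgment made by some group.
labeled_happy = ["we love this song", "i was sick but the show was good"]
labeled_unhappy = ["i hate this stupid town", "so sick of it all"]

happy = marker_rate(labeled_happy)
unhappy = marker_rate(labeled_unhappy)

# The "x%" in the claim: how much lower the rate is in the group labeled happy.
# (This assumes the unhappy rate is not zero.)
percent_lower = 100 * (unhappy - happy) / unhappy
print(f"{percent_lower:.0f}% lower")
```

    The arithmetic is trivial. The work lies in deciding who gets to assign the labels, which is the point at issue.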

    I can imagine the authors of the Music-Mood study making the following set of claims:

    Claim 1) Research on web-logs, lyrics and other sources of expression shows that words like “sick,” “hate” and “stupid” occur more frequently in a representative group of works by teenagers. This would be the empirical claim.

    Claim 2) People who are experiencing a mood such as “well-being” are less likely to mention words like “sick,” “hate,” and “stupid” in unprompted work such as songwriting or blogging. This is an interpretive claim that must be argued for.

    Claim 3) Teenagers are less likely than others to be experiencing a mood of well-being. This is logically true if you accept 1 and 2.

    Now, what’s interesting about 2 – the interpretive claim – is that it could be made without numbers. In a sense, you either believe this or you don’t. Which raises the question, what exactly are the numerical claims doing in this argument? What if claim 2 is “kind of true,” or “true only among certain people”? Would this mean that “kind of a lot” of teenagers are unhappy?

    I would be more comfortable saying that teenagers use more of the following words (“hate,” “stupid”), and that a close look at the contexts in which they use them (which can never be comprehensive) suggests that their use is connected to mood in the following way (e.g., their use allows teenagers to gain social attention by citing negative emotions, their use indicates depression, their use indexes the presence of Goth subculture, etc.). But I would want to know how the words are used rather than simply making inferences from the fact that they occur. The counter-argument here is that the law of large numbers guarantees that even if there is a wide variation of uses of the words (granting, in effect, that not all occurrences are “reports” of mood), there is nevertheless a broad enough pattern to make a generalization. Fair enough, but what numbers are you going to use to make the generalization?

    I’m all for the empirical investigation of abstract concepts like happiness, genre, authorial intent. These higher order concepts don’t come from outer space: we create them to capture some suite of characteristics we find in reality or in ourselves. But the Music-Mood analysis lacks a crucial ingredient: an explicit human judgment about the classes that are being measured by the tokens that are being counted. Unless you make that judgment explicit – saying something like “x% of people who experience what persons y and z would describe as ‘well-being’ also produce unprompted work containing these words” – you are really just saying that “a lot” of people who we think are happy do this.

    Naming something with a word is a way of creating a class of things (as long as that word is not a proper name), and it is classes of things that are correlated quantitatively using statistics: quantities of classes of words in classes of works, for example. In any such analysis, the classes themselves cannot be derived empirically. They have to be specified in advance by appealing to experience, common sense, expertise, or the like. What troubles me about the Musical Mood analysis here is that the rationale for membership in the class of words indicating “well-being” is not spelled out, and perhaps never could be. I would rather ask someone – an expert? a teenager? – to name people who experience well-being and then do one of Matt Jockers’ most-frequent-word analyses on their lyrics or blogs in order to get at the underlying pattern. It’s fine to begin with a set of words whose occurrence indicates (to you) a feeling of well-being, but without knowing quantitatively how indicative they are, the numbers are just another kind of adjective. You might as well read a bunch of web pages and decide for yourself.

    My guess is that you would conclude that teenagers write like teenagers rather quickly.