Visualizing English Print is currently working with a new corpus of Big Name scientific texts. The corpus contains 329 texts by 100 authors, drawn from EEBO-TCP and covering the period 1530-1724. These Big Name authors were selected on the basis of their prominence as early modern writers who address scientific subjects. The selection process had two stages: first, identifying the best-known and most influential figures of the period (e.g. Francis Bacon, Robert Boyle, Descartes); second, searching the metadata in the EEBO-TCP CSV file for key scientific terms (e.g. ‘Physics’, ‘Astronomy’, ‘Atoms’, ‘Matter’).
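The second stage, a keyword search over the metadata, might look something like the sketch below. This is only an illustration, not the project's actual script: the column names (`TCP`, `Author`, `Title`, `Terms`) and the sample rows are invented stand-ins for the real CSV file, and only the search terms come from the examples above.

```python
import csv
import io

# Search terms taken from the examples in the text, lower-cased for matching.
KEY_TERMS = ["physics", "astronomy", "atoms", "matter"]


def find_candidates(csv_file, term_column="Terms", key_terms=KEY_TERMS):
    """Return the metadata rows whose subject-term field mentions any key term.

    `term_column` is a hypothetical column name; adjust it to the real header.
    Matching is by substring, so 'matter' would also match 'matters'.
    """
    matches = []
    for row in csv.DictReader(csv_file):
        field = (row.get(term_column) or "").lower()
        if any(term in field for term in key_terms):
            matches.append(row)
    return matches


# Invented sample rows standing in for the real metadata file.
SAMPLE = """TCP,Author,Title,Terms
X0001,Author A,Sample title one,Astronomy -- Early works to 1800
X0002,Author B,Sample title two,Sermons -- 17th century
X0003,Author C,Sample title three,Matter -- Properties; Physics
"""

candidates = find_candidates(io.StringIO(SAMPLE))
for row in candidates:
    print(row["TCP"], row["Author"], row["Terms"])
```

A search of this kind only produces a list of candidates; deciding which authors count as Big Names remains an editorial judgement.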
What types of texts constitute ‘scientific texts’?
Since the ‘genres’ of early modern science were diverse (and the disciplinary boundaries rather fluid), the parameters of the corpus had to reflect this diversity. To this end, the corpus is divided into scientific subgenres (detailed below). Because of the ways in which these genres intersect, texts are assigned to subgenres based on the prominence of a particular feature of the work. For example, there are generic crossovers among texts on Mathematics, Astronomy and Instruments – Astronomy relies on geometry, and there are a number of mathematical instruments. If the texts appear to foreground Astronomy or Instruments, they are assigned to the relevant groupings. The same approach is applied to every subgenre.
With the corpus finalised, we have been running some preliminary PCA experiments to see if any interesting patterns emerge. The following PCA visualizations provide a general overview of the data, and some first impressions of how the scientific subgenres are patterning. The diagrams below were produced using JMP.
Overview
This is a PCA visualization of the complete data set of 329 texts. The variables are Docuscope’s Language Action Types (LATs); LATs with frequent zero values have been excluded.
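For readers who want to see the steps behind a plot like this, here is a minimal sketch in plain Python. It is not how the diagrams were made (those come from JMP); it only illustrates the general technique under stated assumptions: the LAT names and frequencies are invented, the 50% cut-off for "frequent zero values" is a guess, and the principal components are approximated by power iteration on the correlation matrix.

```python
import math
import random


def drop_sparse_columns(rows, names, max_zero_fraction=0.5):
    """Keep only columns whose share of zero values is at most the threshold."""
    n = len(rows)
    keep = [j for j in range(len(names))
            if sum(1 for r in rows if r[j] == 0) / n <= max_zero_fraction]
    return [[r[j] for j in keep] for r in rows], [names[j] for j in keep]


def standardize(rows):
    """Centre each column on zero and scale it to unit variance."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) or 1.0
           for j in range(m)]
    return [[(r[j] - means[j]) / sds[j] for j in range(m)] for r in rows]


def correlation_matrix(z):
    """Correlation matrix of already standardized columns."""
    n, m = len(z), len(z[0])
    return [[sum(r[a] * r[b] for r in z) / (n - 1) for b in range(m)]
            for a in range(m)]


def leading_components(matrix, k=2, iterations=500):
    """Approximate the k leading eigenpairs by power iteration with deflation."""
    m = len(matrix)
    work = [row[:] for row in matrix]
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    components = []
    for _ in range(k):
        v = [rng.random() for _ in range(m)]
        for _ in range(iterations):
            w = [sum(work[a][b] * v[b] for b in range(m)) for a in range(m)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        eigenvalue = sum(v[a] * sum(work[a][b] * v[b] for b in range(m))
                         for a in range(m))
        components.append((eigenvalue, v))
        # Remove the component just found so the next pass finds the next one.
        work = [[work[a][b] - eigenvalue * v[a] * v[b] for b in range(m)]
                for a in range(m)]
    return components


def project(z, components):
    """Score every text on each component (the coordinates of the plot)."""
    return [[sum(r[j] * v[j] for j in range(len(v))) for _, v in components]
            for r in z]


# Invented frequencies for five texts and four LATs (illustration only).
lat_names = ["AbstractConcepts", "SpaceRelation", "FirstPerson", "RareLAT"]
frequencies = [
    [4.1, 3.0, 0.5, 0.0],
    [3.8, 2.7, 0.7, 0.0],
    [1.2, 0.9, 2.9, 0.0],
    [0.9, 1.1, 3.3, 0.2],
    [2.5, 1.6, 1.8, 0.0],
]

kept_rows, kept_names = drop_sparse_columns(frequencies, lat_names)
z = standardize(kept_rows)
components = leading_components(correlation_matrix(z))
scores = project(z, components)
for pc1, pc2 in scores:
    print(f"{pc1:7.3f} {pc2:7.3f}")
```

Each printed pair is one text's position on the first two components. In practice a statistics package would do this in a line or two; the long-hand version is shown only so that each step is visible.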
Subgenres
Here is a visualization with the subgenres highlighted:
Astronomy = Blue
Mathematics = Dark Green
Instruments = Red
Physics = Lilac
Philosophy of Science = Green
Science/Religion = Lime Green
Natural History = Purple
Occultism = Orange
Medicine-Anatomy = Brown
Medicine-Disorders = Green/Brown
Medicine-Treatments = Light Blue
There is a lot of information to process in this image, but if we isolate and compare specific subgenres, clearer patterns begin to emerge. For example, this image maps Astronomy:
And this image shows the related subgenres of Mathematics (Green) and Instruments (Red):
Mathematics and Instruments appear to be grouping in the upper left of the PCA space. The LATs associated with this space include Abstract Concepts, Space Relation and MoveBody – types of language that deal with the special terminology of abstraction, and with bodies extended and moving in space. Such a result would seem to confirm what we might expect of Mathematics – a genre concerned with representing the physical world and physical processes through abstraction. Astronomy is interesting in that, while it appears to share in the traits of abstraction and space, it is also drawn into the other three quadrants. Why might this be the case?
A possible reason may be that people studied the stars and planets not just from a mathematical perspective, but from an imaginative one; astronomy is bound up with mythology and astrology. In the visualization above, one of the Astronomical outliers is marked with a triangle in the lower right of the PCA space – an area that contains LATs such as Positivity, Person Property and Personal Pronoun. The text in question is Robert Greene’s Planetomachia (1585), which blends classical mythology, religion and astrology in the form of a dramatic dialogue between the planets. The text has a high frequency of Abstract Concepts and Sense Objects, but it is drawn into this lower right quadrant by LATs such as Personal Pronoun, Person Property (formal titles, identity roles), Subjective Perception, and positive and negative language:
What we are seeing here is a division in the PCA space between broadly imaginative and instructional modes of writing.
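The quadrant readings above rest on where each LAT's loadings fall on the two plotted components. A small sketch of that bookkeeping follows. The loading values are invented (they are not JMP output) and were chosen only to mirror the placements described in the text; the sketch also assumes that the first component is the horizontal axis and the second the vertical one.

```python
def quadrant(x, y):
    """Name the quadrant of a point in a two-component PCA plot."""
    vertical = "upper" if y >= 0 else "lower"
    horizontal = "right" if x >= 0 else "left"
    return f"{vertical} {horizontal}"


def lats_by_quadrant(loadings):
    """Group LAT names by the quadrant their (PC1, PC2) loadings point to."""
    groups = {}
    for name, (pc1, pc2) in loadings.items():
        groups.setdefault(quadrant(pc1, pc2), []).append(name)
    return groups


# Invented loadings, for illustration only.
example_loadings = {
    "Abstract Concepts": (-0.42, 0.35),
    "Space Relation": (-0.38, 0.41),
    "Positivity": (0.30, -0.33),
    "Personal Pronoun": (0.44, -0.29),
}

groups = lats_by_quadrant(example_loadings)
for name, members in groups.items():
    print(name, "->", ", ".join(members))
```

Texts that score high on the LATs in a given quadrant are pulled towards that region of the plot, which is the basis for reading a quadrant as a 'mode' of writing.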
Medicine
This visualisation displays the three subgenres of Medicine: Medicine-Anatomy (Green), Medicine-Disorders (Red) and Medicine-Treatments (Blue):
The majority of the Medicine subgenres are patterning across the lower half of the PCA space. On the right, the LATs include Support (i.e. justify an argument), Responsibility (i.e. answerability for a certain state of affairs) and Reassure (words of comfort) – all of which we can imagine in a medical context. On the left, we have the LATs Recurrence (over time), Imperative and Reporting Event. Medicine-Treatments (Blue) appears to be inclining towards the left; the reason for this may be the high frequency of Reporting Events found in these texts. The Docuscope definition of Reporting Events is: ‘learning about events that may not be known yet or that can lead to the learning of new information’. In these texts, however, it reads very much like imperative language: a recipe, or the instructions and recommendations of a doctor. See, for example, this extract from Sir Kenelm Digby’s Choice and experimented receipts in physick and chirurgery (1675):
Philosophy of Science & Science/Religion
Philosophy of Science and Science/Religion are also subgenres that share common features and generic crossovers. In these subgenres we find a number of the famous early modern scientists (or scientific theorists) who are concerned with questions of methodology, morality and science as a system of knowledge – figures like Francis Bacon, René Descartes and Margaret Cavendish. Here is a visualization of Philosophy of Science:
And here is Science/Religion:
Both subgenres are drawn towards the right of the PCA space. In the upper quadrant we find LATs such as Confidence, Uncertainty, Question, Contingency and Common Authorities: types of language that may be used in the service of discursive writing. For example, here is an extract from Bacon’s Novum Organum (1676):
The common occurrence of Subjective Perception (i.e. observation that tells us as much about the perceiver as the perceived) throughout this text is also a feature of the right side of the PCA space, where we find language that deals with subjectivity, inner thought and the disclosure of personal opinion.
In the Science/Religion visualization above, an extreme outlier in the upper right quadrant is marked in dark purple. Compared to the rest of the corpus, this text scores very high on LATs such as Private Thinking, First Person, Self-Disclosure and Uncertainty. Such a result may not be so surprising when we discover that this text is Descartes’ Meditations (1680); but it does, perhaps, indicate a gulf in scientific style, whereby the inquiring ‘subject’ begins to feature almost as much as the ‘object’ of inquiry:
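One simple way to make 'scores very high compared to the rest of the corpus' concrete is to express a text's LAT frequency in standard deviations above the mean of all the other texts. The sketch below does this; it is an illustration rather than the method used for the plots, and the text identifiers, LAT names, frequencies and the threshold of 2.0 are all assumptions made up for the example.

```python
import statistics


def standout_lats(corpus, text_id, threshold=2.0):
    """List LATs on which one text lies more than `threshold` standard
    deviations above the mean of the other texts in the corpus."""
    result = {}
    for lat, value in corpus[text_id].items():
        others = [freqs[lat] for tid, freqs in corpus.items() if tid != text_id]
        mean = statistics.mean(others)
        sd = statistics.stdev(others)  # needs at least two other texts
        if sd > 0 and (value - mean) / sd > threshold:
            result[lat] = (value - mean) / sd
    return result


# Invented frequencies for four hypothetical texts (illustration only).
corpus = {
    "text_a": {"FirstPerson": 1.0, "SpaceRelation": 3.0},
    "text_b": {"FirstPerson": 1.2, "SpaceRelation": 2.8},
    "text_c": {"FirstPerson": 0.8, "SpaceRelation": 3.2},
    "text_d": {"FirstPerson": 4.0, "SpaceRelation": 3.1},
}

for lat, z in standout_lats(corpus, "text_d").items():
    print(f"{lat}: {z:.1f} standard deviations above the other texts")
```

Leaving the text itself out of the mean and standard deviation matters with extreme outliers, since an outlier would otherwise inflate the very spread it is being measured against.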
What Next?
The notion of subjectivity raises some interesting questions about the scientific texts we see drawn into the upper right quadrant, compared to the rest of the corpus. For example, does this indicate a more modern, or individualistic approach to the study of science? If so, how do we square emerging scientific ideas of objectivity and ‘matters of fact’ with the subjective perception of the individual who reports these facts?
To begin answering these questions, we aim to compare a group of texts that are commonly thought of as ‘defining’ modern scientific discourse with those that are, or were, considered archaic – namely the writings of the Royal Society and the literature of the Occult.
Watch this space.