We had a request for a clearer version of the image we discussed in our post last year, which shows changes in the catalogued subjects of Library of Congress books over the course of several hundred years. Jon Orwant from Google was kind enough to send an updated image, which we’re sharing here. His description of the visualization follows.
“The visualization…was derived exclusively from the metadata feed of the Library of Congress. The LoC catalog doesn’t restrict itself to American or even English language books, but likely does have some sample bias (as every union catalog does).”
One Comment
If this is data from the Library of Congress, I’m going to stick to the interpretation that the massive spikes at wartime are overwhelmingly the result of their filing conventions and decisions about what to save/acquire, rather than about the history of publishing.
It’s hard to search the LOC catalog by year range and classification (at least, I can’t figure out how), but the LOC has about 140 entries from the 1640s in the subject heading “Great Britain–History–Civil War, 1642-1649–pamphlets” that are all filed under “DA”, the history category I’m pretty sure is spiking here. (Example pamphlet) Of the 150 books with the keyword ‘history’ published in the 1640s in the LOC’s catalog, all but one of those in the Ds are about the 1640s, primarily political pamphlets. The one ‘history’ match in the Ds that isn’t a civil war pamphlet, weirdly enough, is about the future, not the past. I haven’t checked the other spikes, but last year I saw similar patterns with pamphlets from the Civil War.
Basically just a case of the title promising too much, I think: LOC holdings are a reasonable (?) proxy for “all books published” in the 20th century, but before then almost all of the signal may be about the particular library or about library curation generally, not about what people actually read or published.