Making the United States Plural Again
Charles Kurzman, “Making the United States Plural Again,” Scatterplot blog, August 31, 2016. America may be divided these days, but it is hardly as divided as when the United States of America were plural. Over the past two centuries, the United States became singular, at least grammatically, according to my analysis of verbs in the Congressional Record and its predecessor publications: 166 million sentences between 1789 and 1989.
Data Sources and Methods
To determine when the United States shifted from plural to singular, I examined two sets of government documents.
The first was ProQuest’s Congressional Record file, purchased by Davis Library at the University of North Carolina at Chapel Hill. This file includes the text of the Congressional Record (1873-1997) and its predecessors, Annals of Congress (1789-1824), Register of Debates (1824-1837), and Congressional Globe (1833-1873). For symmetry’s sake, I examined the first two centuries: 1789-1989.
I removed the XML codes and broke the file into sentences, using periods as delimiters, after removing periods from a variety of common abbreviations (Mr., Mrs., Ms., Sen., Rep., U.S., state names). I then removed sentences shorter than 12 characters to get a count of 166 million sentences.
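For readers who want to try something similar, here is a minimal sketch of that preprocessing step in Python. It is not the script used for this analysis. The abbreviation list is only the handful named above (state names are left out), the tag-stripping pattern is a rough assumption about what the XML looks like, and the function names are made up for illustration.

```python
import re

# Assumed abbreviation list: only the examples named in the text.
# The original list also covered state names, which are omitted here.
ABBREVIATIONS = ["Mr.", "Mrs.", "Ms.", "Sen.", "Rep.", "U.S."]
MIN_LENGTH = 12  # sentences shorter than 12 characters are dropped


def strip_xml(text):
    """Replace anything that looks like an XML tag with a space."""
    return re.sub(r"<[^>]+>", " ", text)


def split_sentences(text):
    """Split text into sentences, using periods as the only delimiter."""
    text = strip_xml(text)
    # Remove the periods inside known abbreviations so they do not end a sentence.
    for abbr in ABBREVIATIONS:
        text = text.replace(abbr, abbr.replace(".", ""))
    # Split on periods and collapse runs of whitespace in each piece.
    pieces = [" ".join(piece.split()) for piece in text.split(".")]
    return [piece for piece in pieces if len(piece) >= MIN_LENGTH]


sample = (
    "<p>Mr. Clay said the U.S. is ready. Yes. "
    "The United States are at peace with all nations.</p>"
)
sentences = split_sentences(sample)
```

In this sample, "Yes" falls below the 12-character cutoff and is discarded, leaving two sentences. Splitting on periods alone is crude (question marks and unlisted abbreviations will cause errors), but it is cheap enough to run over a very large file.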
I did not try to parse 166 million sentences grammatically. Instead, I limited my search to common verbs — is/are, do/does, and has/have — and to sentences where “the United States” or “the United States of America” directly preceded these three pairs of verbs.
In order to avoid false positives such as “The history of the United States is…”, I removed common prepositional phrases with the United States as the object (in, of, by, for, with, between, against, to, from, into, throughout, toward, through), as well as “and the United States.”
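The matching rule described above can be sketched with two regular expressions. This is an illustration under my own assumptions, not the original code: the patterns, the case-insensitive matching, and the `classify` function name are all hypothetical, and the real search may have handled edge cases differently.

```python
import re

SINGULAR_VERBS = {"is", "does", "has"}
PREPOSITIONS = [
    "in", "of", "by", "for", "with", "between", "against", "to",
    "from", "into", "throughout", "toward", "through",
]

SUBJECT = r"the United States(?: of America)?"

# Phrases to strip first: preposition (or "and") + "the United States".
EXCLUDED = re.compile(
    r"\b(?:" + "|".join(PREPOSITIONS + ["and"]) + r")\s+" + SUBJECT,
    re.IGNORECASE,
)

# What remains must have the subject directly followed by one of six verbs.
TARGET = re.compile(
    r"\b" + SUBJECT + r"\s+(is|are|do|does|has|have)\b",
    re.IGNORECASE,
)


def classify(sentence):
    """Return 'singular', 'plural', or None if the sentence does not qualify."""
    cleaned = EXCLUDED.sub(" ", sentence)
    match = TARGET.search(cleaned)
    if match is None:
        return None
    verb = match.group(1).lower()
    return "singular" if verb in SINGULAR_VERBS else "plural"


examples = [
    "The United States are at peace",                   # plural
    "The United States of America has declared",        # singular
    "The history of the United States is long",         # excluded: "of"
    "Great Britain and the United States have agreed",  # excluded: "and"
]
results = [classify(s) for s in examples]
```

Stripping the prepositional phrases before matching means a sentence like "The history of the United States is long" never reaches the verb test, which is the point of the exclusion list.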
Separately, I looked for the plural phrase “these United States,” regardless of the verb.
Plurals and singulars form a strong scissors pattern, with singulars overtaking plurals around 1870.
The second dataset I examined was State of the Union addresses, whose texts are stored at the American Presidency Project at the University of California at Santa Barbara. Since these texts are much shorter than Congressional debates, I extracted all 825 sentences that referred to the United States, after removing prepositional phrases, and eyeballed each of them to identify singular and plural references. I created a single percentage for each president.
Again, plurals and singulars form a scissors pattern, with singulars overtaking plurals in the addresses of the Rutherford Hayes administration (1877-1880) and becoming dominant in the addresses of Benjamin Harrison (1889-1892). Grover Cleveland did not use the singular at all in his first administration, before Harrison; in his second administration, after Harrison, he used the singular far more than the plural.
(The percentage dropped in the addresses of Lyndon Johnson, who referred to the United States only three times outside of a prepositional phrase, two of them singular and one plural. Gerald Ford also used the plural once, as compared with six singulars.)
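To make the per-president figure concrete, here is a small sketch of the arithmetic, using only the Johnson and Ford counts given above. It assumes the percentage is the share of singular references among all hand-coded references. The data layout and the `singular_share` function name are illustrative, not the original spreadsheet.

```python
from collections import Counter

# Hand-coded labels, one per sentence. Counts are the ones reported above:
# Johnson 2 singular and 1 plural; Ford 6 singular and 1 plural.
labels = (
    [("Lyndon Johnson", "singular")] * 2
    + [("Lyndon Johnson", "plural")] * 1
    + [("Gerald Ford", "singular")] * 6
    + [("Gerald Ford", "plural")] * 1
)


def singular_share(labels):
    """Return the percentage of singular references for each president."""
    counts = {}
    for president, label in labels:
        counts.setdefault(president, Counter())[label] += 1
    return {
        president: 100.0 * c["singular"] / (c["singular"] + c["plural"])
        for president, c in counts.items()
    }


shares = singular_share(labels)
for president, share in shares.items():
    print(f"{president}: {share:.1f}% singular")
```

With so few sentences per address, a single plural usage moves the percentage a great deal: one plural out of three puts Johnson at about 67 percent singular, while one out of seven leaves Ford at about 86 percent.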
Further checks for robustness in the future might include a search for more verbs in the Congressional Record, or a parsing of the entire corpus. Similar analyses might be done on the Presidential Papers, if these volumes are converted into plain text, or (looking beyond official documents) on all books or articles published in the United States. But someone else will have to take that on.
Update, October 6, 2017: I recently learned about Mark Liberman’s similar analysis of Supreme Court decisions, which identified a transition from plural to singular in the early 1900s; and analyses of English-language books by Ben Zimmer and Erez Aiden/Jean-Baptiste Michel, who identified the transition in the 1880s.