Charles Kurzman

Crossword Cosmopolitanism

[Image: Piscop NYT special crossword puzzle, 2016-02-07]

Charles Kurzman, “What Crossword History Tells Us About the Language We Use,” New York Times, February 7, 2016. “We are more parochial than our grandparents’ generation, according to one indicator: The New York Times Crossword Puzzle. With the permission of Will Shortz, the Times’s puzzle editor, I recently downloaded all of the newspaper’s crosswords, from February 1942, when the puzzle began, through the end of 2015. I created an algorithm to search all 2,092,375 pairs of clues and answers for foreign language words and place names outside the United States.”

I want to thank Will Shortz for allowing me to download the data and agreeing to be interviewed, and Jeff Chen and Jim Horne at Xwordinfo for making the puzzle data available.

I also want to thank Josh Katz for the graphics that went along with the article (he and the Times graphics department picked these examples, not me!) and Fred Piscop for the great puzzle that was created for the piece – unfortunately, his name did not appear in the print version of the newspaper.

For those who are interested, here is how I generated the data presented in the article. To identify foreign words, I created a Python dictionary of 2,000 common words in each of four languages (French, German, Italian, and Spanish) and 1,000 words in Latin, plus words for seasons, months, days of the week, and the numbers 1-20 in various languages, as well as the names and abbreviations of languages (such as “Sp.” and “Ger.”). I removed words that overlap with two lists of English words, one of 5,000 words and one of 18,000.
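The dictionary-building step can be sketched roughly as follows. This is a minimal illustration, not the code behind the article: the function name `build_foreign_dictionary` is hypothetical, and the tiny word lists are stand-ins for the real 2,000-word and 1,000-word lists. It is written for Python 3 rather than Python 2.7.

```python
def build_foreign_dictionary(words_by_language, english_words):
    """Map each foreign word to the set of languages that list it,
    skipping any word that also appears in the English lists."""
    english = {word.lower() for word in english_words}
    dictionary = {}
    for language, words in words_by_language.items():
        for word in words:
            key = word.lower()
            if key in english:
                continue  # overlaps with English, so it would be a false positive
            dictionary.setdefault(key, set()).add(language)
    return dictionary


# Illustrative stand-in lists (assumptions, not the lists used for the article).
sample_lists = {
    "French": ["ete", "chat", "pain", "ami"],
    "Spanish": ["amigo", "casa", "pan"],
    "Latin": ["amo", "ave"],
    "Abbreviation": ["sp", "ger"],
}
sample_english = ["chat", "pain", "pan"]

foreign_words = build_foreign_dictionary(sample_lists, sample_english)
```

Storing a set of languages per word keeps a word that appears in two language lists from being counted twice while still recording both sources.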

I then used Python 2.7 to check each word in the crossword puzzle clues and answers for matches in the dictionary, and then hand-coded the results, as well as proper-noun objects of prepositional phrases. I don’t think that more sophisticated methods such as topic modeling will work for such short snippets of text, but I encourage others to try.
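A rough sketch of the matching step, under stated assumptions: the tokenizer, the function name `find_candidates`, and the three sample clue/answer pairs are all hypothetical, and the output is only a list of candidates for hand-coding, not a final count. It is written for Python 3.

```python
import re

# Runs of letters only; punctuation and digits are dropped, so "Sp." becomes "sp".
TOKEN = re.compile(r"[^\W\d_]+")


def find_candidates(pairs, foreign_words):
    """Return (clue, answer, hits) for every pair in which at least one
    word of the clue or answer appears in the foreign-word dictionary."""
    candidates = []
    for clue, answer in pairs:
        tokens = [t.lower() for t in TOKEN.findall(clue + " " + answer)]
        hits = sorted({t for t in tokens if t in foreign_words})
        if hits:
            candidates.append((clue, answer, hits))
    return candidates


# Illustrative dictionary and clue/answer pairs (assumptions, not real data).
sample_dictionary = {"ete": {"French"}, "amigo": {"Spanish"}, "sp": {"Abbreviation"}}
sample_pairs = [
    ("Summer, in Paris", "ETE"),
    ("Friend: Sp.", "AMIGO"),
    ("Opposite of down", "UP"),
]

candidates = find_candidates(sample_pairs, sample_dictionary)
```

Every candidate still needs a human check, since a dictionary match cannot tell whether a word is being used as a foreign word in a given clue.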

Here are the results, showing small but distinct shifts with each new puzzle editor:

[Figure: Foreign Languages in the New York Times Crossword Puzzle, 1942-2015]

The overall trend is driven largely by the rise and fall of French and Latin, whose appearance in the puzzle was surpassed by Spanish about a decade ago (this graph is smoothed with a Lowess procedure to even out annual fluctuations):

[Figure: Selected Foreign Languages in the New York Times Crossword Puzzle, 1942-2015]
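For readers unfamiliar with Lowess smoothing, here is a simplified pure-Python sketch of the idea: for each year, fit a weighted straight line to the nearby years and take the fitted value. The software and bandwidth used for the graphs are not stated here, so the `frac` value is an assumption, the rates are made-up numbers, and this version omits the robustness iterations of the full procedure.

```python
def lowess(xs, ys, frac=0.5):
    """Locally weighted linear regression with tricube weights.
    Simplified: no robustness iterations."""
    n = len(xs)
    k = max(2, int(round(frac * n)))  # number of points in each local window
    smoothed = []
    for x0 in xs:
        # Bandwidth: distance from x0 to its k-th nearest point (counting itself).
        distances = sorted(abs(x - x0) for x in xs)
        h = distances[k - 1] or 1.0
        # Tricube weights: 1 at x0, falling to 0 at distance h and beyond.
        weights = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        total = sum(weights)
        mean_x = sum(w * x for w, x in zip(weights, xs)) / total
        mean_y = sum(w * y for w, y in zip(weights, ys)) / total
        sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
        sxy = sum(w * (x - mean_x) * (y - mean_y)
                  for w, x, y in zip(weights, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        smoothed.append(mean_y + slope * (x0 - mean_x))
    return smoothed


# Made-up annual rates, for illustration only.
years = list(range(1942, 1952))
rates = [3.1, 2.7, 3.4, 3.0, 3.6, 3.2, 3.9, 3.5, 4.1, 3.8]
smoothed_rates = lowess(years, rates, frac=0.5)
```

A larger `frac` uses more neighboring years in each local fit and produces a flatter curve; a smaller one follows the annual fluctuations more closely.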

Clearly, the dictionary I used does not include all foreign words. Adding more languages dramatically increased false positives, especially with proper names. But from spot-checks, I don’t think the missing words affect the trends much. Let’s look at the five most common foreign-language words clued as Arabic (aba, abou/abu, alif, ameer/amir/emeer/emir, wadi), Chinese (amah, fantan, sampan, taa, tong), Hebrew (aleph, omer, seder, tav, yom), Japanese (hai, inro, obi, sumo, sushi), Russian (artel, duma, kulak, mir, nyet), and Ottoman/Turkish (aga/agha, asper, imaret, irade, pasha). These words don’t appear very often in the puzzle, and most of them follow the same pattern as my foreign-language dictionary words (smoothed with a Lowess procedure):

[Figure: Top 5 Words in Selected Foreign Languages in the New York Times Crossword Puzzle, 1942-2015]

Speaking of Arabic, the article should have mentioned the first appearances of Middle Eastern foods: pita in 1985 (“Mideastern bread”; the 51 previous usages of pita referred to agave fiber), falafel in 1996 (“Falafel holders”), and hummus in 1997 (“Hummus holder”).

To generate a dictionary of place names, I aggregated lists of world regions, country names, demonyms, the 3,000 largest cities in the world, 140 famous ancient cities, and the 180 longest rivers, then removed places in the U.S.
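The aggregation amounts to a set union followed by a set difference. A minimal Python 3 sketch, assuming each source is a plain list of names; the function name `build_gazetteer` and the sample entries are hypothetical stand-ins for the real lists.

```python
def build_gazetteer(source_lists, us_places):
    """Union all place-name lists, then drop any name found in the U.S. list."""
    combined = set()
    for names in source_lists:
        combined.update(name.lower() for name in names)
    return combined - {name.lower() for name in us_places}


# Illustrative stand-in lists (assumptions, not the lists used for the article).
regions = ["Scandinavia"]
countries = ["Peru", "Chad"]
demonyms = ["Peruvian"]
cities = ["Lima", "Oslo", "Chicago"]
rivers = ["Nile", "Mississippi"]
us_places = ["Chicago", "Mississippi"]

place_names = build_gazetteer([regions, countries, demonyms, cities, rivers],
                              us_places)
```

Names shared by U.S. and foreign places (Lima is in both Peru and Ohio) cannot be resolved by set arithmetic alone, so matches on such names still need checking against the clue.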

Here are the annual rates of international place names, again with slight shifts from one puzzle editor to the next:

[Figure: International Place Names in the New York Times Crossword Puzzle, 1942-2015]

The rate of West European place names (as a percent of all international place names) has declined slightly over the years (smoothed with a Lowess procedure), as shown in this dull graph:

[Figure: West European Place Names in the New York Times Crossword Puzzle, 1942-2015]
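The share plotted above is, for each year, the number of West European place names divided by the number of all international place names. A minimal Python 3 sketch, assuming each matched place name has already been coded with a year and a region; the function name `annual_share`, the region labels, and the sample records are hypothetical.

```python
from collections import Counter


def annual_share(records, region):
    """Percent of each year's records that fall in the given region."""
    totals = Counter()
    in_region = Counter()
    for year, label in records:
        totals[year] += 1
        if label == region:
            in_region[year] += 1
    return {year: 100.0 * in_region[year] / totals[year]
            for year in sorted(totals)}


# Made-up coded matches, for illustration only.
sample_records = [
    (1950, "West Europe"), (1950, "West Europe"), (1950, "Asia"), (1950, "Africa"),
    (2010, "West Europe"), (2010, "Asia"), (2010, "Latin America"), (2010, "Asia"),
]

west_europe_share = annual_share(sample_records, "West Europe")
```

Because the denominator is international place names rather than all clue/answer pairs, this share can fall even in years when the overall rate of international place names rises.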

I didn’t try to include counts of foreign people or works of art in the puzzle (although the graphics that were added to the article included some examples of people, such as Friedrich Ebert and Roger Ebert). My hunch is that works by Shakespeare and Verdi would figure prominently.

If someone would like to try, research databases might help: the Virtual International Authority File (VIAF) for people, Worldcat for books, IMDB for movies, MusicBrainz for music, the Getty Cultural Objects Names Authority (CONA) for works of art, Wikipedia for all sorts of things, and so on. Or perhaps somebody has already aggregated these and more? If so, and you are willing to share, please let me know!
