Tuesday, February 28, 2012

Dan K's Blog

What am I looking at?
This map shows the breadth of vocabulary of each of the nation’s 435 voting Representatives. A darker green color indicates that a Rep has a larger vocabulary. There are 6 non-voting Representatives excluded from the map, though they are included in the rankings. The map is colored by each Rep’s “SQPD Ranking” (more details below), which measures a Representative’s usage of 3,393 different words that might be found on the SAT.

What are the numbers that appear when I click a district?

  • Word Diversity Rank reflects how frequently a Rep uses SAT words relative to a basket of common words. The common words include: ‘are’, ‘one’, ‘they’, ‘was’, ‘were’, and ‘with’. The basket was specifically designed to be non-partisan. Oddly enough, some common words (even ‘one’, for example) show what appears to be a slight partisan tilt. [Teaser: subject of a future post.] The highest word diversity rank belongs to none other than Ronald Ernest Paul (aka Ron Paul).
  • SAT Words Used (Count) reflects the number of different SAT words used by a Representative. The highest total (1,178) belongs to Sheila Jackson Lee (D-TX). Not only is she highly educated, but she’s also been in office since 1995.
  • SQPD Index, or the “sesquipedalian index”, is an aggregate measure of a Representative’s vocabulary. It includes both the SAT Words Used (Count), and Word Diversity Rank. Slightly higher weight is given to the count. Both factors are important, though, since certain Reps have a much, much longer record than others.
  • SAT Words Used (Total) is the total number of SAT word utterances. In other words, if a Rep said 10 different SAT words 20 times each, that Rep would have a score of 200. The highest total (5,027) also belongs to Sheila Jackson Lee.
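
For readers who want to reproduce these figures on their own data, here is a minimal Python sketch of the two SAT counts and a diversity ratio. The input format (a word-to-count mapping per Rep), the function name `vocabulary_metrics`, and the exact form of the ratio (SAT utterances divided by common-word utterances) are assumptions for illustration, not the method behind the map. The post does not give the SQPD Index weighting, so the index is left out.

```python
# Basket of common words named in the post.
COMMON_WORDS = {"are", "one", "they", "was", "were", "with"}


def vocabulary_metrics(word_counts, sat_words):
    """Compute per-Rep vocabulary figures from a word -> count mapping.

    word_counts: hypothetical input, e.g. {"culprit": 3, "are": 120}
    sat_words:   set of SAT words to look for
    """
    # Keep only SAT words the Rep actually said at least once.
    sat_counts = {w: n for w, n in word_counts.items() if w in sat_words and n > 0}

    distinct = len(sat_counts)        # SAT Words Used (Count)
    total = sum(sat_counts.values())  # SAT Words Used (Total)

    # Assumed diversity measure: SAT utterances per common-word utterance.
    common_total = sum(word_counts.get(w, 0) for w in COMMON_WORDS)
    ratio = total / common_total if common_total else 0.0

    return {"count": distinct, "total": total, "diversity_ratio": ratio}
```

With 10 different SAT words said 20 times each, `total` comes out to 200, matching the example in the last bullet. Ranking Reps by `diversity_ratio` would then give one possible version of the Word Diversity Rank.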

Who has the 10 highest scores?
In order, the Representatives with the 10 highest SQPD Index values are as follows:

  1. Sheila Jackson Lee (D-TX): 4.79
  2. Christopher H. Smith (R-NJ): 2.98
  3. Dennis Kucinich (D-OH): 1.00
  4. Steve King (R-IA): 0.99
  5. Barney Frank (D-MA): 0.97
  6. Charles Rangel (D-NY): 0.95
  7. John Conyers (D-MI): 0.92
  8. Ronald Ernest Paul (R-TX): 0.87
  9. Steny Hoyer (D-MD): 0.84
  10. Marcy Kaptur (D-OH): 0.84

What data sources were used to create this map?
The Congressional Record is the “official record of the proceedings and debates of the United States Congress.” The Sunlight Foundation, through its Capitol Words API, processes the Congressional Record every day. They currently maintain data from 1996 onward. You can query word or phrase frequency by political party, over time, by representative, and so on.

The list of SAT words was obtained from two sites. There were 5,372 words on the original list, although some of them hardly seem like SAT words (‘further’, ‘believe’, ‘off’), while others are mainstays of Congressional dialogue (‘foreign’, ‘unanimous’, ‘bureaucracy’). So, I excluded all words said more than 500 times from the final list, leaving 3,393 SAT words. The most common of these are ‘barring’, ‘culprit’, ‘passive’, and ‘revert’.
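
The 500-mention cutoff is easy to sketch in Python. The function name `filter_sat_words` and the input shape (a mapping from each word to its total mentions in the Congressional Record) are hypothetical, and the counts in the usage example are made-up toy numbers rather than real Capitol Words data.

```python
def filter_sat_words(total_mentions, max_mentions=500):
    """Drop words said more than `max_mentions` times.

    total_mentions: hypothetical mapping of word -> total mentions on record.
    Returns the surviving words, most-mentioned first (ties in alphabetical order).
    """
    kept = {w: n for w, n in total_mentions.items() if n <= max_mentions}
    return sorted(kept, key=lambda w: (-kept[w], w))


# Toy counts for illustration only.
toy_mentions = {"further": 90000, "foreign": 40000, "barring": 480, "culprit": 450}
survivors = filter_sat_words(toy_mentions)
```

Here `survivors` holds only ‘barring’ and ‘culprit’, since the other two toy words exceed the cutoff. A word said exactly 500 times is kept, because the rule excludes only words said more than 500 times.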

Finally, the map itself was created using Google Fusion Tables. Here’s a link to the table. Enjoy!

100 Words Said Only Once In Congress

I’m working on an analysis of the vocabulary of members of Congress. This effort has already spun off a number of interesting byproducts, which I’ll be sharing over the next week or so. The first is what one might call a “Congresswhack”, if one were familiar with Googlewhacking (finding a two-word Google query that produces only one result).

The analysis uses two primary data sources: a list of 5,372 “SAT Words” obtained from two sites, and Congressional word-mention data from Capitol Words. Capitol Words scrapes the Congressional Record (the “official record of the proceedings and debates of the United States Congress”) on a daily basis, and calculates word or phrase usage over time, by legislator, by party, and so on. They also offer the Capitol Words API, which is what I used for my analysis.

So, without further ado, here are your words of the day. Each of these words has been said on the record once since 1996:

50 words said only once since 1996, by a Republican:
acerbity, Achillean, anagram, antonym, astringent, betrothal, boorishness, chromatic, consignor, contumacious, cynosure, deviltry, diaphanous, diffusible, discomfit, doublet, exegesis, fishmonger, habitant, heptagon, homologous, homonym, ichthyology, impetuosity, kinsfolk, knead, languor, mealymouthed, mettlesome, ministration, necrology, negligee, occlude, panegyric, pommel, precipitant, preponderate, protuberant, ramify, resistive, ruffian, sedulous, sinuosity, somnolent, superfluity, testator, theism, transfigure, upbraid, vegetate

50 words said only once since 1996, by a Democrat:
dowager, churlish, lithe, peevishness, soporific, ambidextrous, apposition, arboriculture, Arthurian, bedeck, brooch, cajolery, Calvinism, cauterize, coquette, cosmopolitanism, darkling, dentifrice, desiccant, encamp, fondle, foppish, frizz, furrier, garrote, imbibe, impecunious, lexicography, luxuriant, lyre, mawkish, mendicant, messieurs, noisome, obtrude, pertinacity, photoelectric, piteous, poesy, precession, protuberance, pungency, recrudescence, repine, retch, sapient, secant, swarthy, syllabication, volant
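
For anyone curious how such a list could be pulled from per-party mention counts, here is a small Python sketch. The function name `congresswhacks` and the input shape (word mapped to a per-party count dictionary) are assumptions for illustration; this is not the Capitol Words API response format, and the counts for ‘foreign’ below are invented.

```python
def congresswhacks(mentions_by_party):
    """Group words said exactly once overall by the party that said them.

    mentions_by_party: hypothetical mapping, e.g. {"knead": {"R": 1, "D": 0}}
    """
    result = {}
    for word, by_party in mentions_by_party.items():
        # Exactly one utterance across all parties combined.
        if sum(by_party.values()) == 1:
            party = next(p for p, n in by_party.items() if n == 1)
            result.setdefault(party, []).append(word)
    # Sort each party's words alphabetically for stable output.
    return {party: sorted(words) for party, words in result.items()}


# Toy input: three words from the lists above plus one common word.
toy = {
    "cynosure": {"R": 1, "D": 0},
    "knead": {"R": 1},
    "mawkish": {"D": 1},
    "foreign": {"R": 900, "D": 800},  # invented counts
}
once_only = congresswhacks(toy)
```

A word said once by each party is skipped, since its overall total is two.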
