The History of the Indo-European Languages Lives in your Mouth
Yeah, English has a bunch of Latin and Greek words. But that's not the cool part.
Not Much Latin
People like to share that much of our English vocabulary derives from Greek and Latin. It’s why you can eat and consume, drink and imbibe, understand and comprehend, want and desire, grow and evolve, choose and decide. You’ll read factoids claiming that anywhere from 60% to 75% of English words come from these languages. I’ve even seen some present English as a mixture of a Germanic language and a Romance one. These figures oversell the foreignness of English. To illustrate this, consider this paragraph from the Wikipedia page on organic chemistry:
Aromatic hydrocarbons contain conjugated double bonds. This means that every carbon atom in the ring is sp2 hybridized, allowing for added stability. The most important example is benzene, the structure of which was formulated by Kekulé who first proposed the delocalization or resonance principle for explaining its structure. For "conventional" cyclic compounds, aromaticity is conferred by the presence of 4n + 2 delocalized pi electrons, where n is an integer. Particular instability (antiaromaticity) is conferred by the presence of 4n conjugated pi electrons.
I didn’t check each one, but plenty of words in here give me Latin or Greek vibes: aromatic, hydrocarbon, conjugated, hybridized, benzene, and delocalized. In a dictionary, each word takes one slot. That’s how we get the inflated counts of foreign words in English. Let me ask you though, how often do you say “antiaromaticity”? Even the more common loan words, ones like desire and consume, don’t often reach the front of people’s mouths. Many of us spend so much time with the written word that we begin to see it as the true version of our language. Here’s a challenge: eavesdrop on a casual conversation and listen to the word choices. Speech does not sound like writing. For one, we speak in short sentences. We also prefer simple words and don’t mind repeating them. In an article like this, I might find too many instances of the word “see” and change some of them to “view,” “glimpse,” or “identify.” In regular conversation though, no one seems to see an issue with repeating “see” and “saw” over and over. Sorry, I meant to say perceive an issue. I gotta look smart here.
While the total count points toward a majority of loan words, our spoken language remains heavily Germanic. Think of our most used words. The articles (a/an/the) are Germanic, as are the pronouns. Our basic numbers, except zero, are Germanic. Our most basic verbs, including “have,” “get,” and “be,” are Germanic. Names for basic things from “home” to “water” to “cat” to “thing” itself are Germanic. I haven’t found a definitive source on this, but most estimates show that at least 80% of our 100 most common words are Germanic. The Horse, The Wheel, and Language [1] noted that upwards of 90% of our core vocabulary (i.e., words for concepts that almost every language has words for) is Germanic. One can find exceptions where the Latin version of a word gained prominence over the native Germanic one. The Latin “army” replaced the native “here,” as one example. In some instances, the Latin and Germanic versions gained different meanings, like the Latin “animal” and Germanic “deer” or the Latin “person” and the Germanic “man.” Still, when you talk, you're mostly using Germanic words.
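If you want to see the gap between dictionary counts and everyday speech for yourself, here is a rough Python sketch. The sample sentence and the tiny hand-labeled word list are made up for illustration, so treat it as a toy and not a measurement. It counts loan words two ways: once giving each distinct word a single slot, like a dictionary, and once counting every use, like a conversation.

```python
from collections import Counter

# Hypothetical hand-labeled origins for a toy sentence (not a real dataset).
# "desire" and "consume" are Latin-derived; every other word here is Germanic.
LOAN_WORDS = {"desire", "consume"}

sentence = "i see the water and i see the home and i desire to consume the water"
tokens = sentence.split()
counts = Counter(tokens)

# Dictionary-style share: each distinct word gets one slot.
type_share = sum(1 for word in counts if word in LOAN_WORDS) / len(counts)

# Speech-style share: every use of a word counts.
token_share = sum(n for word, n in counts.items() if word in LOAN_WORDS) / len(tokens)

print(f"share of distinct words that are loans: {type_share:.0%}")
print(f"share of spoken words that are loans:   {token_share:.0%}")
```

Because the Germanic words repeat and the loan words don't, the loans take up a bigger share of the word list than of the sentence itself. Real text would need a real etymological dictionary, which this sketch does not include.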
Now that we've established the Germanic-ness of English, we can appreciate the presence of those less common Latin words. You'll often notice that the basic version of the word is Germanic, while the fancier ones come from Latin or Greek. You have a heart, but if something goes wrong, you'll see a cardiologist. Two firms can gain a duopoly. When we put our feet on the street, we become pedestrians.
Unfortunately, the discussion often stops here. Loan words, in and of themselves, aren't that neat. Japanese features a ton of Chinese words. The Uralic languages (Finnish, Estonian, Hungarian) took something as basic as pronouns from their Indo-European neighbors. Yet, you won't learn much about the history of Japanese by analyzing its Chinese words. In English, however, a peculiar set of circumstances allows the Greek and Latin words to act as fossils of the older versions of our language.
One Large Steppe
A wide variety of languages, including English, Albanian, Persian, Armenian, Russian, Hindi, and dozens of others, derive from a single language spoken on the Pontic-Caspian steppe (in today's Ukraine and southern Russia) about 5000 years ago. Linguists call this language Proto-Indo-European (PIE). Whenever you see “Proto” in front of a language name, it means that the language remains unrecorded. We shouldn’t think of these languages as something like Mycenaean Greek or classical Latin. Theoretically, one could study those languages today, head into a time machine, and speak to the locals. That’s not really the case for PIE. Instead, linguists reconstruct PIE by comparing the words of its descendants and piecing together the common ancestor. This “comparative method” deserves its own article, so I won’t delve into the details here. Rather, it’s worth emphasizing that a PIE dictionary probably doesn’t represent a single language spoken at a single point in time. Imagine an English dictionary that contains some modern words, some Old and Middle English words, New Zealand-specific slang, technical jargon only used by electrical engineers, Twitch-streamer meme-words, and a few Dutch words that kinda look like English ones. This dictionary would provide a wealth of meaningful information about English, though it wouldn’t represent how any one person speaks.
The ancient Yamnaya people, known for their large, cylindrical kurgan graves, spread their culture across Europe, the Middle East, and India. This did not involve an epic conquest. Rather, The Horse, The Wheel, and Language compared the spread of PIE to franchising. The Kurgan builders created numerous outposts among the old societies of these regions and slowly dominated through economic and cultural hegemony. “No conquest” doesn’t mean “no violence,” to be clear. Cattle raiding was a pastime, and possibly a male initiation ritual, among the early Indo-Europeans.
As a result, the spread of the Indo-European languages occurred in multiple waves. The Germanic languages spread through the Usatovo culture, while the Italic and Celtic ones spread through the Corded Ware cultures a few hundred years earlier. Greek found its current home after a winding trip through Anatolia. In the most literal sense, then, Greek and Latin aren’t any older than English or Swedish. Each language amounts to a branch of the older mother tongue. But that’s not the whole truth. In one important way, the Greek and Latin words are older than the Germanic ones. A few things changed among those Germanic tribesmen, and those changes continue to echo through your everyday speech.
A Grimm Perspective
Before proceeding, it might help to review some phonology. I recommend reading the detailed explanation here, but I’ll highlight some key concepts below.
Voiced consonants involve the vibration of the voice box. Put your hand on your throat while saying “guh” and “kuh.” You’ll notice some extra action in the first syllable, since “g” is voiced and “k” is unvoiced.
Speakers can articulate a consonant in a variety of ways. The three most common manners of articulation are fricatives, stops, and nasals. Fricatives constrict the airway to create a hissy sound, as in an “s,” “v,” or “sh.” Stops temporarily block airflow before releasing it, as you noticed with the “g” and “k” above. Nasals force the air to flow through the nose. Hold your nose while saying “fffffff” and “mmmmmm,” and you’ll feel your nose only widens for the second one. That’s the nasal consonant.
I didn’t explain this in the previous article, but some consonants are aspirated. Hold your hand in front of your mouth and say the words “stop” and “pay.” We spell both words with a “p,” but you only feel an outward breath for the “p” in “pay.” If you think that’s caused by the vowel, say “a.” You’ll notice that very little air made it to your hand. That breathy version of the “p” in “pay” is called an aspirated consonant.
Two changes separate the Germanic languages from their Indo-European cousins. One change is called Verner’s law, which added voice to certain fricative consonants. This changed some unvoiced “th” sounds (as in “thing”) into voiced ones (as in “that”), and some “s” sounds to “z” sounds. Other changes involve consonants we no longer use in English, like the guttural “g” sound you hear in Dutch or those Spanish consonants that sound like a cross between a “b” and a “v.” The other major sound change is known as Grimm’s Law. You’ve heard about this one if you’ve read any popular linguistics or ended up in a conversation with me that lasted more than ten minutes. Grimm’s Law consists of three changes:
(1) Voiced, aspirated stops lost their aspiration. Notice that you feel the breath in “pay” but not in “bay.”
(2) Voiced stops became unvoiced. Think “g” to “k” or “d” to “t.”
(3) Unvoiced stops became fricatives. This shift might seem less intuitive, but I’ll explain one prominent example in the subsequent paragraphs.
You might not see many similarities between “Heart” and “Card,” but Grimm did. Let's ignore vowels (those change all the time) and our annoying preference for starting Latin and Greek words with a “c.” We’ll start with the Germanic “H_rt” and the Greek “K_rd”. Let’s see how we apply Grimm’s law and turn the Greek version into the Germanic one.
Rule (3) says that voiceless stops become fricatives. Notice that the “k” sound occurs at the back of the palate. We don’t have any fricatives that occur at that exact spot, but there’s one that’s pretty close: the “h.” That gets us to “H_rd.”
The “r” is neither a stop nor fricative, so it’s a keeper.
Rule (2) turns the “d” into a “t,” and we’ve reached “H_rt.”
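For the programmers in the audience, the walk from “K_rd” to “H_rt” can be sketched in a few lines of Python. This is a spelling-based toy, not real phonology: the names GRIMM_SHIFT, split_sounds, and apply_grimm are my own inventions, the underscore stands in for the vowel we agreed to ignore, and the table covers only the three changes of Grimm’s law listed earlier (no Verner’s law, no exceptions).

```python
# Simplified, spelling-based table of Grimm's law (hypothetical names).
GRIMM_SHIFT = {
    # (1) voiced, aspirated stops lose their aspiration
    "bh": "b", "dh": "d", "gh": "g",
    # (2) voiced stops become unvoiced
    "b": "p", "d": "t", "g": "k",
    # (3) unvoiced stops become fricatives
    "p": "f", "t": "th", "k": "h",
}
ASPIRATED = {"bh", "dh", "gh"}


def split_sounds(skeleton):
    """Split a consonant skeleton into sounds, keeping bh/dh/gh together."""
    sounds, i = [], 0
    while i < len(skeleton):
        if skeleton[i:i + 2] in ASPIRATED:
            sounds.append(skeleton[i:i + 2])
            i += 2
        else:
            sounds.append(skeleton[i])
            i += 1
    return sounds


def apply_grimm(skeleton):
    """Shift every sound exactly once; anything not in the table is kept."""
    return "".join(GRIMM_SHIFT.get(s, s) for s in split_sounds(skeleton))


print(apply_grimm("k_rd"))  # the Greek skeleton from the steps above
```

Each sound gets looked up exactly once. That matters, because running the three rules one after another would push a “bh” all the way through “b” and “p” to “f,” which is not what the law describes.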
Let’s return to the “p” sound in “stop.” Today, you won’t hear that at the start of an English word, since those unaspirated stops joined Team Fricative. When I first learned about this, “p” to “f” seemed like a bit of a leap. It makes a bit more sense, though, if you pay attention to the place of articulation. We articulate the “p” at the lips. Notice the dramatic lip movement associated with this sound, as well as the “b” and “m” sounds. The “f,” meanwhile, swaps the top lip for the top row of teeth. We can see this change in many Germanic and Greek/Latin pairs. You can see pescatarians eating fish, fathers going on paternity leave, pyromaniacs playing with fire, and feet receiving pedicures. This change also connects pelt with “fell” (an old word for an animal hide) and poultry with foal. Greek gave us the prefix “poly-,” but we often stick with the Germanic “full.” Wait a minute… “prefix” also starts with a “p.” That’s because Latin maintained the original consonant in “pre-,” while the Germanic tongues changed it to “fore-.”
We can find Grimm’s law in tons of English words. It’s not just the one labio-dental fricative that we saw in the previous paragraph. Oh, sorry, do you not know what “labio-dental” means? Grimm can help. “B” is a voiced stop, meaning it should have changed to an unvoiced “p” in English. It did, and we say “lip” today. “Dental” contains two precarious sounds: the “d” and the “t.” We can devoice the former, and change the latter to a fricative. That gets us pretty close to the English “tooth.” The “n” didn’t make the cut, but you can see it in Dutch “tand” and German “Zahn.” Make the same changes, and you can see the connection between “duo” and “two,” as well as the one between “trio” and “three.” Words like “capture” and “captive” might feel native, but even they represent old-school versions of our native words. Change that “c” to an “h” and the “p” to an “f,” and the root “cap-” becomes “haf.” Voice that “f” to find the English “have.” A similar change turns a magnificent Latin cornucopia into a mundane English horn.
Grimm’s law can also explain some missing cognates. You see the Greek root for movement in words like “kinetic” and “kinesiology.” Grimm’s law tells us that the English version ought to start with an “h.” Can you think of it? Probably not, unless you use the archaic word “hight.” Yes, Grammarly, I spelled that right! The Dutch and the Germans have kept this root with the verbs “heten” and “heissen.”
As a side note, we can even see some of the changes that occurred in other languages. Many of English’s words came not from the original Latin, but from its strangest descendant: French. Notice that English contains two versions of “100”: the native “hundred” and the French “cent” that we see in “percent.” In old-school Latin, you’d hear that “c” pronounced as a “k.” So far, so good. A Latin “k” alongside a Germanic “h” aligns with Grimm’s law. However, we don’t pronounce it as a “k” in words like “century.” What happened?
Notice that some Latin words maintain the original “k” sound, like “captive.” How does “captive” differ from “century?” Recall that we make the “k” sound in the back of our mouths. That’s not a bad location for the first vowels spoken in “captive” or “cornucopia.” It’s less ideal for the ones in “civil” and “cent.” We articulate those vowels in the top and front of our mouths. In other words, these vowels sit about as far as possible from the “k” sound. French speakers saved some tongue movement by converting the “k” to “ts.” Later on, the “ts” simplified to the “s” sound that we hear today.
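The two-step softening described above fits in a short Python sketch. Big caveats: the rough phonetic spellings (“kent” for “cent,” “kivil” for “civil,” and so on) and the function names are mine, and I’m treating “e” and “i” as the only front vowels, which is a simplification.

```python
# Toy model of the "k" softening before front vowels (hypothetical names).
FRONT_VOWELS = {"e", "i"}  # vowels made toward the front of the mouth


def soften_k(sounds):
    """Stage 1: turn "k" into "ts" when a front vowel follows."""
    out = []
    for i, sound in enumerate(sounds):
        following = sounds[i + 1] if i + 1 < len(sounds) else None
        if sound == "k" and following in FRONT_VOWELS:
            out.append("ts")
        else:
            out.append(sound)
    return "".join(out)


def simplify_ts(sounds):
    """Stage 2: the later simplification of "ts" to "s"."""
    return sounds.replace("ts", "s")


# Rough phonetic spellings of the Latin-derived words from the text.
for word in ["kent", "kivil", "kaptiv", "kornukopia"]:
    print(word, "->", simplify_ts(soften_k(word)))
```

The “k” in “kaptiv” and “kornukopia” never sits in front of an “e” or “i,” so those words pass through untouched, while “kent” and “kivil” pick up the “s.”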
I wish the popular understanding of English included more of this information. No, modern English is not a mixture of Old English and French. Listen to a conversation and you’ll hear “foot” a lot more than “ped-.” No, the mere presence of Latin and Greek loan words doesn’t make English unique. Rather, English boasts a combination of three things:
Containing tons of loan words
Taking those loan words from its own language family
Having undergone major consonantal changes that did not occur among the loan words
As a result, a single sentence can act as a rock layer, showing the different parts of our history in a single view. That, I think, is more interesting than counting the number of words that come from another language.
[1] This book is the source of basically all the history in this article.
I like the analogy of the rock layers in the sentences!
This was fascinating, Klaus! Grimm’s law may be the explanation for why whenever I fly KLM and they make the announcements in Dutch, they sound to me like English being slightly garbled by a faulty loudspeaker. What I’m hearing as garbling is actually the consonants that haven’t shifted all the way to English.
I also like your point that spoken English is almost entirely Germanic, which makes me think of John McWhorter’s observation that while English has formal and informal written registers, formal spoken English is very rare. I wonder whether formal spoken English is more Latinate, like the written version?