The Cognates
Can you count to ten? If not, don’t worry, I can help. The table below shows the words for the first ten numbers in seven languages. I also threw in the word for “hundred” as a bonus.
Before we get to the chart, I want to set some expectations:
I will write each word as phonetically as possible. “Funeticly,” one might say. To do so, I will change the spellings of several words, even the English ones.
Several languages use non-standard Roman characters (like the tailed “s” in Romanian). These were all converted to their English equivalents.
Native speakers generally write the last three languages using a non-Roman alphabet. I transcribed these in our alphabet based on their pronunciation.
I didn’t think too hard about vowels. I will keep the recommended spelling if it is close enough.
For German, I had no better ideas for the /x/ sound, the fricative version of “k.” Since there is no English equivalent, I will use the International Phonetic Alphabet.
I already wrote about the “R” sounds. Even though the English, German, and “standard” (i.e. flapped or rolled) “R” don’t sound that similar, I will transcribe them all as “r” for our purposes.
For Czech… I will try my best.
Does this list seem diverse? It isn’t. While the languages may appear different, they amount to seven descendants of Proto-Indo-European, an unattested language spoken by Eurasian steppe herders around six thousand years ago. Let me show you how all these words relate.
I’ll begin with one of the stranger members of this table: the English word for “one.” I’ve opted to transcribe it as “won” since that’s how nearly the entire English-speaking world pronounces it. The spelling matched its pronunciation before a shift around 500-700 years ago. English speakers said something like “own,” and we still hear this old pronunciation in words like “only” and “alone.” What accounted for the change? No one knows, but “oh” and “woah” involve rounded lips and relatively free airflow. Recall that “Wednesday” refers to the Norse god “Woden,” a dialectal variation of “Odin.” That’s similar to the change from “own” to “won.”
Outside of English, we see two distinct forms. There’s the “n” version in Europe and the “k” iteration in Asia. Did the original word have an “n” or a “k?” Sources differ, indicating the original word was “h1oynos” or “h1oykos.” The “h1” represents a sound articulated deep in the throat that doesn’t exist in today’s Indo-European languages.
The words for “two” provide much less mystery. This was Proto-Indo-European (PIE) “dwoo.” As I’ve written about numerous times on this blog, the Germanic languages underwent a change known as “Grimm’s Law” during the first millennium B.C. I’ll avoid listing all the shifts now, but the important one here is “d” to “t.” Whenever we find a “d” in Proto-Indo-European, we expect a “t” in English. That’s why the Greek “card-” and the Latin “dent-” correspond to the English “heart” and “tooth.” What’s with the “ts” sound (the actual spelling is “zwei”) in German? Hochdeutsch, the German dialect you’ll see in most formal settings, shifted again. Compare English “time” to German “Zeit” (again, “z” is pronounced “ts”) or English “out” to German “aus.” A more conservative version of the word survives in the English word “duo,” which hails from Latin.
We see a similar story for “three,” which derives from Proto-Indo-European “trei” or “treyes.” Grimm’s Law tells us that a PIE “t” should correspond to an English “th,” and it does! Meanwhile, an “s” sneaks into the Persian version. Given the discussion around Hochdeutsch in the previous paragraph, this shouldn’t surprise us. Both sounds are “unvoiced” (i.e., they don’t involve the vibration of the voice box), and both are pronounced at the alveolar ridge. In other words, we can see convergent evolution in the German word for “two” and the Persian one for “three.” Where, then, does the “d” in the German word for “three” come from? Grimm’s Law may have brought the “th” sound[1] to the Germanic languages, but these sounds tend to arrive with one foot out the door. German, Dutch, and the Norse languages have done away with it, and we can hear death knocking on its door in English. I work with a few British people who replace it with an “f” or a “t.” Americans can hear the “th→d” shift in Black English Vernacular. Think of phrases like “dat boy.”
Let’s skip 4 and 5 and head to 6 and 7. These were PIE “sweks” and “septm.” A few languages saw the “s” sound transform into a “sh” one[2]. Once again, German contains a similar shift: compare the English “sleep” to the German “schlafen” (the “sch” is pronounced like an English “sh”). Linguists refer to this pattern as “palatalization.” All else being equal, we prefer to pronounce consonants in the middle of our mouths. That’s why “g” (back of the mouth) in Proto-Germanic “gelwaz” became a “y” (pronounced in the middle) in Modern English “yellow.” Besides this, a few languages show a movement from the back-of-mouth “k” to the front-of-mouth “s,” “t,” “ch,” or “sh.” This change occurs throughout the Indo-European languages, as the front-of-mouth consonants ease into their subsequent vowels.
None of this, however, explains the leading “h” sounds in Persian or in the Ancient Greek root for six (not shown above) that appears in words like “hexagon.” The “h” sound (and similar ones seen in other languages) is one of the easiest to pronounce. Thus, we tend to convert fricative consonants (those with disruptive airflow) into “h” sounds. Luckily, linguists gave this a highly intuitive name: “h-shifting.” Kidding, they’re social scientists. It’s called “debuccalization.”
We see broad agreement on “eight.” The confusing “gh” in English represents a fossil of the old /x/ sound, the same one heard in German today. “Eight” was “h3ektow” in PIE, with the “k” following predictable changes. This sound degraded into a fricative in English and German, consistent with Grimm’s Law. “Ten” (PIE: “dekm”) and “hundred” (PIE: “kmtom”) follow similar patterns.
The listed languages mostly agree on 9 (PIE “h1newn.”) The only holdout is the mystery “d” in Czech, which also boasts a mystery “d” in its version of “one.” However, this turns out to be less mysterious than it first looks. Both sounds are “voiced” (i.e., requiring a vibration of the voice box), and both are pronounced at the alveolar ridge. The main difference is that “n” is a “nasal” consonant that forces air through the nose. Lightly touch your nose while saying “do” and “new,” and you can feel the difference. While we don’t see a shift like this in the English number words, we see it in the word “number” itself! “Number” originates from the Latin “numerus” and we maintain the b-free version in offshoots like “innumerable” and “numerical.” For the base noun, however, we added a “b.” Why “b?” Well, like “d” and “n,” we pronounce “b” and “m” at the same part of the mouth (in this case, the lips), and the added “b” helps us glide into the following sounds. Czech appears to have undergone a similar change.
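If it helps to see the “one feature apart” idea laid out, here is a minimal Python sketch. The feature table and the helper name differing_features are my own illustration, not anything from a phonology library, and boiling each consonant down to three features is a deliberate simplification.

```python
# Simplified articulatory features for four consonants.
# Assumption for illustration only: real phonology uses many more features.
FEATURES = {
    "n": {"voiced": True, "place": "alveolar ridge", "nasal": True},
    "d": {"voiced": True, "place": "alveolar ridge", "nasal": False},
    "m": {"voiced": True, "place": "lips", "nasal": True},
    "b": {"voiced": True, "place": "lips", "nasal": False},
}


def differing_features(a, b):
    """Return the sorted names of the features on which two consonants differ."""
    return sorted(k for k in FEATURES[a] if FEATURES[a][k] != FEATURES[b][k])


print(differing_features("n", "d"))  # ['nasal']
print(differing_features("m", "b"))  # ['nasal']
print(differing_features("n", "b"))  # ['nasal', 'place']
```

Under this toy model, “n” and “d” (like “m” and “b”) sit a single feature apart, which is roughly why one can slide into the other.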
Finally, we need to discuss 4 and 5. It’s hard to see much similarity between the terms in that row. Luckily, English maintains both the Germanic and Latinate versions of several words. For “four,” the Latinate version is “quarter,” a word with a clearer relationship to PIE “kwetwor.” In the Iranian and Indian words, it’s the same back-to-front (k→s) change we’ve seen elsewhere. Where, though, does the “p” come from? I discussed this a bit in my article on ancient dragon-slaying myths. The “kw” sound involves a primary point of articulation at the back of the mouth and a secondary point at the lips. That’s a lot of mouth work! Some languages seem to have shifted this sound to “p,” because doing so keeps the entire articulation at the lips. The Germanic people shifted the “p” to an “f” (again, Grimm’s Law), while the Greeks seem to have sent it toward the middle of their mouth as a “t” (again, palatalization).
Five comes from PIE “penkwe,” showing all the same sound changes we’ve seen in the other numerals. Romanian is the odd one out here, coming from the Latin “quinque.” What could explain this? Remember the kw→p shift we discussed in the context of all the other languages? The Romans took the reverse route, moving from “p” to “kw.”
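For anyone who wants to check the Grimm’s Law correspondences mechanically, here is a small Python sketch. It only looks at the first consonant of each pair, uses the simplified PIE spellings from this post, and covers just the four shifts I’ve mentioned (d→t, t→th, p→f, k→h). The function name follows_grimm is made up for this example.

```python
# The four Grimm's Law shifts mentioned in this post (a partial list).
GRIMM_SHIFTS = {"d": "t", "t": "th", "p": "f", "k": "h"}

# (simplified PIE form, modern English word), as spelled in this post.
PAIRS = [
    ("dwoo", "two"),
    ("trei", "three"),
    ("penkwe", "five"),
    ("dekm", "ten"),
    ("kmtom", "hundred"),
]


def follows_grimm(pie, english):
    """True if the first PIE consonant maps to the start of the English word."""
    expected = GRIMM_SHIFTS.get(pie[0])
    return expected is not None and english.startswith(expected)


for pie, english in PAIRS:
    print(pie, "->", english, follows_grimm(pie, english))
```

Every pair in the list passes. A word like “septm” simply falls outside the check, since “s” isn’t one of the four shifts included here.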
Why did I want to show that the basic numerals are cognates in six branches of the Indo-European Language Family? One, because it’s cool. Two, because it adds to the mystery of the words that aren’t cognates.
The Missing Words
The website Eupedia compiled a list of what I’m dubbing the “Mystery Words.” These are words with no known cognates in other Indo-European Languages. None of the numerals from the previous section qualify. Instead, we see words related to animals, plants, body parts, and personal attributes. See a selection below.
Bull
Horse
Lamb
Berry
Oak
Back
Bone
Neck
Leg
Gray
Shy
House
Silver
Shallow
Some sources have suggested that Mystery Words account for up to a third of our vocabulary. Where, then, could these words come from? Before answering that, it might help to provide some background on the Indo-European Language Family and its branches.
Unsupervised Learning provides a great background on the Indo-Europeans. The story begins with hunter-gatherers from West and Central Asia colliding on the Eurasian Steppes 6-7 thousand years ago. First, a few pioneers ventured into modern-day Turkey before 4,000 B.C. Their descendants formed the Anatolian languages such as Hittite and Luwian. While commonly referred to as the “Anatolian Branch” of the Indo-European language family (including by this blog!), genetic data indicates that they represent a sibling, rather than a child, of Proto-Indo-European. Around half a millennium later, a group of steppe herders, known as the Yamnaya, settled on the western part of the Eurasian steppes. They moved to Europe about 5,000 years ago and mixed with Neolithic farmers to create the Corded Ware Culture. This Corded Ware Culture spread through Europe, forming the Celtic, Italic, Balto-Slavic, and Germanic languages. Hundreds of years later, Corded Ware descendants migrated to Western Russia to form the Fatyanovo–Balanovo culture. Their descendants would ultimately move to Iran and India to create the Iranian and Indo-Aryan branches. A separate Yamnaya migration, unrelated to the Corded Ware Culture, accounts for the Greek and Armenian branches.
In short, the Germanic branch stems from Corded Ware ancestors who settled in Northern Europe. There is our first hint of where the mystery words may come from.
The Basque, Levantine Merchants, and the I1 Haplogroup
The most famous work surrounding the Germanic Mystery Words comes from Theo Venneman’s Europa Vasconica, Europa Semitica. He presents two theories in the book, the first of which concerns the Basque language. Scholars agree that it remains the largest European tongue that does not belong to the Indo-European or Uralic language families. Venneman takes a further step in suggesting that it predates both of them. He considers Basque the lone survivor of the Vasconic Language Family, a group of languages spoken by European hunter-gatherers. Second, Venneman proposes that Semitic seafarers influenced Germanic and Celtic speakers across Western Europe.
The Basque theory depends on “hydronymy,” or river names. The classic work on this subject stems from Hans Krahe. He noticed that the names of European rivers follow an internally consistent pattern which is inconsistent with our understanding of Proto-Indo-European. Krahe attributed these names to “Old European,” which he thought of as the Vulgar Latin of Proto-Indo-European. These languages existed after the onset of Proto-Indo-European but before the languages separated into recognizable branches. Venneman takes this a step further and attributes them to something older than Old European. Due to the presence of a front-of-mouth vowel sound that appears in the hydronyms but not in PIE, Venneman attributes these names to an ancient language related to Basque.
Next, Venneman highlights the Semitic connection. Semitic merchants brought their wares to Northern Europe during the Bronze Age. Alongside these goods, the traders brought their culture into Europe. We can see this, Venneman argues, through the cats of Freya, the god Baldur (from Semitic Baʿal), and the megalithic monuments in Western Europe. Most importantly, these Near East merchants brought their language to the Insular Celtic Languages (i.e., the ones off mainland Europe) and the Germanic ones. Although the mystery word “plow” may lack Indo-European cognates, it sounds like the Hebrew “plh.” We will get to other potential Semitic loan words in a bit.
Venneman also believes that the Semitic languages influenced Germanic grammar. We create most English past tense verbs by slapping an “ed” at the end[3]. A few verbs, however, don’t fit this mold: eat-ate, ride-rode, and run-ran. This quirk isn’t unique to English. We find these “strong” verbs across the Germanic languages, such as trinken-getrunken and denken-gedacht in German. Where did this system come from? Venneman has an idea. In the Semitic languages, all conjugations occur by changing the vowels within a stem. In Akkadian, for instance, the root for “divide” was “prs.” Speakers changed this to “i-prus” to say “he divided” and “i-parras” to say “he is dividing.” Hence, Venneman posits that Semitic merchants learned Proto-Germanic as their second language and brought some of their first language habits into their speech. Children picked up those Semitic habits, thereby sneaking the foreign grammar into the minds of native speakers. A similar process explains why we start our questions with “do” in English.
In addition, Venneman suggests that Semitic influence may have caused Grimm’s Law. Contemporary Semitic languages featured more fricative consonants than the contemporary Germanic ones. As with the strong verbs, second-language learners brought features of their native tongue into the Germanic languages. This explains the t→th, p→f, and k→h changes discussed in the opening section.
The final potential influencer of the Germanic languages is the people of the I1 haplogroup. A haplogroup is a “genetic population group of people who share a common ancestor on either their paternal or maternal line.” I1 refers to a paternal lineage that appears across Germanic genomes. Since these people left no cultural artifacts, we can’t conclude much about them. At best, we can assume that the haplogroup represents the hunter-gatherer or agrarian population that preceded the Indo-Europeans in Northern Europe.
We’ve got a lot going on here: river names, the Basque, the Semitic merchants, and hunter-gatherer haplogroups. Before endorsing any of the preceding explanations of the Germanic mystery words, it might help to ask one more question. Sure, a bunch of our words seem to lack Indo-European cognates. But, then again, should we expect otherwise?
The Mundane Perspective
I began this discussion with the claim that up to 30% of Germanic words lack an Indo-European cognate. If this number is accurate (and it might not be, more on that later), we could interpret this in one of two ways. One, these words could come from somewhere else, like Semitic merchants or early hunter-gatherers. Two, we could lack the historical record to determine the matching words. In a previous article, I criticized the phrase “absence of evidence isn’t evidence of absence.” The platitude needs to begin with the all-important “sometimes.” If I told you there’s an elephant in your room, the absence of evidence for an elephant would indicate the lack of a gigantic mammal. On the other hand, if I told you that there’s an elephant within a 100-mile radius of you, the absence of evidence wouldn’t tell you much. Which type of “absence of evidence” do the Germanic mystery words fit into? Is this an “elephant in your room” situation or an “elephant within a 100-mile radius” one?
While the Indo-European Language Family dates back around 6,000 years, little of this history exists in our written records. The oldest Indo-European-adjacent writing comes from nearly 4,000-year-old Hittite tablets in Anatolia. The next oldest Indo-European documents appear further east. These include the roughly 3,000 to 3,500-year-old Vedic Sanskrit from Northern India and the slightly younger Avestan from Iran. Greek provides the oldest European writing, dating from roughly the same era. A large gap occurs after that, with the first Italic and Celtic writing emerging around 600 B.C. Though archeologists have found some Germanic runes dating to around two thousand years ago, Germanic speakers produced little long-form writing until hundreds of years after that. Other European writing appears even later. Slavic produced its first writings in the 9th century A.D., and we need to wait until the 15th to read any Baltic or Albanian texts.
How much could we reasonably expect these historical records to align with the Germanic languages? The earliest records from modern-day Turkey don’t even seem to stem from Proto-Indo-European. The aforementioned genetic evidence indicates that the Anatolian tongues descend from a sister language of Proto-Indo-European. The oldest PIE writing, Vedic Sanskrit and Avestan, emerged from a much later migration out of Europe. The Greek texts don’t even share a Corded Ware ancestry. Thus, the best comparable ancient texts come from the Italic languages. That’s a good historical record, thanks to the success of Rome, but that still leaves us in the dark for the 3,000 to 4,000 years between Proto-Indo-European and the earliest Latin. Even then, we can’t attribute all of Latin to PIE, given the Romans’ contact with the non-Indo-European Etruscans. In short, the Germanic languages themselves only contain around a millennium and a half of recorded history, while the languages with the longest written history have no reason to look much like Proto-Germanic. Using the classification from above, this looks more like a “100-mile radius” type of absence of evidence.
In addition to the lack of historical records, we expect language to change over time. One estimate from The Horse, the Wheel, and Language suggests that roughly 10 to 20% of a language's core vocabulary changes every thousand years. Scholars debate what qualifies as “core vocabulary,” but we don’t need to solve that here. Let’s borrow this concept as a heuristic. If we split the difference between 10% and 20%, let’s estimate that languages lose around 15% of core vocabulary every thousand years. Given the three-thousand-year gap between PIE and Proto-Germanic, we’d expect around 45% of core words to have gone missing. Even if 30% of Germanic vocabulary doesn’t correspond to other Indo-European words, our back-of-the-envelope math indicates that we should expect that number to be larger.
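Here is that back-of-the-envelope math as a short Python sketch. The function name share_lost is mine, and the compounding variant is my own assumption rather than anything from the book: it supposes that each millennium removes a share of whatever vocabulary is still left, instead of a share of the original stock.

```python
def share_lost(rate_per_millennium, millennia, compound=True):
    """Estimate the share of core vocabulary replaced over a span of time."""
    if compound:
        # Each millennium removes a fraction of what is still left.
        return 1 - (1 - rate_per_millennium) ** millennia
    # Simple multiplication, as in the text; capped at 100%.
    return min(1.0, rate_per_millennium * millennia)


for rate in (0.10, 0.15, 0.20):
    linear = share_lost(rate, 3, compound=False)
    compounded = share_lost(rate, 3)
    print(f"{rate:.0%} per millennium: linear {linear:.0%}, compounded {compounded:.0%}")
```

With the 15% midpoint, the simple version gives 45% and the compounding version about 39%. Only the lowest rate, 10% with compounding, dips slightly below the 30% figure, at about 27%.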
With that in mind, how serious are the theories from the previous section? The harder you look, the weaker they appear. Let’s start with the hydronyms. While Venneman cites the presence of an obscure vowel as evidence of non-Indo-European origins, other linguists argue that the vowel represents an ancestor of one of PIE’s laryngeal consonants. Furthermore, Venneman’s Vasconic reconstructions don’t mesh with our current understanding of ancient Basque phonology.
Even the I1 haplogroup can’t promise much. Given the limited populations involved, genetic evidence suffers from a common problem known as the “founder effect.” This occurs when a small subset of individuals creates a new society that subsequently grows in population. When analyzing the later, larger population, genetic similarities might not tell us much about the broader context of their ancestry. Instead, they might just indicate the idiosyncrasies of that small founding group. In other words, the I1 haplogroup may not represent any meaningful culture or civilization. It could just be the ancestry of a few families that survived a difficult winter.
The Near East theory doesn’t look much better. If the Semitic traders did make their way to Northern Europe, it would be the only place they reached without leaving any archeological records. Plus, the timeline doesn’t quite match up. Venneman has them visiting Europe at around 4,000 B.C., well before our Corded Ware ancestors inhabited the region. The allegedly Semitic words generally don’t hold up to greater scrutiny. Here are a few examples:
Venneman claims that “folk” comes from a Semitic word meaning “split” or “divide,” in reference to military divisions. However, military divisions are a modern construct, not the sort of thing one would have heard about in 4,000 B.C.
Venneman suggests that “ward” and “wake” match ancient Akkadian words. Yet, the contact between Semites and Northern Europeans was suggested to have taken place thousands of years before the “w” appeared in the form of the Akkadian terms. In addition, it’s not clear why the Germanic people would need to borrow the concept of “waking up.” They probably could have figured that out on their own.
Farming words like “plow” and “furrow” have plausible Indo-European cognates from other farming-related words. If the Germanic people did borrow these terms from the Near East, they seem to have done so without borrowing any of their agricultural methods or technology.
Venneman argues that “house” relates to an Akkadian word meaning “reed hut.” Yet, we see a word like “bitu” (meaning “house”) across the rest of the Semitic languages. Why wouldn’t that word have spread to Europe instead? Orthodox Indo-Europeanists believe “house” comes from PIE “skeu,” meaning “conceal.”
Ultimately, critics and supporters of the Semitic hypothesis can argue back and forth on each proposed word. That's the nature of etymology. A week ago, a couple of friends wondered about the origin of “console” in the sense of “video game console.” The best guess starts with the verb iteration of the word, meaning to comfort someone. This was later used to describe human-shaped figures that held up shelves. The meaning then extended to the shelves and cabinets themselves before landing on the more general definition of “stuff that holds things.” Machines like the NES and Sega Genesis held games, thus receiving the tag “video game console.” Within a few hundred years, the word transformed from “thing that makes you feel good” to “thing that plays electronic games.” Venneman exploited the flexibility of words in his proposed etymologies, such as the connection between “house” and “reed-hut.” While connections like these might seem tenuous, words can change drastically over short periods.
We’ve all read etymologies (like the “console” one) that explain how a word changed over hundreds (or even thousands) of years. Although these are fun, it will probably elucidate more if I highlight a recent change. You’ve likely heard the term “third party” in its legal sense. One party sues another, making anyone other than those two parties, by definition, a third party. In 1952, a demographer introduced this nomenclature regarding the Cold War: the first world allied with the U.S., the second with the USSR, and the “third world” allied with neither. Some third-world countries included Switzerland and Austria, but most third-world countries were not like Switzerland and Austria. These were, generally, nations with low per-capita GDPs. After the Cold War ended, the meaning of “third world” gradually shifted to “low-income countries.” Since then, its meaning has evolved further. In American politics, one can hear statements like “Senator Etheridge is turning us into a third-world country!” These critics likely don’t believe that any policy would reduce the American GDP per capita to $5,000 a year. Instead, they’re using the term to imply corruption or incompetence, based on the idea that lower-income countries suffer from more corruption and incompetence. Given this modern connotation, some journalists even recommend against using the term “third world.” In a few generations, the term has moved from mundane to profane. If “third world” can transform from “you know, some countries are neutral in this conflict” to “hey that’s a slur, you can’t say that!” in 75 years, is it really that hard to imagine that Proto-Indo-European “skeu” shifted from “conceal” to “place you live in” in three thousand years?
While Venneman uses the malleability of definitions to argue for Semitic influence, the concept cuts both ways. If a Germanic word can relate to a somewhat similar Semitic word, what’s stopping us from connecting that same Germanic word to anything else from Proto-Indo-European? For example, consider the word “finger.” This is one of the mystery words from the Eupedia list. We know that humans have five fingers on each hand, and the words “finger” and “five” begin with similar sounds. Plus, the “-ng” and “-er” endings appear frequently in Germanic languages. Now, is this definitive proof that the word “finger” comes from the word “five”? Of course not. Rather, it shows that we can find more parsimonious etymologies within the Indo-European language family. “Horse” could be a mystery word, or it could come from the Indo-European “kers,” meaning “run” or “fast.” “Hand” might represent a cognate of “hunt,” and if we accept that “king” shares its ancestry with “kin,” then it’s a standard Grimm’s Law cognate of the Latin root “gen,” which we see in words like “general” or “gene.” If we include these plausible Indo-European cognates, the proportion of Mystery Words drops from 30% to the single digits.
In addition, meanings don’t always drift. Sometimes, they teleport. Consider “cool,” in the sense of “interesting” or “nice.” It comes from jazz musicians in the 1930s, with Lester Young being the most likely pioneer of the term. This opens up the possibility that some of our missing words may come from another word with an entirely different meaning.
I'll present one last potential source of mystery words: people making stuff up. If you've followed internet slang, you've probably seen the term "pog," a catchphrase of Twitch streamer Ryan Gutierrez. Though Mr. Gutierrez may consider himself hip and cool, I regret to inform him that he's over half a millennium behind the times. In the 13th and 14th centuries, a bunch of similar words entered the English lexicon with the formula of [plosive]-[short vowel]-[g]. Examples include pig, hog, log, big, and, most famously, dog. Likewise, carnival workers invented the word "scam" in the 1960s. They considered this novel collection of sounds to be uniquely befitting of their operation, and everyone since has agreed.
We also don’t need a fancy explanation for Grimm’s Law. I highlighted the German cognates in the first section for a reason. That language underwent a consonant shift similar to Grimm’s Law only a millennium or so after undergoing the original Grimm’s Law. Likewise, Armenian underwent a similar consonant shift, while Greek also saw unvoiced stops devolve into fricatives. Meanwhile, various branches of the Indo-European language family have seen k-like sounds turn into s-like ones.
A similar explanation works for strong verbs. While this system does share similarities with Semitic grammar, it shares even more similarities with Vedic Sanskrit. Scholars hypothesize that both come from an ancient ablaut system in PIE. The Germanic languages could be the only ones maintaining a system that the other branches left behind.
That opens up one more possible explanation of the Germanic mystery words. Imagine I asked various English-speaking people for the opposite of "slow." The answers may look like this:
US: fast
Canada: fast
England: fast
Australia: fast
New Zealand: fast
Scotland: snel
Without further context, we might ponder the origin of this Scottish Mystery Word. Where did it come from? Semitic merchants? Neolithic farmers? The aliens who built Stonehenge? Before crediting anything to the extraterrestrials, it might help to look at a couple more data points. Here are the synonyms for “fast” in other Germanic languages.
German: schnell
Dutch: snel
Swedish: snabb
Scottish English: snel
All Other English: fast
Suddenly, there’s no Scottish mystery word. Instead, there’s an “every other variety of English” mystery word![4] The same could apply across the branches of Indo-European. We may have a different word for “leg,” but is that because we changed or because everyone else did? One study indicates that the Germanic branch has some of the most cognates with Proto-Indo-European. It trails only Indo-Aryan and Greek, and those branches have almost two thousand more years of writing.
Given this, we can provide mundane explanations for the Germanic Mystery Words: the dearth of historical records, normal linguistic drift, conservation of words that died in other branches, and novel coinages. None of this refutes any of Theo Venneman's fun theories. The mystery words could result from a combination of contact with non-Indo-European speakers and boring stuff. I'm merely pointing out that these enthralling theories aren't necessary. We don't need to craft any imaginative theories for the Germanic Mystery Words.
… but if we want to, I have one idea.
Onomatopoeia Theory
Consider these mystery words:
Lamb
Ram
Bull
Sheep
Neck
Back
Yes, I cheated a little bit by listing the modern English version of the words. The Proto-Germanic ones contained noun endings like -az and different vowel sounds. Still, they’re close enough for me to make my point: each of these words seems like it could derive from onomatopoeia. Although we transcribe a lamb’s speech as “baaaaa,” “laaaaamb” sounds equally plausible. “Ram” and “bull” remind me of comic book words like “bam” and “boom,” both befitting of powerful beasts. “Sheep” reminds me of “chip” and may reflect the sound of cutting fabric. Finally, both “neck” and “back” sound like “crack,” the onomatopoeia most associated with these body parts. I acknowledge, based on the preceding paragraphs, that the Onomatopoeia Theory remains unnecessary. Regardless, I considered it plausible enough that I didn’t want to leave my readers without it.
[1] In English, “th” actually represents two sounds. Compare “this” and “thing.” I’m speaking of the latter here.
[2] English “ch” is just “t” + “sh.”
[3] It’s actually more of a “t” or “d” sound, but that’s not important for right now.
[4] For the curious, “fast” originally meant “firm,” as in “fasten your seat belt.” The meaning drifted either from having the connotation of “vigorous” or from a fast runner sticking “firm” to what he’s chasing.
Photo credit: https://www.vecteezy.com/photo/2886087-flock-of-sheep-in-portugal