Please Stop Publishing "Oldest Languages in the World" Articles

Inside, find the true (and deeply unsatisfying) answer to the "oldest language" question

Feb 05, 2024

Is Proto-Pama-Nyungan the Oldest Language?

Australia’s first inhabitants reached the continent around 65,000 years ago. The current Aboriginal population, however, descends from a migration that occurred around 50,000 years ago. Back then, the chillier climate lowered sea levels, leaving New Guinea and Australia as a single continent. Different groups journeyed south, and each claimed a different spot on the Australian coast. Unlike in Europe, no animal herders or farmers swept through the continent in the millennia that followed. The populations stayed nomadic and kept distinct genetic profiles.

Most of the remaining Aboriginal languages belong to the Pama-Nyungan Language… Thing. We can’t quite call it a language family. Remember, the term “language family” implies they descend from the same language. English and Bengali derive from a language spoken in the Russian steppes around 6,000 years ago, so they’re both part of the Indo-European Language Family. Some scholars believe that Pama-Nyungan languages stem from a roughly 6,000-year-old Proto-Pama-Nyungan language, while others think the speakers merely borrowed words from each other. I don’t have the answer to this question. I’m gonna be honest, guys. I read some academic papers for the Finland article, but those last sentences come from Wikipedia. There’s not much information on this front.

You probably see what I’m getting at. Do any of these Pama-Nyungan languages descend from the languages of the people who populated the continent 50,000 years ago, or even those from 65,000 years ago? Is one of these languages the oldest language in human history? Before answering the question, let’s think about how one would answer this in the first place.

How do we study the origins of modern languages? The best form of evidence is ancient writing. We know that the Romance languages came from Latin because we have written Latin. Easy. Another method is to observe a large number of modern languages with noticeable similarities. These languages need to be similar enough for us to notice cognates and grammatical commonalities, yet different enough to estimate how long ago they diverged. Although no one wrote down Proto-Germanic, it’s easy to see the similarities between modern English, German, and Swedish. For example, each language features a similar slate of “strong” verbs that alter their stems in the past tense. In an ideal scenario, we can study both ancient texts and linguistic diversity. French and Bengali don’t sound much alike, but Classical Latin and Vedic Sanskrit are a tad closer.

Where does that leave us in Australia? We can’t tell if the modern languages derive from a single ancestor. Do we have any writing? Well, the oldest writing dates back to 3500 BC. No, not the oldest Australian writing; the oldest writing in all of human history. So, uhh, off to a bad start there. Even worse, the native Australians didn’t use any writing system before British colonialism. Unless future linguists develop some groundbreaking new technique, we’ll never know if any of these Australian languages boast a 50,000-year history.

The Left Arm Problem

That’s the first issue with publishing an “oldest language” article: languages with unknowable histories cover the globe. The second issue involves making incoherent claims about different branches of the same language family.

Take this one:

3. Greek – 1450 BC (circa. 3500 years old)
Moving forward just a hair in time, Greek is probably the oldest language still spoken as a primary, day-to-day language. While Modern Greek has evolved significantly from the Greek spoken in ancient times, the language of Greece today is a definitive descendant of the language of Homer and those who came before him… way before him.
[…]
7. Farsi – 522 BC (circa. 2500 years old)
While not the earliest known language in the Indo-Iranian language family, Farsi is the longest surviving spoken language of the Iranian family of languages. It takes its roots from Old Persian, which was first attested somewhere between 522 and 486 BCE.

Both languages belong to the Indo-European Language Family. In other words, we can think of them as dialects of Proto-Indo-European. Asking whether Farsi is older than Greek is a bit like asking whether your left arm is older than your right one. They were born at the same time. What we today call “Greek” or “Farsi” is an arbitrary cutoff placed on top of the complexity of the Indo-European Language Family. Ultimately, both share the same 6,000-year lineage.

If that seems wonky, let’s think about English. Scholars usually set the cutoff for modern English at about the 15th century. The oldest English writing dates back to the 6th century, while Proto-Germanic was spoken around 2,000 years ago. So, how “old” is English? If only modern English counts, then we should consider it a 600-year-old language. If we categorize English, German, Swedish, and Icelandic as dialects of Proto-Germanic, then we’d place the age at 2,000 years. In practice, of course, no one does this. You probably haven’t seen or heard the term “Proto-Germanic” outside this blog, and no one says that Chaucer didn’t speak English. When we use the term “English,” we’re generally setting the cutoff at those first written documents in the 6th century. I’m not asking us to change that, but we should recognize it as an arbitrary distinction.

You might object to the idea of categorizing English as a dialect of Proto-Germanic by pointing out that modern Germanic languages are mutually unintelligible. That’s true, but how do you think a conversation between Aristotle and a modern Greek would go? They couldn’t understand each other either, yet we place both tongues under the “Greek language” umbrella. That’s fine, we need imperfect labels, but we should recognize that this label exists for historical, geographical, and cultural reasons, not linguistic ones. The labels are made up, the language family is real. Refer to that language spoken in 6th century BC Iran as “Persian” if you want, but it doesn’t make the one spoken in 21st century Tehran any older than the ones you hear in Amsterdam or Tirana.

Other articles rank the oldest writing. This represents either clickbait (“oldest language” does better SEO than “oldest writing”) or a genuine misunderstanding of how language works. Written language is an invention used to represent the spoken word. Thousands of modern languages lack a written form, let alone all the ones spoken in 50,000 BC. Again, think of English. The oldest English writing arrived in the 6th century. That doesn’t mean the Angles ran around yelling “bar bar bar” before then.

I can appreciate the impulse to write these articles. I’m reminded of President Truman’s desire for a one-armed economist, as all the other ones seemed to say “on one hand, but on the other…” Just for you, dear readers, I will tell you the best possible answer to the “What is the oldest language” question. However, I’m gonna need to set some ground rules. Rule one: I won’t differentiate between languages within the same family. I won’t try to argue whether Persian, Greek, Sanskrit, or Albanian is older. They’re all Indo-European, so they’re all 6,000 years old in my book. Rule two: the language family’s age range must be accepted by the relevant experts. Fringe scholars have presented various Nostratic hypotheses that argue that vast swaths of the world’s languages descend from some super-duper ancient language. Any of these ideas could be true, but we lack any way of confirming them with current methodologies. With those caveats out of the way, I can now answer the question we’ve all been waiting for. The world’s oldest language is… the Afro-Asiatic Language Family.

The Present

Source: https://www.library.ucsb.edu/oxford-handbooks-linguistics. Also the source of any historical linguistics in this section, unless otherwise noted

Most Americans associate the word “Africa” with the southern part of the continent and the word “Asia” with its eastern region. Naturally, then, the Afro-Asiatic Language Family refers to northern Africa and western Asia. It breaks down into five branches: Semitic, Berber, Cushitic, Egyptian, and Chadic.

You’re probably familiar with the Semitic branch. This includes Arabic and Hebrew, alongside various African languages (often called “Ethio-Semitic”) like Tigrinya. It also encompasses many of the Near East’s ancient languages, such as Akkadian, Eblaite, and Aramaic. It’s probably easier to specify the languages of this region which weren’t Semitic. Sumerian, the source of our oldest written texts, does not belong to any living language family. To the north, the Anatolian Hittites spoke an Indo-European language.

The Cushitic branch includes about 70 languages in Sudan and Ethiopia. Unfortunately, this branch of the Afro-Asiatic language family lacks any historical writing, and much more linguistic work is needed before we can trace its history. Similarly, the Berber branch is spoken by people in the Sahara desert, as well as a few disparate people in northwest Africa. Although they’ve left us with some written records, their history remains fuzzy.

Meanwhile, the Egyptian branch provides an immense historical record. While you’re probably familiar with hieroglyphics, you might not know so much about the language that these pictographs represented. The language’s earliest writing dates back to 3300 BC, and historians generally set the cut-off for “Ancient Egyptian” at around 1300 BC. Younger phases include Demotic (spoken around the 7th century BC) and Coptic (1st century BC). The latter remains the literary language of the Coptic Church and was still spoken in small parts of Egypt until the fifteenth century. Today, no one speaks an Egyptian language as a native tongue.

While the Egyptian branch accounts for zero modern languages, the Chadic branch constitutes the majority of Afro-Asiatic languages spoken today. Linguists didn’t recognize these languages as Afro-Asiatic until the mid-twentieth century, making it the latest addition to the family. As with all non-Semitic branches, the lack of a continuous historical record prevents us from studying its history in any depth. Scholars estimate that this branch has been separated from the others for about 5,000 years, and this separation poses both positives and negatives for curious scholars. On the positive side, more isolated versions of the language often maintain archaic grammar and pronunciation, helping us piece together the original language. On the negative, this isolation means that the Chadic languages borrowed from non-Afro-Asiatic languages. This makes it harder to differentiate which parts date back to Proto-Afro-Asiatic and which ones stem from a neighboring proto-language.

How do we know all these languages are related? The short and simple answer is that they feature tons of similarities in grammar and vocabulary, and these similarities can’t be explained by borrowing. Contrast this with the Pama-Nyungan languages. Although they share many features, it’s not clear whether these similarities stem from a common ancestry or borrowing. For the Afro-Asiatic Family, there’s no mystery. While the Ethio-Semitic and Cushitic languages took from each other, there was no great cultural horizon that included speakers of Egyptian, Berber, Cushitic, and Chadic languages. Only a common ancestry can explain this resemblance. As always, I like to make a comparison with English. Parallelisms between the Germanic and Italic languages could theoretically be explained by borrowing. Borrowing cannot, however, explain the similarities between the Germanic and Indo-Iranian languages. Cyrus the Great didn’t hang out with the Norsemen.

If you want specifics, I can quote some pieces from the encyclopedia. I’m not going to pretend I’ve performed any original research here.

Morphological and lexical correspondences nevertheless support the inclusion of Chadic in the Afro-Asiatic language family, as well as e.g. the morphosyntactic suggestion, made by Jungraithmayr in 2007, that the Chadic subjunctive marker -u or -o is related to the
[…]
The first comprehensive study was written in 1892 by Erman, who dealt with phonetics, stem formation, morphology, syntax, lexical correspondences. Comparisons of vocabulary and phonetic relations were pushed further by Aaron Ember in 1930 and by Franz von Calice in 1936, who nevertheless offered a critical evaluation of several etymologies.
[…]
The casus agens is used for the logical subject of transitive verbs and for the instrumental, while the casus patiens marks the logical object of transitive and the logical subject of intransitive verbs. Traces of ergativity appear in ancient Egyptian, in Chadic languages, in Libyco-Berber dialects

That last one reads like it was written in a Chadic language, so I’ll offer some explanation. All languages contain transitive verbs (those with an object) and intransitive ones (those without). For example, “Klaus writes articles” uses a transitive verb, while “Klaus writes” deploys an intransitive one. Ergative languages don’t differentiate between objects and intransitive subjects in their case systems.
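If a toy model helps, here is a rough sketch of the difference. It is my own illustration, not anything from the encyclopedia: the function name `assign_cases` and the two-way "alignment" switch are made up for this example, and real case systems are far messier. It simply labels the nouns in a clause the way an accusative-style language (like Latin) and an ergative-style language would.

```python
from typing import Dict, Optional


def assign_cases(alignment: str, subject: str, obj: Optional[str] = None) -> Dict[str, str]:
    """Toy model: label the nouns of one clause with a case name.

    `alignment` is either "accusative" or "ergative". Passing `obj`
    makes the clause transitive; leaving it out makes it intransitive.
    This is a hypothetical helper for illustration only.
    """
    transitive = obj is not None

    if alignment == "accusative":
        # Every subject gets the same case; only the object stands apart.
        cases = {subject: "nominative"}
        if transitive:
            cases[obj] = "accusative"
    elif alignment == "ergative":
        # The transitive subject (casus agens) stands apart; the object and
        # the intransitive subject share one case (casus patiens).
        cases = {subject: "ergative" if transitive else "absolutive"}
        if transitive:
            cases[obj] = "absolutive"
    else:
        raise ValueError(f"unknown alignment: {alignment!r}")

    return cases


if __name__ == "__main__":
    for alignment in ("accusative", "ergative"):
        print(alignment, assign_cases(alignment, "Klaus", "articles"))
        print(alignment, assign_cases(alignment, "Klaus"))
```

In the ergative setting, "Klaus" in "Klaus writes" ends up with the same label as "articles" in "Klaus writes articles," which is exactly the grouping the quoted passage describes.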

The Search for the Afro-Asiatic Homeland

Simplify readers know that the Indo-European language family descends from the Eurasian Steppes around 6,000 years ago, while the Uralic languages hail from the eastern portion of the Ural mountains about 6,000 to 10,000 years ago. The Afro-Asiatic Language family lacks a similarly clear origin story.

The traditional narrative posits that the Afro-Asiatic family began in the Near East and spread into Africa via the agriculturalist migration. This resembles an early Indo-European theory, which hypothesized that the language family spread via Anatolian farmers. Similar theories exist for many confirmed or alleged language families. The agrarian theory offers another commonality with its Indo-European counterpart: it’s wrong. Shared Afro-Asiatic root words indicate a pre-agrarian society, and the archaeological record lacks the abrupt changes that one would expect from an agricultural migration.

Modern scholarship, then, points toward a homeland that’s more Afro and less Asiatic. Some have suggested a homeland in north-east Africa. According to this theory, Red Sea-adjacent populations developed innovations in wild-grain foraging. These techniques spread to the Horn of Africa and the Near East before those regions developed agriculture. Genetic evidence tells a Horn of Africa origin story. This suggests that the Afro-Asiatic language originated with a population similar to modern Ethio-Somalis. First, some of these people migrated from the Horn of Africa into the Levant. Later migration occurred in the opposite direction, bringing Semitic languages into the Horn of Africa. Finally, others place the homeland in the Eastern Sahara, noting that the ice-age climate would have made this area ripe for foraging.

Each theory implies an advanced age for the Afro-Asiatic Language Family. The Saharan hypothesis suggests an age in the 10,000-12,000 year range, while the Red Sea hypothesis puts that figure somewhere between 9,000 and 15,000. Other hypotheses place the date as far back as 17,000 years ago. Since the language family originated among pre-agrarian people, scholars lack the quality of archaeological evidence that they find for younger language families. Advances in genetic research could resolve some of these disagreements, but it seems unlikely that we will ever reach the scholarly consensus that we find for the Indo-European language family.
