Ashok Guha
Jan 28, 2023, 06:22 PM | Updated 06:22 PM IST
Did Sanskrit, the modern North Indian languages and the other Indo-European languages all originate in ancient migrations from the Pontic steppes of Ukraine and Southern Russia to India and elsewhere?
Or is this theory – and the counter to it – products of racism, ignorance and fraud?
Linguistic, archaeological and textual research have yet to produce a consensus. Recently, modern genetics has been added to the tool-kit, and this, it seems, has finally resolved the Aryan problem. Here, we enquire whether genetics has indeed illuminated this vexed question.
The Crucial Issue
The Aryan migration debate is not about whether the steppe people migrated into India or not. No one doubts the occurrence of innumerable migrations into India from innumerable sources, including the steppes. The question really is about the timing of the steppe migrations and whether Sanskrit and the Rigvedic culture were part of the baggage of these migrants.
The Aryan Migration Theory (AMT), the new edition of the older Aryan Invasion Theory (AIT), postulates that the Harappans were pre-Aryan, with a culture in terminal decline by 1900 BC, well before the Aryans entered India.
The Aryans must also have appeared on the scene well before the onset of the Iron Age around 1200 BC, as attested by their Bronze Age artifacts and by the Rig-Veda, which they are supposed to have composed in India.
The Iron Age began earlier in India than elsewhere, and even if the evidence of ferrous metallurgy around 1800 BC remains confined to the Ganga Basin and the South, it is improbable that it did not reach Northern India 600 years later. Thus, the outer limits to the Aryan migration are 1900 BC and 1200 BC.
The attribution of the Rig-Veda to the Aryan immigrants imposes even stricter time-limits.
The Rig Veda is an Indian book. Its geography and ecology are entirely Indian, with rivers, mountains, flora and fauna existent only here, definitely not on the steppes or anywhere along possible routes from the steppes.
It could have been composed only after the Aryans were established in India long enough to have quite forgotten any ancestral homeland or alien environment.
Its composition further must have occupied several centuries because its earlier books differ linguistically enough from the later for this evolution to have required a very long time.
Since the Rig-Veda must have been completed by 1200 BC, the Aryans must have immigrated at least 300 years earlier. Thus, the Aryan migration must have occurred within the narrow window of time 1900-1500 BC.
The Genetic Evidence
What is the genetic evidence of steppe immigration in this period?
The ancient DNA discovered in India proper is limited to that extracted from a single female buried around 2600 BC at Rakhigarhi on the banks of the ancient Drishadvati.
This woman’s DNA resembles that of 11 roughly contemporary individuals disinterred not in India, but well outside, 8 at Shahr-i-Sokhta in Iran and 3 at Gonur in the Bactria-Margiana Archaeological Complex (BMAC).
The geneticists Reich et al speculate that these 11 were travellers from the Indus-Saraswati region and that all 12 represent the genotype of the Harappans. None of the 12 show any traces of steppe lineage; so if they were Harappans, the latter must have differed sharply from the steppe pastoralists.
After the Rakhigarhi woman, the cupboard of ancient Indian DNA is bare for almost another 1500 years. The next bit of DNA harvested is from the Swat valley of northwesternmost Pakistan, where several individuals have been exhumed with radiocarbon burial dates ranging between 1000 BC and 800 BC, a few before 1000 BC and a couple soon after 1200 BC.
These reflect the genetic profiles of the 12 supposed Harappans but also elements of a Steppe Pastoralist ancestry.
The chronology doesn’t quite fit the interval requirements (1900 BC – 1500 BC) of the AMT.
However, if the steppe people entered the Swat valley between 1900 BC and 1500 BC and interbred with Harappan natives to create a ‘ghost population’ now untraceable, and if a generation is 28 years long, then random recombination of genes within this ghost population over the intervening generations would create a replica of the disinterred Swat valley population at the time it actually lived and died.
Does this allay the discomfort generated by the paucity of ancient DNA from the relevant time and place? Not quite.
So the geneticists have marshalled an impressive array of DNA data from thousands of contemporary Indians.
They have then extrapolated backwards to compute how many generations of random recombination would yield an ancestral genetic profile such as might have resulted from a presumed encounter of steppe invaders with Harappans.
Assume appropriate generation intervals, and we can determine the actual chronology of this encounter. To nobody’s surprise, this turns out to be 1900 – 1500 BC.
This, in essence, is the genetic argument for the AIT.
Convincing? Well, worthy of a closer look.
V. Narasimhan, lead author of the definitive paper on India’s population genetics, asserts that steppe pastoralists account for 20-30 per cent of our ancestry and that such a large contribution could not possibly have come from thin trickles of migrants gradually filtering into the country, as the Aryan Migration Theory currently claims.
It would have required a massive influx, much population replacement with the original inhabitants driven South and East.
Indeed, the invaders were primarily males who mated with local women and eliminated, in one way or another, the local men.
As an anonymous writer in The Economist proclaims in triumphal tones, “The Aryans did not come from India. They conquered it.”
The Aryan Invasion Theory is back in its earlier bloodier incarnation and solid scientific evidence now confirms it.
Or does it? Here are some misgivings.
The Exclusion Test
First, the ghost population conjured up by the Harvard geneticists as ancestral to the Swat Valley skeletons is visualized as the product of a steppe invasion into an earlier resident population of Harappan relatives.
Such an interpretation would be legitimate if and only if we could rule out (‘exclude’) all alternative stories that yield the same results.
Can we?
Consider for example the following scenario: the steppe pastoralists had settled the Swat valley centuries earlier, but between 1900 BC and 1500 BC, they were invaded by refugees from the South, possibly escaping the desiccation of the Indus valley and the collapse of its economy.
The genetic consequences would be the same, but the story they tell would be the diametrical opposite: the Out-of-India Theory (OIT), rather than the AIT. Not that this would establish the OIT or anything else. Both interpretations of this exercise would fail what econometricians describe as ‘the exclusion test’.
Moreover, the Swat valley data display another strange feature. Steppe representation in the lineage of the ancient Swat valley is predominantly female-biased. This contrasts sharply with steppe ancestry in modern Indians which is overwhelmingly male-biased.
Narasimhan, Reich et al explain the latter by assuming that the migrants from the steppe into India proper were overwhelmingly male: they married local women and displaced or eliminated local men.
But if there was a migration from the steppe into the Swat valley, was it the steppe women who primarily migrated? Did they then proceed amazon-like to evict the local women and capture their men?
Surely, a more plausible scenario emerges if we apply symmetrically the logic Narasimhan, Reich et al use for their all-India data: the steppe folk were the earlier settlers in the Swat valley, the Harappan men were the immigrants of 1900-1500 BC in this region, the people who expelled the steppe men and married their women: again the OIT story, not the AIT.
A Robustness Check
What about the backward extrapolation of modern Indian DNA in order to date the collision of the early Indians with the steppe invaders?
The genetic device here is the use of the molecular clock to compute the number of generations between the past event and the present and then to assume a specific length of each generation to convert the molecular interval into calendar time.
Since well over a hundred generations are involved, the results are highly sensitive to the choice of the assumed generation interval.
Different studies of this variety the world over have assumed generation lengths that have varied from 20 to 37 years.
Most recent studies propose an interval between 25 and 30.
However, there are reasons to think that this represents an overestimate, especially in the case of India.
The molecular clock is set off by mutations that occur at a fixed rate roughly proportional to the number of births. The impact that a particular family makes on the probability of a mutation depends only on the number of children it has – and the average age at which it makes this impact is best represented by its age at the birth of the middle child.
But almost all studies use the mean age of child birth instead of the median, thus giving an undue weight to the children of our old age and biasing upward our estimates of the generation interval.
Indeed, it has been argued that the upward bias in these estimates is even larger than this implies, that in fact the earlier children should have larger weight in our calculations since they are likelier to transmit their traits to later generations.
All these are considerations for all places and times, but in India, they have been reinforced by the very high incidence of famines (due to the extreme variability of her monsoons), epidemics (due to the high density of her population), wars, massacres and intense poverty.
High death rates have been universal, not just of children, but of adults as well. Very few Indians live out their reproductive life spans. Child marriage, early birth and early death are demographic constants; the molecular clock ticks faster in India, reducing generation intervals.
What are the consequences of all this for the calculations of Narasimhan, Reich et al?
While it is not absolutely clear from their report, they seem to have a strong preference for a generation gap of 29 years. Given this generation interval, the DNA of their present day Indian samples would be consistent with a steppe invasion between 1900 BC and 1500 BC.
This would indeed be compatible with – though it would not, by itself, confirm – the AIT. However, that depends on the legitimacy of the assumption about the duration of a generation.
If the generation interval is 25 years rather than 29, backward extrapolation would suggest an invasion period between 1350 and 1000 BC; if it is 21 years, the relevant period would be 800 BC to 500 BC. Both would be far too late to be compatible with the rest of the AIT story.
Suppose, on the other hand, that the mean generation interval is 32 years (as assumed in many autosomal studies) rather than 29: then the Narasimhan-Reich data would imply an Aryan invasion between 2300 BC and 1900 BC, right smack in the middle of the mature and late Harappan periods, in flat contradiction with the AIT claim that the Aryans entered India only after the decline of the Indus Valley.
The AIT is simply not robust enough to withstand a change in any one of its assumptions.
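The sensitivity described above can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the geneticists' actual procedure: it assumes a reference "present" of 2000 CE (an assumption, not a figure from the paper), takes the 29-year interval and the 1900-1500 BC window as the baseline, holds the implied number of generations fixed, and recomputes the calendar window for other generation lengths. All function and variable names are hypothetical.

```python
# Illustrative sketch: how the implied date of admixture shifts with the
# assumed generation interval. REFERENCE_YEAR_CE is an assumption made
# for this example; it does not come from Narasimhan et al.
REFERENCE_YEAR_CE = 2000
BASELINE_GENERATION_YEARS = 29       # interval the essay attributes to the study
BASELINE_WINDOW_BC = (1900, 1500)    # window implied under that interval


def generations_since(year_bc, generation_years):
    """Number of generations between a BC date and the reference year.

    The missing year 0 is ignored; it shifts results by one year at most.
    """
    return (year_bc + REFERENCE_YEAR_CE) / generation_years


def implied_year_bc(n_generations, generation_years):
    """Convert a fixed generation count back into a BC calendar date."""
    return n_generations * generation_years - REFERENCE_YEAR_CE


# The generation counts are what the molecular clock is taken to measure,
# so they stay fixed while the assumed interval varies.
GEN_EARLY = generations_since(BASELINE_WINDOW_BC[0], BASELINE_GENERATION_YEARS)
GEN_LATE = generations_since(BASELINE_WINDOW_BC[1], BASELINE_GENERATION_YEARS)


def implied_window(generation_years):
    """Return (earliest, latest) BC dates for a given generation interval."""
    return (
        round(implied_year_bc(GEN_EARLY, generation_years)),
        round(implied_year_bc(GEN_LATE, generation_years)),
    )


for years in (21, 25, 29, 32):
    early, late = implied_window(years)
    print(f"{years}-year generations: {early} BC to {late} BC")
```

Under these assumptions the script roughly reproduces the windows quoted above: about 1360-1020 BC for 25 years, 820-530 BC for 21 years, and 2300-1860 BC for 32 years. A different reference year would move every window by the same few decades without changing the pattern.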
The Steppe Ancestry of Indians: Alternative Explanations?
But if there was no massive influx from the steppe in the relevant period, how does one account for the huge share of the steppe genome in the ancestry of modern Indians?
Narasimhan, Reich et al reject the hypothesis that this may have been contributed by later invaders. They argue that these later invaders interacted with East Asians before they entered India and so their genetic heritage would have contained a far larger East Asian component than is revealed by the modern Indians sampled by the researchers.
This is a strange argument. The recorded history of invasions into India begins in 535 BC with Cyrus the Great whose Achaemenid armies (including not only Iranians but also Parthians, Scythians, Bactrians, Sogdians, Ionians and many others, all with Steppe ancestry) controlled NW India beyond the Jhelum for the next two centuries.
They were succeeded by the Greeks, the Kushans, the Shakas and the Huns in a continuous procession of invaders with Steppe ancestry who dominated vast swathes of India, from Gandhara and Taxila in the North to Ujjain in the South and from Saurashtra in the West to Mathura and Pataliputra in the East, for many more hundreds of years. This is a matter of recorded history, not of speculation like the AIT. How could such groups not leave a substantial genetic footprint?
Theory vs. Fact: What Could Be Wrong?
When one’s conclusions fly in the face of undisputed well-documented fact, one must reexamine either one’s data or one’s assumptions.
Let us give Narasimhan et al the benefit of the doubt and suppose that their sampling of the Indian population was unbiased.
The key assumption of their analysis is that invaders who entered India after the initial invasions of 1900-1500 BC interacted with East Asians earlier and therefore harboured more East Asian ancestry than is apparent in modern Indians.
“By the Late Bronze Age, East-Siberian-Hunter-Gatherer-related admixture (a proxy for East Asian lineage) became ubiquitous as documented by our time transect from Kazakhstan, and ancient DNA data from the Iron Age and from later periods in Turan and the central Steppe including Scythians, Sarmatians, Kushans, and Huns (25, 52). Thus, these 1st millennium BCE to 1st millennium CE archaeological cultures with documented cultural and political impacts on South Asia cannot be important sources for the Steppe pastoralist-related ancestry widespread in South Asia today (since present-day South Asians have too little East Asian-related ancestry to be consistent with deriving from these groups), providing an example of how genetic data can rule out scenarios that are plausible based on the archaeological and historical evidence alone. Instead, our analysis shows that the only plausible source for the Steppe ancestry is Steppe Middle to Late Bronze Age groups, who not only fit as a source for South Asia but who we also document as having spread into Turan and mixed with BMAC-related individuals at sites in Kazakhstan in this period. Taken together, these results identify a narrow time window (first half of the second millennium BCE) when the Steppe ancestry that is widespread today in South Asia must have arrived.”
Narasimhan, Reich et al recognize the indispensability of the assumption of the East Asian lineage of all potential Steppe-related invaders who may have entered India after 1500 BC. Without it, their theory fails the exclusion test: we cannot rule out an alternative hypothesis, a hypothesis moreover that has plenty of archaeological, textual and epigraphic support.
But is this an assumption beyond dispute? Possibly the largest ethnic group in Iron Age Central Asia were the Scythians, a very heterogeneous people with lineages ranging from Northwest European and Anatolian in the West, to Iranian, Bactrian and South Asian in the South with some East Asian component in North East Kazakhstan. However, the East Asian element was far from all-pervasive in their genetic make-up.
The main East Asian ventures into the Central Asian steppe in the first millennium BC were those of the Mongolian Xiongnu, who, around 200 BC, expanded into Xinjiang and displaced the Yuechi (the ancestors of the Kushans) who in turn drove other groups Westward and Southward, possibly into India.
Obviously, this did not affect the multi-ethnic Persian armies who entered India much earlier and from the West. Nor could it have affected the Greeks. It certainly affected the Huns (who probably represented a mixture of the Xiongnu and the Yuechi) and the Kushans.
Its impact on more Westerly groups like the Scythians, who were fleeing the Yuechi, who in turn were fleeing the Xiongnu, was only at second remove. The more Western Shakas, those located close to the North western gateway to India, are unlikely to have ever seen a Xiongnu, leave alone to have mated with him and transmitted his DNA to their Indian descendants.
If East Asian lineage is negligible in modern Indians and if history establishes beyond doubt that steppe invaders (all supposedly with East Asian ancestors) conquered and dominated much of Northern India in the first millennium BC and for hundreds of years thereafter, these facts cannot be reconciled by simply asserting that the latter groups did not mix with the natives – which is what the geneticists would have us believe.
A much simpler explanation is that the invaders who actually occupied and ruled large parts of the country in this era did not harbour much East Asian ancestry. This is indeed what might have happened if the actual entrants came from the Westernmost part of an East-to-West gradient of declining East Asian ancestry.
Indeed, Unterlander et al is cited by Narasimhan, Reich et al as the major source for their claim that by the Iron Age or much earlier, East Asian genes became ubiquitous among the Scythians. But what do Unterlander et al actually say?
“It is evident from this Principal Components Analysis that ancestry of the Iron Age samples falls on a continuum between present-day west Eurasians and eastern non-Africans, which is in concordance with the mitochondrial haplogroup analyses. The eastern Scythians display nearly equal proportions of mitochondrial DNA lineages common in east and west Eurasia, whereas in the western Scythian groups, the frequency of lineages now common in east Eurasia is generally lower, even reaching zero in four samples of the initial Scythian phase of the eighth to sixth century BCE, and reaches 18–26% during later periods (sixth to second century BCE).”
In other words, the Western Scythians displayed far less East Asian ancestry than the Eastern Scythians. In fact, right down to 600 BC, fully 900 years after the supposed steppe invasions of 1900-1500 BC, there were Steppe-related groups in India’s neighbourhood with no discernible East Asian lineage at all.
If these were the groups that actually penetrated India, they could well have transmitted Steppe genes to us without importing East Asian ancestry or the Sanskrit language.
How could Reich, Narasimhan, Moorjani et al refute this alternative hypothesis? Only by producing local DNA evidence of the East Asian ancestry of the Indian Shakas (say) of that age.
Note that this evidence would have to be discovered in India. The DNA of Scythians who lived and died in Kazakhstan or Bactria would be irrelevant.
But suppose we do indeed discover that latecomers to India – those who entered the subcontinent after 1500 BC – or at least after 1200 BC – all had substantial East Asian ancestry. Suppose further that history records that these latecomers dominated very large parts of India for very long periods of time. We would then be confronted by the need to explain how all this East Asian genetic heritage evaporated during the transition to the present day.
On the other hand, if the DNA of the latecomers does not disclose significant East Asian lineage, how do we reject the alternative hypothesis that it was these latecomers who brought steppe genes into India but certainly not the Sanskrit language?
The protagonists of the AIT have worked themselves here into a double bind, an inescapable Catch-22.
Suppose however that the Steppe genome was imported into India not by the Scythians, Persians or Greeks. Suppose it was brought by the very groups that Narasimhan et al deem guilty, but only after 1000 BC – whereafter accelerated evolution due to shorter generation intervals resulted in the particular genetic profile of present day Indians.
The delayed arrival would mean that the invaders (or migrants) would bring Steppe genes, but certainly not the Sanskrit language. On the other hand, a later migration date would be consistent with the evidence of physical anthropology: as noted physical anthropologist Kenneth Kennedy sums up, “While discontinuities in physical types have certainly been found in South Asia, they are dated to the 4/5th millennium or the 1st millennium BC, respectively too early or too late to have any connection with the ‘Aryans’.”
Bactria and the AIT
A further issue arises from the rejection by Narasimhan et al of Bactria as a source of ancestry for our population. The Indo-Aryans, according to them, must have bypassed Bactria.
But Bactria has a vital role in all earlier models of the AIT as a staging post in the long march of the Aryans from the Pontic Steppe to India.
It was in Bactria that the Aryans, after parting from the 10 earlier groups of the Indo-European language family, are supposed to have paused.
It was here that they developed a unique common composite culture and close linguistic affinity – neither shared by the other 10 groups – before bifurcating (or trifurcating) into the Indo-Aryan and Iranian (and possibly Mitanni) language groups.
The disappearance of Bactria creates a huge gap in the AIT narrative.
It also requires us to find a new route for the supposed Aryan invasion, more Westerly than previously contemplated, since to the East of Bactria, the mountain barrier is quite impassable.
But if the Bronze Age invaders of the AIT could follow a Westerly route into India, there is no reason why Iron Age invaders could not do so – which in turn argues for an origin of these invaders far to the West of the Kushans and the Huns, at the Westernmost point perhaps of an East-to-West gradient of decreasing East Asian ancestry.
There is no reason, in short, why the Iron Age invaders (apart from the Kushans and the Huns) would not have very low East Asian ancestry, low enough to qualify as possible ancestors of present day Indians – which indeed is what recorded history suggests.
Does the Steppe Ancestry of the Priests Prove the Steppe Origins of Sanskrit?
Another question concerns the link between steppe ancestry and present-day rank in the caste hierarchy.
Like much older work, the recent genetic studies confirm the fact that higher castes have more steppe heritage than lower castes. However, Narasimhan et al and a few other contemporary studies make a couple of new points.
First, they claim that the specifically priestly castes (not just the upper castes) harboured more steppe lineage than the others, that these were the custodians of Sanskrit and the Vedas, and that this is strong confirmatory evidence that the steppe invaders brought Sanskrit and perhaps the Vedas as well to India.
The priestly castes, according to them, are the Brahmins and the Bhumihars.
The Bhumihars claim to be Brahmins but not to be priests. Occupationally, they are land-owners and occasional warriors and their claim to Brahminhood is hotly disputed by others.
There is even a tradition that they are actually Shudras who Sanskritized themselves, a tradition taken so seriously by the Census Bureau of India that the first two Censuses actually classed them as Shudras.
The Brahmins and the Bhumihars do have high levels of Steppe ancestry, but so do the Khatris and Ahirs of Haryana and Punjab, the Lohanas and Bhanushalis of Gujarat and several other groups. None of these are priests and none claim to be Brahmins or profess profound affection for Sanskrit or the Vedas. Yet the incidence of the R1a1 gene, the so-called signature gene of the steppe pastoralists, is far higher among them than among any Brahmin group except those from Bengal.
The Delayed Advent of the Caste System: the Mystery of the Dog at Night.
A second claim about the genetic origins of caste is that, after the Aryans entered India, many centuries of large-scale ethnic interaction and consequent genetic turmoil ensued. It is not clear how, over this millennium or more of genetic admixture, the priestly castes, whom Reich et al consider to be the guardians of Sanskrit and the Vedas, maintained their genetic and occupational identity – so that the priests of today are the descendants of the invading Aryan priests of 1900-1500 BC, as Reich et al seem to believe.
However, by the beginning of the Christian era, according to the geneticists, the upper caste elite had firmly established a caste hierarchy and imposed strict endogamy which made future intermixture well nigh impossible. Consider what this particular claim implies for the AIT.
The Aryans enter India by 1500 BC, bringing with them the Sanskrit language. By 1200 BC, they have composed the Rig Veda and lost all memory of an earlier homeland outside India and of the journey from that homeland to their new habitat. They have also established their dominion over all of Northern India.
They have displaced the earlier population (at least the male component of it), dominated it so completely that both the conquerors and the conquered forget the history of the conquest and those of the conquered who remain in the conquered territory forget their own language.
The Aryans also rename all rivers, mountains and every other geographical feature of the North Indian landscape in Sanskrit, thus obliterating every trace of pre-Aryan history, geography and language within their territory, as well as their own extra-territorial memories. All this had been achieved by 1200 BC, unless one dismisses the entire evidence of the Rig Veda.
Yet after accomplishing a conquest so uniquely complete, so unparalleled in world history, did they move to formalise their supremacy over the subject populations? No, they waited … and waited … and waited. They waited more than a millennium before establishing a rigid hierarchy of endogamous castes with themselves on top.
Why this endless wait? Why, as in the Sherlock Holmes story, did the dog not bark at night? Was it perhaps because it wasn’t there?
Isn’t it more probable that the early steppe migrants entered, not between 1900 and 1500 BC, but in small groups in the 1st millennium BC, found the Rig Veda already composed and proto-Sanskrit the lingua franca, that they were gradually assimilated into the local culture and slowly built up their numbers?
Perhaps nearer the end of the millennium, they were joined by more militant invaders who carved out conquest states and used their political-military power to enforce a caste system with themselves at the top as a means of coexisting with, and controlling, a larger local population, much as the white South Africans did in the 20th century.
Conclusion
None of this constitutes a decisive refutation of the AIT or a vindication of the OIT. But it does show that the geneticists have not proved anything definitive about the debate either. They have done great work in acquiring DNA evidence, but they need to produce an interpretation of this evidence that excludes alternative explanations, is independent of unproven assumptions and is internally fully consistent. As of the moment, they have not done so.
Currently, their entire evidence is limited to the single Rakhigarhi specimen, and that too of a woman, who of course could not in any case have possessed or displayed the R1a1 gene, which is carried on the Y chromosome. Everything else is speculative and dubious back-calculation. Possibly, as more ancient DNA evidence is unearthed, perhaps from the Sanauli skeletons, greater clarity will emerge. Meanwhile, the state of the debate remains as clouded as ever.
References
Hartosh Singh Bal, Interview with Geneticist Vagheesh Narasimhan, The Caravan, 22 November, 2019.
Kenneth Kennedy, 1995. “Have Aryans been identified in the prehistoric skeletal record from South Asia?”, in George Erdosy, ed.: The Indo-Aryans of Ancient South Asia, p. 49-54.
Moorjani, Thangaraj, Patterson, Reich et al, Genetic Evidence for Recent Population Mixture in India, American Journal of Human Genetics. 2013 Sep 5; 93(3): 422–438.
Narasimhan, Patterson, Moorjani, Rai, Shinde, Thangaraj, Reich et al, The Formation of Human Populations in South and Central Asia, Science. 2019 Sep 6; 365(6457): eaat7487. doi: 10.1126/science.aat7487
Unterlander, Palstra, Lazaridis, Pilipenko, Hofmanova et al, Ancestry and demography and descendants of Iron Age nomads of the Eurasian Steppe, Nature Communications. 2017; 8: 14615. Published online 2017 Mar 3. doi: 10.1038/ncomms14615
The author is Professor Emeritus in Economics at the School of International Studies, JNU. He has a PhD in economics from Harvard and has taught at Yale, UCLA Berkeley, Georgetown, Syracuse and the universities of Colorado and Melbourne.