February 28, 2014

SW African Bantu matrilineages

Prolific researcher Chiara Barbieri has put online another interesting study on African genetics, this time about the Bantu populations of Southwestern and Central-Southern Africa (i.e. Namibia, Angola, Botswana and Zambia).

Chiara Barbieri et al., Migration and interaction in a contact zone: mtDNA variation among Bantu-speakers in southern Africa. bioRXiv 2014. Freely accessible (pre-pub) → LINK


Bantu speech communities expanded over large parts of sub-Saharan Africa within the last 4000-5000 years, reaching different parts of southern Africa 1200-2000 years ago. The Bantu languages subdivide in several major branches, with languages belonging to the Eastern and Western Bantu branches spreading over large parts of Central, Eastern, and Southern Africa. There is still debate whether this linguistic divide is correlated with a genetic distinction between Eastern and Western Bantu speakers. During their expansion, Bantu speakers would have come into contact with diverse local populations, such as the Khoisan hunter-gatherers and pastoralists of southern Africa, with whom they may have intermarried. In this study, we analyze complete mtDNA genome sequences from over 900 Bantu-speaking individuals from Angola, Zambia, Namibia and Botswana to investigate the demographic processes at play during the last stages of the Bantu expansion. Our results show that most of these Bantu-speaking populations are genetically very homogenous, with no genetic division between speakers of Eastern and Western Bantu languages. Most of the mtDNA diversity in our dataset is due to different degrees of admixture with autochthonous populations. Only the pastoralist Himba and Herero stand out due to high frequencies of particular L3f and L3d lineages; the latter are also found in the neighboring Damara, who speak a Khoisan language and were foragers and small-stock herders. In contrast, the close cultural and linguistic relatives of the Herero and Himba, the Kuvale, are genetically similar to other Bantu-speakers. Nevertheless, as demonstrated by resampling tests, the genetic divergence of Herero, Himba, and Kuvale is compatible with a common shared ancestry with high levels of drift and differential female admixture with local pre-Bantu populations.

Figure 1: Map showing the rough geographical location of populations, 
colored by linguistic affiliation. Abbreviations of population labels are 
as specified in Table 1.

In spite of the Bantu-centric approach of the study, which also has its merits, my greatest interest is rather in the less typically Bantu lineages, which speak of admixture with several pre-Bantu populations.

In this regard, I find the following highlights:

Fig. S2 (annotated in green by Maju): CA plots based on haplogroup frequencies. Left: all the dataset, right: excluding outliers.

L3d and L3f founder effect:

The Himba and Herero, as well as the non-Bantu pastoralist Damara, form one distinctive cluster defined by high frequencies of haplogroup L3d, as well as L3f (not present among the Damara but found among the Kuvale). As discussed in the paper, the Himba and Herero may be related to the Kuvale of SW Angola, but they differ notably in their levels (or directionality) of aboriginal admixture.

As both L3d and L3f are present in West and East Africa alike, it is interesting to track the specific subhaplogroups implicated in this founder effect, something done in fig. 4. 

The main L3d sublineage is L3d3a1, whose haplotype network shows a largely Khoisan centrality (not Damara), although this node is also shared by some unspecified "other Bantu". The Southern African specificity of L3d3a was already noticed in the past (see here). So it is very possible that we are looking at an aboriginal Southern African lineage, which may have arrived with the first Khoisan Neolithic (or whatever other ancient flow), rather than a Bantu-specific founder effect.

The main L3f subhaplogroup is L3f1b4a, which seems more specifically Bantu, with a major branch concentrated among the Himba, Herero and Kuvale. This lineage is not found among the Damara, in spite of the otherwise strong affinity of this Khoisan population to the Himba and Herero. L3f1b is found in Southern Africa, Kenya and Oman (per Behar 2008), so we are probably looking at a distinctive East African element, not too likely to be genuinely Bantu but possibly just assimilated into Bantu ethnic identity.

Even if both lineages converge in the Himba and Herero, they are almost certainly different inputs, one of Damara (herder Khoisan) origin and the other of Bantuized East African origin maybe.

L1b founder effect:

L1b is essentially a West African lineage concentrated in the Sahel area from Chad westwards (although L1b1a2 is from the Nile basin). A population with particularly high frequencies are the Fulani pastoralists, originally from the westernmost African plateaus, who ruled many kingdoms in West Africa between the collapse of Moroccan colonial rule and the consolidation of the European conquest of the continent.

As this study does not dwell on sublineages, we cannot determine the most likely specific origins of L1b among the several Southern African populations that carry it, specifically the pooled NE Zambians (13%) and the Fwe and Shanjo of SW Zambia (24-27%).

In any case it is a notable founder effect, almost absent in other Bantus of the area (0-10%).

Typical L0d Khoisan admixture:

This element is concentrated in Botswana (~25%) and with highest frequencies in the SW Kgalagadi (53%). It is also important among the Kuvale of SW Angola (21%). Other Bantu populations in this dataset have frequencies under 10%, some even zero. The Damara have 13%.

We know from previous studies that it is also found at high frequencies among the Xhosa of South Africa (L0d3).

While L3h appears marked in the graph, the lineage is in fact absent in all populations except at very low frequency among the Kuvale (2%), so it does not seem actually of any relevance. 

Less typical L0k around SW Zambia:

While L0k is generally considered an aboriginal Southern African lineage, it has a much more northerly distribution than the more common and surely older L0d. Its area of highest frequency seems to be SW Zambia (see here and here).

This study confirms this distribution:

Supplementary Figure S3[A]: Haplogroup frequencies of important haplogroups in the populations studied here. A: Haplogroups L0d and L0k.(...)
The size of the circles is proportional to the sample size.

High frequencies of L1c (Pygmy admixture marker) among Southern African Bantus:

An interesting element is the commonality of L1c, typical of Western Pygmies and some other populations from Gabon (possibly representative of the wider West-Central Africa jungle region, not too well studied otherwise), among almost all Bantu populations in this dataset. 

The exceptions are the Herero, Himba, Kgalagadi and Tswana (0%), as well as the NE Zambians (4%). All the rest have frequencies between 12% and 30%. Even the non-Bantu Damaras have 11% of it.

In my understanding this almost certainly implies a notable level of admixture with Western Pygmies among the Bantus, especially those from Angola and West Zambia, a phenomenon that may be widespread in Central-West Africa.

It is notable, however, that many of the populations with the highest likely Khoisan admixture (in its various forms, discussed in the previous sections) have the lowest frequencies of L1c (Pygmy admixture). So to a great extent these two aboriginal influences in Bantu mtDNA seem mutually exclusive and were probably produced after settlement rather than "on the march".

This in turn raises some interesting questions about the ethnic geography of Africa before the Bantu expansion.

Update: I just noticed that Ethiohelix has parsed the haplogroup frequencies into a very helpful chart → LINK.

Grotte Chauvet's Aurignacian dates strongly questioned

The famous rock art of the Cave of Lions (Grotte Chauvet, Ardèche) now seems not to be as early as was claimed by Valladas et al. in 2001, but rather to date from the Gravettian and Solutrean periods, with more solid dates between 26,000 and 18,000 BP.

Jean Combier & Guy Jouve, New investigations into the cultural and stylistic identity of the Chauvet cave and its radiocarbon dating. L'Anthropologie 2014. Pay per view → LINK [doi:10.1016/j.anthro.2013.12.001]


The discovery of Chauvet cave, at Vallon-Pont-d’Arc (Ardèche), in 1994, was an important event for our knowledge of palaeolithic parietal art as a whole. Its painted and engraved figures, thanks to their number (425 graphic units), and their excellent state of preservation, provide a documentary thesaurus comparable to that of the greatest sites known, and far beyond what had already been found in the group of Rhône valley caves (Ardèche and Gard). But its study – when one places it in its natural regional, cultural and thematic framework – makes it impossible to see it as an isolated entity of astonishing precocity. This needs to be reconsidered, and the affinities that our research has brought to light are clearly incompatible with the very early age which has been attributed to it. And if one extends this examination to the whole of the Franco-Cantabrian domain, the conclusion is inescapable: although Chauvet cave displays some unique characteristics (like every decorated cave), it belongs to an evolved phase of parietal art that is far removed from the motifs of its origins (known from art on blocks and on shelter walls dated by stratigraphy to the Aurignacian, in France and Cantabrian Spain). The majority of its works are therefore to be placed, quite normally, within the framework of the well-defined artistic creations of the Gravettian and Solutrean. Moreover, this phase of the Middle Upper Palaeolithic (26,000–18,000) coincides with a particularly intensive and diversified local human occupation, unknown in earlier periods and far less dense afterwards in the Magdalenian. A detailed critique of the treatment of the samples subjected to AMS radiocarbon dating makes it impossible to retain the very early age (36,000 cal BP) attributed by some authors to the painted and engraved figures of Chauvet cave.

February 24, 2014

SW Iberian plaques from the Chalcolithic

A new study gives us the opportunity to learn about the mysterious SW Iberian plaques from the Chalcolithic period.

Daniel García Rivero & Daniel J. O'Brien, Phylogenetic Analysis Shows That Neolithic Slate Plaques from the Southwestern Iberian Peninsula Are Not Genealogical Recording Systems. PLoS ONE 2014. Open access LINK [doi:10.1371/journal.pone.0088296]


Prehistoric material culture proposed to be symbolic in nature has been the object of considerable archaeological work from diverse theoretical perspectives, yet rarely are methodological tools used to test the interpretations. The lack of testing is often justified by invoking the opinion that the slippery nature of past human symbolism cannot easily be tackled by the scientific method. One such case, from the southwestern Iberian Peninsula, involves engraved stone plaques from megalithic funerary monuments dating ca. 3,500–2,750 B.C. (calibrated age). One widely accepted proposal is that the plaques are ancient mnemonic devices that record genealogies. The analysis reported here demonstrates that this is not the case, even when the most supportive data and techniques are used. Rather, we suspect there was a common ideological background to the use of plaques that overlay the southwestern Iberian Peninsula, with little or no geographic patterning. This would entail a cultural system in which plaque design was based on a fundamental core idea, with a number of mutable and variable elements surrounding it.

Figure 1. Engraved plaques from the Iberian Peninsula.
a, Valencina de la Concepción, Sevilla, Spain (Museo Arqueológico de Sevilla [MAS]); b, S. Geraldo, Montemor-o-Novo, Évora, Portugal (Museo Nacional de Arqueologia de Portugal [MNAP]); c, Monsaraz, Reguengos de Monsaraz, Évora (MNAP); d, Mora, Évora (MNAP); e, Jabugo, Aracena, Huelva, Spain (MAS); f, Ciborro, Monte-o-Novo, Évora (MNAP); g, Marvão, Portalegre, Portugal (MNAP); h, Estremoz, Évora (MNAP); and I, Pavia, Mora, Évora (MNAP).

Rather than dwelling on the central discussion of the study, which is to empirically discard the genealogical hypothesis (for which it is surely best to read the paper as such), my main interest is to share this seldom discussed Chalcolithic phenomenon, which is limited to SW Iberia (i.e. Southern Portugal and nearby areas of Spain). This study gives us the opportunity not just of getting to know it but also of contemplating its unity and diversity across a large number of specimens.

Fig. 2 - General design of the plaques.

The dates of the "plaque idols", as they are often known in the literature, range from c. 2650 to c. 2100 BCE [see note below], corresponding to the development of the first Iberian (and West European) civilizations (fortified towns) in the area, which began c. 2600 BCE, with two main centers around modern Lisbon (Zambujal) and Almería (Los Millares), although other such towns existed as well, especially in Southern Portugal. All that happened in the context of dolmenic Megalithism, with the introduction of new burial designs such as the tholos (beehive tomb) or the artificial cave, innovations that may have been restricted to some elites.

Important note (update Feb 25): the dates given in the previous paragraph are uncalibrated (i.e. raw BP minus 1950). The calibrated dates are considerably older: between c. 3500 and 2600 "actual years" BCE, as you can check in table 1. They still overlap with the known dates for Los Millares (c. 3200–2300 BCE) and its "Almeriense" precursor culture, but less so with Zambujal (c. 2600-1300 BCE, subject to possible revisions). My apologies for the confusion.
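To make the "raw BP minus 1950" arithmetic explicit, here is a minimal sketch in Python. The function name is my own, and the BP inputs are simply back-computed from the uncalibrated BCE figures quoted above, not taken from the paper's table 1. Note that this is only the naive conversion: proper calibration requires a calibration curve and cannot be done with a subtraction.

```python
def raw_bp_to_bce(years_bp):
    """Naive (uncalibrated) conversion: radiocarbon 'present' is 1950 CE.

    Returns a year BCE when positive. This is NOT a calibrated date.
    """
    return years_bp - 1950


# Hypothetical inputs, back-computed from the uncalibrated range in the text.
for bp in (4600, 4050):
    print(f"{bp} BP -> c. {raw_bp_to_bce(bp)} BCE (uncalibrated)")
```

The calibrated range of c. 3500-2600 BCE mentioned in the note cannot be reproduced this way, which is exactly why the two sets of figures differ.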

The densest area for this kind of find, and seemingly also the most diverse, is the southern part of Évora district (Central Alentejo, near the Guadiana River, known as River Ana in Antiquity), a mostly flat country with some low hills (the highest peak in the district reaches 600 m) and scattered natural woodland of cork and holm oaks. It was once known as Portugal's "bread basket" and was surely of relevance in the Neolithic and Chalcolithic periods, especially in relation to the development of the influential burial style of dolmens or cairns (known as mamoas in Portuguese), later partly replaced by tholoi.

Typical Alentejo landscape (CC by Alvesgaspar)

The plaque phenomenon is in any case found throughout the southern half of Portugal, with limited penetration into Spanish Extremadura. Another important region was the Lisbon Peninsula, which was almost certainly a more important civilization and geopolitical center, with notable urban development in this period, and which became a major center of Bell Beaker.

Its main city, Zambujal (Torres Vedras), still barely researched, was connected to the Atlantic Ocean by a 10-14 km long marine inlet that silted up (tsunami?) at the end of its occupation (end of Bronze Age?). Hence we are talking of a major city (for the standards of the time at least) which lasted for more than a thousand years and whose influence once encompassed, at the very least, much of Southwestern Europe (and, if we accept that it was at the origins of the Bell Beaker, then all Western Europe and parts of North Africa).

Ruins of Zambujal (source)
Reconstruction of the known area of Zambujal, possibly just an acropolis (source)
Figure 3. Character states used in the analysis.

Back to the plaques, I don't feel able to say anything about them that is not in the paper (read it and browse the many figures, please), except for one thing: some of the characteristics of certain plaques compare well with other "religious" iconography from the Southern Iberian Peninsula in Chalcolithic times.

For example, plaque A in figure 1 clearly has the "oculado" (eyed) symbol, which is found in many other artistic elements of the time, is believed to represent some divinity and very likely depicts the eyes of an owl (suspected to have been an ancient divinity or divine symbol in much of Europe, and found also in India).

"Oculado" symbol in a bowl from Los Millares (CC by José-Manuel Benito Álvarez)
An "oculado" idol (CC by Luis García (Zaqarbal))
Proto-Corinthian owl (public domain, credit: Jastrow)

Other plaques with a more defined head (plaque G in fig. 1, NK2 in fig. 3) are also reminiscent of the Millarense "cruciform" idols:

(CC Museo de Almería)
Diverse types of idols from Chalcolithic Iberia (source)
So I would think that all or at least many may well represent the same kind of divinity, possibly related to the origins of several more historical deities such as Athena (Greece) or Mari (Basque Country). 

February 17, 2014

Oldest Okinawan Paleolithic evidence of human presence

A human tooth and an accompanying hoard of shells modified into tools have been found in the Sakitari-do cave (Nanjo, Okinawa). They are dated to c. 20-23,000 years ago. They seem to be the first known evidence of human presence in the East Asian archipelago.

The Sakitari-do cave is just 1.5 km away from where the Minatogawa human remains were found, which are however of a somewhat more recent date (c. 18-16 Ka ago). 

Minatogawa 1 (source)

Source: The Asahi Shimbun (via Pileta).

February 16, 2014

Ancient DNA from Clovis culture is Native American (also Tianyuan affinity mystery)

Figure 4 | [c] (...) maximum likelihood tree. 
A recent study on the ancient DNA of human remains from Anzick (Montana, USA), dated to c. 12,500 calBP, confirms close ties to modern Native Americans, definitely discarding the far-fetched and outlandishly Eurocentric "Solutrean hypothesis" for the origins of Clovis culture (which pleases me greatly, I must admit).

While this fits well with the expectations (at least mine), there is some hidden data that has surprised me quite a bit: it sits at the bottom of a non-discussed formal test graph in which modern populations are compared with both Anzick and Tianyuan (c. 40,000 BP, North China). See below.

Morten Rasmussen et al., The genome of a Late Pleistocene human from a Clovis burial site in western Montana. Nature 2014. Pay per view → LINK [doi:10.1038/nature13025]


Clovis, with its distinctive biface, blade and osseous technologies, is the oldest widespread archaeological complex defined in North America, dating from 11,100 to 10,700 14C years before present (bp) (13,000 to 12,600 calendar years bp). Nearly 50 years of archaeological research point to the Clovis complex as having developed south of the North American ice sheets from an ancestral technology. However, both the origins and the genetic legacy of the people who manufactured Clovis tools remain under debate. It is generally believed that these people ultimately derived from Asia and were directly related to contemporary Native Americans. An alternative, Solutrean, hypothesis posits that the Clovis predecessors emigrated from southwestern Europe during the Last Glacial Maximum. Here we report the genome sequence of a male infant (Anzick-1) recovered from the Anzick burial site in western Montana. The human bones date to 10,705 ± 35 14C years bp (approximately 12,707–12,556 calendar years bp) and were directly associated with Clovis tools. We sequenced the genome to an average depth of 14.4× and show that the gene flow from the Siberian Upper Palaeolithic Mal’ta population into Native American ancestors is also shared by the Anzick-1 individual and thus happened before 12,600 years bp. We also show that the Anzick-1 individual is more closely related to all indigenous American populations than to any other group. Our data are compatible with the hypothesis that Anzick-1 belonged to a population directly ancestral to many contemporary Native Americans. Finally, we find evidence of a deep divergence in Native American populations that predates the Anzick-1 individual.

Haploid DNA

The Y-DNA lineage of Anzick is Q1a2a1* (L54) to the exclusion of the common Native American subhaplogroup Q1a2a1a1 (M3). Among the modern compared sequences that of a Maya is the closest one.

The mtDNA belongs to the common Native American lineage D4h3a at its underived stage (root). 

For starters I must explain that these underived haplotypes can only be found within mtDNA and never in modern Y-DNA (a common misconception), because the Y chromosome accumulates mutations every single generation, while the much shorter mtDNA does so only occasionally. Hypothetically we could find the exact ancestor of some modern Y-DNA haplogroup in ancient remains, but that would be like finding the proverbial needle in the haystack. On the other hand, finding the underived stage in mtDNA, be it ancient or modern, does not mean that we are looking at a direct ancestor but just at a non-mutated relative of that ancestor, who can in fact be very distant.

Autosomal DNA

In this respect, the Anzick individual clearly shows the strongest affinities to Native Americans, followed at some distance by Siberian peoples, particularly those near the Bering Strait.

Figure 2 | Genetic affinity of Anzick-1. a, Anzick-1 is most closely related to Native Americans. Heat map representing estimated outgroup f3-statistics for shared genetic history between the Anzick-1 individual and each of 143 contemporary human populations outside sub-Saharan Africa. (...)
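For readers unfamiliar with the statistic behind that heat map, here is a minimal sketch in Python of the outgroup f3 idea: the average, across SNPs, of the product of allele frequency differences between an outgroup and each of two test populations. The larger the value, the more drift the two populations share after splitting from the outgroup. The function name and all allele frequencies are made up for illustration, and the sketch omits the finite-sample bias correction that real implementations apply.

```python
def outgroup_f3(outgroup, pop_a, pop_b):
    """Mean of (o - a) * (o - b) over SNPs; inputs are allele frequencies."""
    if not outgroup or not (len(outgroup) == len(pop_a) == len(pop_b)):
        raise ValueError("need equally long, non-empty frequency lists")
    products = [(o - a) * (o - b) for o, a, b in zip(outgroup, pop_a, pop_b)]
    return sum(products) / len(products)


# Hypothetical allele frequencies at four SNPs (not real data).
outgroup = [0.0, 0.0, 1.0, 0.0]
ancient = [0.8, 0.6, 0.2, 0.5]
close_pop = [0.7, 0.6, 0.3, 0.5]  # frequencies similar to the ancient sample
far_pop = [0.1, 0.6, 0.9, 0.0]    # frequencies closer to the outgroup

print(outgroup_f3(outgroup, ancient, close_pop))
print(outgroup_f3(outgroup, ancient, far_pop))
```

In this toy case the first value is the larger one, which is how a "hotter" color in the heat map should be read: more shared genetic history with Anzick-1.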

However Anzick-1 shows clearly closer affinity to the aboriginal peoples of Meso, Central and South America (collectively labeled as SA) and less so to those of Canada and the American Arctic (labeled as NA). No data was available from the USA. 

This was pondered by the authors in several competing models of Native American ancestry:

Figure 3 | Simplified schematic of genetic models. Alternative models of the population history behind the closer shared ancestry of the Anzick-1 individual to Central and Southern American (SA) populations than Northern Native American (NA) populations; see main text for further definition of populations. We find that the data are consistent with a simple tree-like model in which NA populations are historically basal to Anzick-1 and SA. We base this conclusion on two D-tests conducted on the Anzick-1 individual, NA and SA. We used Han Chinese as outgroup. a, We first tested the hypothesis that Anzick-1 is basal to both NA and SA populations using D(Han, Anzick-1; NA, SA). As in the results for each pairwise comparison between SA and NA populations (Extended Data Fig. 4), this hypothesis is rejected. b, Next, we tested D(Han, NA; Anzick-1, SA); if NA populations were a mixture of post-Anzick-1 and pre-Anzick-1 ancestry, we would expect to reject this topology. c, We found that a topology with NA populations basal to Anzick-1 and SA populations is consistent with the data. d, However, another alternative is that the Anzick-1 individual is from the time of the last common ancestral population of the Northern and Southern lineage, after which the Northern lineage received gene flow from a more basal lineage.

The model they consider most plausible is "c", in which Anzick-1 is close to the origin of the SA population, while NA diverged before him. However, model "d", in which Anzick-1 is close to the overall Native American root but NA have received further inputs from a mystery population (presumably some Siberians, related to the Na-Dené and Inuit waves), is also consistent with the data. Choosing between the two "consistent" models (or something in between) clearly requires further investigation.
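As a rough illustration of what a D-test like D(Han, Anzick-1; NA, SA) measures, here is a minimal Python sketch using the allele-frequency form of the statistic. If the first pair and the second pair form two clean clades, the numerator averages out to about zero; a significant deviation rejects that topology. The function name and the frequencies are hypothetical, sign conventions differ between software packages, and real analyses add a block jackknife to get a Z-score, which is omitted here.

```python
def d_statistic(p1, p2, p3, p4):
    """Allele-frequency D-statistic for the proposed tree ((P1, P2), (P3, P4))."""
    rows = list(zip(p1, p2, p3, p4))
    numerator = sum((a - b) * (c - d) for a, b, c, d in rows)
    denominator = sum(
        (a + b - 2 * a * b) * (c + d - 2 * c * d) for a, b, c, d in rows
    )
    if denominator == 0:
        raise ValueError("no informative SNPs")
    return numerator / denominator


# Hypothetical allele frequencies at four SNPs (not real data).
han = [0.1, 0.5, 0.9, 0.3]
anzick = [0.2, 0.4, 0.8, 0.6]
na = [0.3, 0.4, 0.7, 0.5]
sa = [0.2, 0.4, 0.8, 0.6]

print(d_statistic(han, anzick, na, sa))
```

With real data the value on its own means little; what matters is whether it differs significantly from zero across many thousands of SNPs.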

Tianyuan and East Asian origins

All the above is very much within expectations, although refreshingly clarifying. But there is something in the formal tests (extended data fig. 5) that is most unexpected (but not discussed in the paper). 

The formal f3 tests of ED-fig. 5 a to e all fall within reasonable expectations. Maybe the most notable finding is that, after all, the pre-Inuit Paleo-Eskimo people (represented by the Saqqaq remains) left some legacy in Greenland, but they also show some extra affinity with several Siberian populations (notably the Naukan, Chukchi, Koryak and Yukaghir, in this order) ahead of any other Native Americans, including Aleuts.

But the really striking stuff is in figs. f and g, where it becomes obvious that the Tianyuan remains of Northern China show no greater affinity at all to East Asians (nor to Native Americans) than to West Eurasians. Also, two East Asian populations (Tujia and Oroqen) are considerably more distant than the bulk of East Asian peoples from Tianyuan, but also from Anzick.

Extended Data Figure 5 | Outgroup f3-statistics contrasted for different combinations of populations. (...) f, g, Shared genetic history with Anzick-1 compared to shared genetic history with the 40,000-year-old Tianyuan individual from China.

This is very difficult to explain, more so as Tianyuan's mtDNA haplogroup B4'5 is part of the East Asian and Native American genetic pool, and the authors make no attempt to do so.

The previous study by Qiaomei Fu et al. (open access) placed Tianyuan's autosomal DNA near the very root of Circum-Pacific populations (East Asians, Native Americans and Australasian Aborigines) but after divergence from West Eurasians:

From Qiaomei Fu 2013

They even had doubts about the position of Papuans (the only Australasian representation) in that tree, which they suspected to be an artifact of some sort.

Since I saw that graph (h/t to an anonymous commenter at Fennoscandian Ancestry) I have been racking my brain trying to figure out a reasonable explanation, considering that the formal f3 test almost certainly carries more weight than the ML tree made with an algorithm.

My first tentative explanation would be to imagine a shared triple-branch origin for Tianyuan, East Asians and West Eurasians, maybe c. 60 Ka ago (it must have been before the colonization of West Eurasia), to the exclusion of other, maybe isolated, ancient populations, whose admixture with the ancestors of the Tujia, Oroqen and Melanesians (maybe via Austronesians?) causes the strikingly low affinity values for these.

This would be a similar mechanism to the one explaining lower Tianyuan (and generally all ancient Eurasian) affinity for Palestinians (incl. Negev Bedouins) and also the Makrani, who have some African admixture and (in the Palestinian case) also, most likely, residual inputs from the remains of the first Out-of-Africa episode in Arabia.

However, to this day we have no idea what those hypothetical ancient isolated populations of East Asia could be. In normal comparisons such as ADMIXTURE analysis the Tujia and Oroqen appear totally normal within their geographic context, but this may be an artifact of not running the analysis up to the higher K values favored by the cross-validation test, which are much more likely to discern the actual components.

The matter certainly requires further research, which may well open new avenues for understanding the genesis of Eurasian populations, particularly those from the East.

February 15, 2014

Neolithic peoples from Britain and Ireland ate a lot of dairy and nearly no fish

I just discussed again the genetic sweep that has apparently happened in Europe since the Neolithic, strongly favoring the selection of alleles that allow adults to digest lactose (the sugar present in milk and often in other dairy products). However, our knowledge of ancient European genetics is probably not sufficient (nor is that of lactose tolerance genetics) and in any case the question remains: where did those lactase persistence (LP) alleles come from if all ancient Neolithic remains test negative?

An interesting possibility is opened by another recent study, not at all genetic in nature but rather bio-archaeological:

Lucy J. E. Cramp et al., Immediate replacement of fishing with dairying by the earliest farmers of the northeast Atlantic archipelagos. Proceedings of the Royal Society B 2014. Open access → LINK [doi:10.1098/rspb.2013.2372]


The appearance of farming, from its inception in the Near East around 12 000 years ago, finally reached the northwestern extremes of Europe by the fourth millennium BC or shortly thereafter. Various models have been invoked to explain the Neolithization of northern Europe; however, resolving these different scenarios has proved problematic due to poor faunal preservation and the lack of specificity achievable for commonly applied proxies. Here, we present new multi-proxy evidence, which qualitatively and quantitatively maps subsistence change in the northeast Atlantic archipelagos from the Late Mesolithic into the Neolithic and beyond. A model involving significant retention of hunter–gatherer–fisher influences was tested against one of the dominant adoptions of farming using a novel suite of lipid biomarkers, including dihydroxy fatty acids, ω-(o-alkylphenyl)alkanoic acids and stable carbon isotope signatures of individual fatty acids preserved in cooking vessels. These new findings, together with archaeozoological and human skeletal collagen bulk stable carbon isotope proxies, unequivocally confirm rejection of marine resources by early farmers coinciding with the adoption of intensive dairy farming. This pattern of Neolithization contrasts markedly to that occurring contemporaneously in the Baltic, suggesting that geographically distinct ecological and cultural influences dictated the evolution of subsistence practices at this critical phase of European prehistory.

Not only was fish consumption pretty much abandoned in Britain and Ireland with the arrival of the Neolithic (only to be recovered under Viking influence many millennia later), but the most striking fact is that it was replaced by milk as the main source of protein.

This fact, considering that farmers studied in Central Europe and Iberia have systematically tested negative for lactase persistence, really opens an avenue for the possible origins of this nutritional adaptation because it is most unlikely that they were such notable dairy consumers without the corresponding digestive ability (even cheese may be harmful to lactose intolerant people unless it is aged, while yogurt was almost certainly not known yet in Europe). 

While the evidence comes from the Atlantic Islands, it is worth noting that their chronologically late Neolithic has its origins in the much older agricultural cultures of NW France, another blank spot in the ancient DNA map of Europe. Nowadays NW France scores high, but not particularly high, in this phenotype, while SW France and the Basques have among the highest LP scores (both phenotype and rs4988235(T) genotype) in Europe, together with the Atlantic Islands and Scandinavia.

Then again it is worth recalling that one of the first areas where the rs4988235(T) allele is found is in the southern areas of the Basque Country, with clear signs of two different populations (one lactose tolerant and the other lactose intolerant) being still in the first stages of contact and mostly unmixed.

This leads us to the issue of Atlantic Megalithism (tightly associated with the Atlantic Neolithic) and its still unsolved, but likely important, role in the formation of the modern populations of Europe.

Whatever the case, the first farmers of the islands were heavy dairy consumers, although in Britain (but not in Ireland and Man) they eventually became heavy meat eaters later on:

Figure 1.
Prevalence of marine and dairy fats in prehistoric pottery determined from lipid residues. (a–f) Scatter plots show δ13C values determined from C16:0 and C18:0 fatty acids preserved in pottery from northern Britain (red circles), the Outer Hebrides (yellow circles) and the Northern Isles of Scotland (blue circles), dating to (a) Early Neolithic, (b) Mid/Secondary expansion Neolithic, (c) Late Neolithic, (d) Bronze Age, (e) Iron Age and (f) Viking/Norse. Star symbol indicates where aquatic biomarkers were also detected. Ellipses show 1 s.d. confidence ellipses from modern reference terrestrial species from the UK [19] and aquatic species from North Atlantic waters [13]. (g–i) Maps show the frequency of dairy fats in residues from Neolithic pottery from (g) Early Neolithic, (h) the Middle Neolithic/Secondary expansion and (i) Late Neolithic. Additional data from isotopic analysis of residues from Neolithic southern Britain (n = 152) and Scotland (n = 104) are included [19,20].

The data from this study also suggest that the much-hyped high-meat "Paleolithic diet" is more of a Late Neolithic (Chalcolithic) thing, with the real hunter-gatherers of Europe being more into fish in fact.

Correction: I wrongly reported the main European lactase persistence SNP as rs13910*T, when it is in fact rs4988235(T) (already corrected in the text above). This was caused by the nomenclature used in the Sverrisdóttir paper, which refers to it as -13910*T, a name based on the position of the mutation (13,910 bp upstream of the lactase gene) rather than on its rs identifier. Thanks to Can for noticing.

New lactase persistence study rejects "calcium absorption" hypothesis

The "calcium absorption" hypothesis has been proposed as a hypothetical mechanism to explain the apparent genetic sweep of lactase persistence alleles in Europe. According to this hypothesis, the possible role of milk in improved calcium absorption would counter the poor vitamin D synthesis in Northern Europe, preventing rickets.

However this hypothesis seems very weak, as I explained recently, notably because bone formation is only one of the various roles of vitamin D, which is probably much more crucial for correct brain development in childhood. Also there is another clear adaptation that actually solves the problem very well: whiter skin, able to produce vitamin D much more efficiently at the surface of our bodies by mere exposure to sunlight, a trait that seems to have been increasingly favored after the Neolithic drop in fish consumption (fish being the only actual nutritional source of vitamin D at relevant doses).

This new paper confirms my skepticism.

Oddný Ósk Sverrisdóttir et al., Direct estimates of natural selection in Iberia indicate calcium absorption was not the only driver of lactase persistence in Europe. MBE 2014. Pay per view → LINK


Lactase persistence (LP) is a genetically determined trait whereby the enzyme lactase is expressed throughout adult life. Lactase is necessary for the digestion of lactose – the main carbohydrate in milk – and its production is down-regulated after the weaning period in most humans and all other mammals studied. Several sources of evidence indicate that LP has evolved independently, in different parts of the world over the last 10,000 years, and has been subject to strong natural selection in dairying populations. In Europeans LP is strongly associated with, and probably caused by, a single C to T mutation 13,910bp upstream of the lactase (LCT) gene (-13,910*T). Despite a considerable body of research, the reasons why LP should provide such a strong selective advantage remains poorly understood. In this study we examine one of the most widely cited hypotheses for selection on LP – that fresh milk consumption supplements the poor vitamin D and calcium status of northern Europe's early farmers (the calcium assimilation hypothesis). We do this by testing for natural selection on -13,910*T using ancient DNA data from the skeletal remains of eight late Neolithic Iberian individuals, whom we would not expect to have poor vitamin D and calcium status because of relatively high incident UVB-light levels. None of the 8 samples successfully typed in the study had the derived T-allele. In addition, we reanalyse published data from French Neolithic remains to both test for population continuity and further examine the evolution of LP in the region. Using simulations that accommodate genetic drift, natural selection, uncertainty in calibrated radiocarbon dates, and sampling error, we find that natural selection is still required to explain the observed increase in allele frequency. We conclude that the calcium assimilation hypothesis is insufficient to explain the spread of lactase persistence in Europe.

The study finds that, most likely, there is population continuity between Neolithic farmers and modern local peoples in Northern Iberia and SE France. Technically they could only fail to reject population continuity for all population parameters, but, considering that the same tests strongly reject it for Central Europe and Scandinavia, the most parsimonious conclusion is that some important population continuity does exist in SW Europe since the Neolithic. In the words of the researchers:
It thus seems likely that population turnover since or shortly after the Neolithic transition has been less severe in southwestern Europe than in central or northern Europe.

However these ancient populations were lactose intolerant (rs4988235(C)) while modern ones in Northern Iberia are massively able to digest lactose (rs4988235(T)). This supports the theory of adaptive sweep for this allele. 
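
To get a rough feeling for why eight T-free individuals matter, here is a minimal back-of-the-envelope sketch. It is not the simulation method of the paper (which also models drift, selection and dating uncertainty); it only assumes diploid individuals and independent random sampling of chromosomes, and the function name is my own invention.

```python
def prob_no_derived_allele(freq: float, n_individuals: int) -> float:
    """Probability that none of the 2*n sampled chromosomes carries an allele
    whose true population frequency is `freq`.

    Assumptions (mine, not the paper's): diploid individuals and
    independent random sampling of chromosomes.
    """
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must be between 0 and 1")
    if n_individuals < 0:
        raise ValueError("n_individuals must be non-negative")
    return (1.0 - freq) ** (2 * n_individuals)


if __name__ == "__main__":
    # Eight typed individuals, as in the Iberian Late Neolithic sample.
    for freq in (0.05, 0.10, 0.30, 0.50, 0.70):
        p = prob_no_derived_allele(freq, 8)
        print(f"true T frequency {freq:.2f} -> P(no T observed) = {p:.2e}")
```

Under these simplifying assumptions, a true T frequency of 0.5 would leave all 16 chromosomes T-free with a probability of only about 1.5 in 100,000, so frequencies like the modern ones are hard to reconcile with the ancient sample, while low frequencies are not excluded.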

They suspect that the real reason behind the lactase persistence sweep is basic nutrition (calories and proteins), because milk production may have been less affected by fluctuations in crop yields (traditionally cattle ate grass and not cereals, as happens in modern industrial production, while goats have even more varied natural food sources). In such circumstances episodic famines would have strongly favored lactose tolerant phenotypes, more so if lactose intolerant people drank milk or ate high-lactose dairy products in desperation, causing them potentially deadly diarrhea.

This is not the same but fits well with my class structure hypothesis, outlined recently. The main reason why I favor this hypothesis is that the general pattern they describe should have affected farmers since very early in the Neolithic, even when they were still living in Asia or Greece, so it is very strange that the genetic sweep only appears in or after the Chalcolithic period, when a hierarchical class society formed everywhere.

Correction: I wrongly reported the main European lactase persistence SNP as rs13910*T, when it is in fact rs4988235(T) (already corrected in the text above). This was caused by the nomenclature used in the Sverrisdóttir paper, which refers to it as -13910*T, a name based on the position of the mutation (13,910 bp upstream of the lactase gene) rather than on its rs identifier. Thanks to Can for noticing.

The oldest human footprints in Europe

By now I'm sure that the vast majority of readers of this blog, if not all, have already read the news about the Happisburgh footprints, almost one million years old, which coincide with the earliest known dates for archaic human presence in Europe based on other archaeology (H. ergaster or antecessor) and extend its range considerably further north. So I just want to post a reference, as the study is freely available online for all to read.

Nick Ashton et al., Hominin Footprints from Early Pleistocene Deposits at Happisburgh, UK. PLoS ONE 2014. Open access → LINK [doi:10.1371/journal.pone.0088329]


Investigations at Happisburgh, UK, have revealed the oldest known hominin footprint surface outside Africa at between ca. 1 million and 0.78 million years ago. The site has long been recognised for the preservation of sediments containing Early Pleistocene fauna and flora, but since 2005 has also yielded humanly made flint artefacts, extending the record of human occupation of northern Europe by at least 350,000 years. The sediments consist of sands, gravels and laminated silts laid down by a large river within the upper reaches of its estuary. In May 2013 extensive areas of the laminated sediments were exposed on the foreshore. On the surface of one of the laminated silt horizons a series of hollows was revealed in an area of ca. 12 m2. The surface was recorded using multi-image photogrammetry which showed that the hollows are distinctly elongated and the majority fall within the range of juvenile to adult hominin foot sizes. In many cases the arch and front/back of the foot can be identified and in one case the impression of toes can be seen. Using foot length to stature ratios, the hominins are estimated to have been between ca. 0.93 and 1.73 m in height, suggestive of a group of mixed ages. The orientation of the prints indicates movement in a southerly direction on mud-flats along the river edge. Early Pleistocene human fossils are extremely rare in Europe, with no evidence from the UK. The only known species in western Europe of a similar age is Homo antecessor, whose fossil remains have been found at Atapuerca, Spain. The foot sizes and estimated stature of the hominins from Happisburgh fall within the range derived from the fossil evidence of Homo antecessor.

Figure 8. Vertical image of Area A at Happisburgh.
a. Model of footprint surface generated from photogrammetric survey showing the 12 prints used in the metrical analyses of footprint size; b. Plot of length and width measurements of 12 prints showing possible individuals. Means and standard deviations for foot length and age for modern populations are also shown.
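
The foot length to stature conversion mentioned in the abstract is simple arithmetic. The sketch below assumes a foot length of about 15% of stature, a commonly cited rule of thumb; the exact ratio used by the authors is not quoted in this entry, so treat the constant and the function names as my own assumptions.

```python
# Assumed ratio of foot length to standing height (a rule of thumb,
# not a value quoted from the paper in this entry).
FOOT_TO_STATURE_RATIO = 0.15


def stature_from_foot_length(foot_length_m: float,
                             ratio: float = FOOT_TO_STATURE_RATIO) -> float:
    """Estimate stature (m) from foot length (m) using a fixed ratio."""
    if foot_length_m <= 0 or ratio <= 0:
        raise ValueError("foot length and ratio must be positive")
    return foot_length_m / ratio


def foot_length_from_stature(stature_m: float,
                             ratio: float = FOOT_TO_STATURE_RATIO) -> float:
    """Inverse of the above: foot length (m) implied by a given stature (m)."""
    if stature_m <= 0 or ratio <= 0:
        raise ValueError("stature and ratio must be positive")
    return stature_m * ratio


if __name__ == "__main__":
    # Stature range reported in the abstract: ca. 0.93 to 1.73 m.
    for stature in (0.93, 1.73):
        foot = foot_length_from_stature(stature)
        print(f"stature {stature:.2f} m -> implied foot length {foot * 100:.1f} cm")
```

With that assumed ratio, the reported stature range of 0.93 to 1.73 m would correspond to foot lengths of roughly 14 to 26 cm, which is the kind of spread one would expect from a group including children and adults.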

February 10, 2014

Neolithic and Chalcolithic demographics of Western and Northern Europe

Somehow I missed this important study on the Neolithic and Chalcolithic demographics of Europe, as inferred from the archaeological record (h/t Davidski):

Stephen Shennan et al., Regional population collapse followed initial agriculture booms in mid-Holocene Europe. Nature Communications 2013. Open access → LINK [doi:10.1038/ncomms3486]


Following its initial arrival in SE Europe 8,500 years ago agriculture spread throughout the continent, changing food production and consumption patterns and increasing population densities. Here we show that, in contrast to the steady population growth usually assumed, the introduction of agriculture into Europe was followed by a boom-and-bust pattern in the density of regional populations. We demonstrate that summed calibrated radiocarbon date distributions and simulation can be used to test the significance of these demographic booms and busts in the context of uncertainty in the radiocarbon date calibration curve and archaeological sampling. We report these results for Central and Northwest Europe between 8,000 and 4,000 cal. BP and investigate the relationship between these patterns and climate. However, we find no evidence to support a relationship. Our results thus suggest that the demographic patterns may have arisen from endogenous causes, although this remains speculative.
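
For readers unfamiliar with summed date distributions, the core idea can be sketched in a few lines. This is a deliberately simplified illustration and not the authors' procedure: each date is treated as a plain Gaussian in calendar years BP (no calibration curve, no significance testing by simulation), and the dates themselves are invented for the example.

```python
import math


def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def summed_distribution(dates, grid):
    """Sum the per-date densities on a grid of years and divide by the
    number of dates, so the curve integrates to roughly 1.

    `dates` is a list of (mean_bp, sd) pairs. Real studies sum *calibrated*
    distributions; the Gaussian shortcut here is an assumption of this sketch.
    """
    if not dates:
        raise ValueError("at least one date is required")
    n = len(dates)
    return [sum(normal_pdf(year, mean, sd) for mean, sd in dates) / n
            for year in grid]


# Hypothetical dates (years BP, 1-sigma error): three cluster near 5900-6000 BP.
dates = [(6000, 50), (5950, 60), (5900, 40), (5000, 80)]
grid = list(range(4000, 8001, 10))
curve = summed_distribution(dates, grid)
peak_year = grid[curve.index(max(curve))]

if __name__ == "__main__":
    print(f"densest period in this toy dataset: around {peak_year} BP")
```

The height of the summed curve is then read as a proxy for population density: more dated material in a period is taken to mean more people. The paper's contribution is precisely to test whether the bumps and troughs of such curves exceed what calibration and sampling noise would produce by chance.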

The most interesting aspect is maybe that the (apparent) demographic changes are detailed for many regions of Europe, but first let's see the general outlook for the whole area surveyed (Western and Northern Europe, Iberia excluded):

Figure 2: SCDPD-inferred population density change 10,000–4,000 cal. BP using all radiocarbon dates in the western Europe database.
Colored arrows and their annotations are mine.

I decided that it was important to mark the main cultural episodes for reference.

1st Neolithic refers to Impressed-Cardium and Linear Band Pottery cultures, which arrived almost simultaneously in Germany and France (of the surveyed areas), although the Rhône-Languedoc Neolithic is a few centuries earlier than the arrow, which has been standardized to 7500 BP.

Atlantic Neolithic refers to the quite belated arrival of the Neolithic in Britain, Ireland and Northern Europe (standardized at 6000 BP). This process was quickly followed by, and tightly associated with, the widespread cultural phenomenon of Dolmenic Megalithism. It is most interesting that the main deviation from the pattern of regular growth concentrates in this period and is clearly positive.

Corded Ware culture (Indoeuropean consolidation in Central and Northern Europe) affected only Germany and Denmark-Scania within the surveyed regions. It was followed by a more widespread subcultural phenomenon known as Bell Beaker, which in almost all cases manifests within pre-existing, locally rooted cultures. Neither seems to be correlated with demographic expansions in the general overview.

Now let's take a look at the regional graphs:

Figure 3: SCDPD-inferred population density change 8,000–4,000 cal. BP for each sub-region.
Colored arrows, except the blue ones (which mark the local first Neolithic), are mine and mark general pan-European initial chronologies (not local!) for Megalithism, Corded Ware and Bell Beaker in those regions where they had some clear influence.

Here we can appreciate that:

Atlantic Neolithic and its associated Megalithic phenomenon are clearly related to notable demographic expansions in Ireland, Scotland, South England, Denmark and Scania. Megalithic influence may also be associated with some more irregular growth in South and Central Germany, but not in France or West Germany. A contemporary weak and irregular growth in North Germany (Brandenburg, Mecklenburg and Schleswig-Holstein) may be correlated with Funnelbeaker (with roots in Denmark) and the first Kurgan development of Baalberge and successor cultures (with roots in Eastern Europe), which would eventually evolve into Corded Ware.

Corded Ware only seems related to clear demographic growth in Jutland (and less resolutely in Scania). Bell Beaker is only linked with clear demographic growth in Ireland (and much more weakly in South England and Central Germany), while elsewhere it is rather associated with decline.

For the exact extension of the various regions as defined for this study, see fig. 1 (map).

As a provisional conclusion, it seems obvious to my eyes that the most important demographic growth processes were those of the various Neolithic cultures, but that the Atlantic Neolithic (and associated Megalithism) was particularly dynamic. In contrast Indoeuropean-associated cultural phenomena had a much weaker impact, with some localized exceptions, and are generally associated with local demographic decline instead, at least judging from the archaeological record.

Belated update: a follow-up study was published in 2014 studying other regions of North, West and Central Europe → Bell Beaker Blogger discussed it.

February 9, 2014

Italian haploid genetics (second round)

More than a year ago I commented (as much as I could) on the study of Italian haploid genetics by Francesca Brisighelli et al. Sadly the study was published with several major errors in the figures, making it impossible to get anything straight. 

I know directly from the lead author that the team has been trying since then to get the paper corrected, but this correction was time and again delayed by the apparent inefficiency of PLoS ONE's management, much to their frustration. Finally this week the correction has been published and the figures corrected.

So let's give this study another chance:

Francesca Brisighelli et al., Uniparental Markers of Contemporary Italian Population Reveals Details on Its Pre-Roman Heritage. PLoS ONE 2012 (formally corrected in February 2014). Open access → LINK [doi:10.1371/journal.pone.0050794]
Please notice that you have to read the formal correction in order to access the new figures: the wrong ones are still in the paper as such.

The corrected figures are central to the study:

Figure 1 (corrected). Map showing the location of the samples analyzed in the present study and those collected from the literature (see Table 1).
Pie charts on the left display the distribution of mtDNA haplogroup frequencies, and those on the right the Y-chromosome haplogroup frequencies.

So now we know that the Northern mtDNA pie was duplicated in the original graph and that Central Italians are outstanding in R0(xH,V), which reaches 14% (probably mostly HV*), while they have some other peculiarities relative to their neighbors from North and South: somewhat less U and no detected V.

Other variations are more clinal: H decreases from North to South while J and T do the opposite.

Figure 3 (corrected). Phylogeny of Y-chromosome SNPs and haplogroup frequencies in different Italian populations.

On the Y-DNA side, the most obvious transition is between the high frequencies of R1b1a2-M269 (R1b3 in the paper) in the North versus much lower frequencies in the South. But also:
  • J2 is notable in the Central region (and also the South) but rare in the North.
  • G frequencies in the South are double those of Center and North.
  • The same happens with lesser intensity regarding E1b1b1-M35 (E3b in the study).
  • In contrast haplogroup I is most common in the North. However the Sardinian and sub-Pyrenean clade I2a1a-M26 (I1b2 in the paper), which is also the one documented in Chalcolithic Languedoc, is rare in all regions.

The study also deals with several isolated populations:

Figure 4. Haplogroup frequencies of Ladins, Grecani Salentini and Lucera compared to the rest of the Italian populations analyzed in the present study.

All of them show large frequencies of mtDNA H relative to their regions. The Grecani Salentini do have some extra Y-DNA E1b1b1 (E3b) and J2, which may indeed underline their partial Greek origins. The Ladini show unusually high frequencies of R1b*(xR1b1a2) and K*(xR1a,R1b,L,T,N3), while the Lucerans are outstanding in their percentage of G.

I want to end this entry with a much needed scolding to the staff of PLoS ONE for their totally unacceptable original sloppiness and delay in the correction. And my personal thanks and appreciation to Francesca Brisighelli for her indefatigable persistence and enthusiasm for her work, which is no doubt of great interest.

February 7, 2014

Mitochondrial lineages from Myanmar

Myanmar, also known as Burma, has been one of those blind spots in the mapping of human genetics. Finally now we get to know something about the peoples of this SE Asian multiethnic state, although there are limitations because the sampling was performed among refugees in Thailand.

Monica Summerer et al., Large-scale mitochondrial DNA analysis in Southeast Asia reveals evolutionary effects of cultural isolation in the multi-ethnic population of Myanmar. BMC Evolutionary Biology 2014. Open access → LINK [doi:10.1186/1471-2148-14-17]



Myanmar is the largest country in mainland Southeast Asia with a population of 55 million people subdivided into more than 100 ethnic groups. Ruled by changing kingdoms and dynasties and lying on the trade route between India and China, Myanmar was influenced by numerous cultures. Since its independence from British occupation, tensions between the ruling Bamar and ethnic minorities increased.


Our aim was to search for genetic footprints of Myanmar’s geographic, historic and sociocultural characteristics and to contribute to the picture of human colonization by describing and dating of new mitochondrial DNA (mtDNA) haplogroups. Therefore, we sequenced the mtDNA control region of 327 unrelated donors and the complete mitochondrial genome of 44 selected individuals according to highest quality standards.


Phylogenetic analyses of the entire mtDNA genomes uncovered eight new haplogroups and three unclassified basal M-lineages. The multi-ethnic population and the complex history of Myanmar were reflected in its mtDNA heterogeneity. Population genetic analyses of Burmese control region sequences combined with population data from neighboring countries revealed that the Myanmar haplogroup distribution showed a typical Southeast Asian pattern, but also Northeast Asian and Indian influences. The population structure of the extraordinarily diverse Bamar differed from that of the Karen people who displayed signs of genetic isolation. Migration analyses indicated a considerable genetic exchange with an overall positive migration balance from Myanmar to neighboring countries. Age estimates of the newly described haplogroups point to the existence of evolutionary windows where climatic and cultural changes gave rise to mitochondrial haplogroup diversification in Asia.

The main sampled ethnic group is the Karen, who live at the border with Thailand, but the Bamar or Burmans, the largest ethnic group, were also sampled in large numbers.

Fig. 2.- Origin of samples and mitochondrial haplogroup distribution of Southeast Asian populations. Although most of the study participants originated from Karen State (red), a broad sample spectrum from nearly all divisions and states of Myanmar (a) was included in this study. b shows the haplogroup distributions of populations from Myanmar and four other Southeast Asian regions. In the white insert box the haplogroup heterogeneity of two ethnic groups of Myanmar is illustrated. The hatched area in the map surrounding the border between Myanmar and Thailand shows the main population area of the Karen people. The Bamar represent the largest ethnic group (68%) in Myanmar. The size of the pie diagrams corresponds to sample size.

The smaller samples are only detailed in the supplementary data from what I have seen, so I will not discuss them right now (maybe in an update?).

Overall all SE Asians, including the Southern Han from Hong-Kong, appear similar in broad terms. Except for Laos, this relative similarity is quite apparent in figure 3:

Fig. 3.- Multi-dimensional scaling plot of pairwise Fst-values and haplogroup distribution of populations from Myanmar and 12 other Asian regions. A distinct geographic pattern appeared in the multi-dimensional scaling plot (Stress = 0.086; R2 = 0.970) of pairwise Fst-values: The Myanmar sample fitted very well within the Southeast Asian cluster, the Central Asian populations formed a second cluster, the Korean sample represented East Asia, the Afghanistan population was representative for South Asia and Russia symbolized Western Eurasia. The main haplogroup distributions are displayed as pie charts. The size of the pie diagrams corresponds to sample size. The proportion of N-lineages (without A,B and R9’F) increases from very low percentages in Southeast and East Asia over 50% in Central Asia to more than 75% in Afghanistan and 100% in the sample of Russian origin. The proportion of the American founding haplogroups A,B,C and D displayed an interesting pattern: from inexistent in Russians it increased to more than 50% in East Asian Korea.

Looking at the particular differences in haplogroup frequencies, I'd say that the Thai are quite unremarkable, while the other populations show some peculiarities:
  • Karen: higher frequencies of R9/F, A, C and G
  • Bamar: much higher M* (and extremely diverse)
  • Laotian: higher frequencies of B and M7
  • Vietnamese: more B and N*
  • South Han (Hong-Kong): more D

The high diversity of paragroup M* among the Bamar is very notable. The authors notice that not more than three individuals shared each different subhaplogroup, which points to a very high diversity within haplogroup M. I don't have time right now to ponder the various lineages, some of which are newly described, but I probably will in the future, because, together with the high diversity in NE India, they have the potential of shifting the paradigm of Asian colonization by H. sapiens a bit towards the East.

The various M* and other novel haplogroups described in Myanmar are shown in fig. 4. Haplogroups M90 and M91 are new basal M sublineages, along with three other unnamed private lineages, which also appear as basal. Also M20a, M49a and G2b1a are new sublineages further downstream. Within N/R, another newly described lineage is B6a1.

The Bamar are extremely diverse not just within M*:
... the haplogroup composition of Bamar was exceptionally diverse with 80 different haplogroups and a maximum of 6 samples in the same haplogroup (Figure 4).

On the other hand, the Karen show the signs of genetic isolation instead, with large concentrations in the same haplogroups.
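
The contrast between a very diverse population and an isolated one is usually quantified with a diversity index. As a minimal sketch (the authors may well have used different software and statistics), here is Nei's unbiased gene diversity computed from a list of haplogroup assignments. The two toy samples are invented for illustration and are not data from the study.

```python
from collections import Counter


def haplogroup_diversity(assignments):
    """Nei's unbiased diversity: n/(n-1) * (1 - sum of squared frequencies).

    Returns 0.0 when everyone carries the same haplogroup and 1.0 when
    every individual carries a different one.
    """
    n = len(assignments)
    if n < 2:
        raise ValueError("at least two individuals are required")
    counts = Counter(assignments)
    sum_sq = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


if __name__ == "__main__":
    # Hypothetical samples: labels only resemble real haplogroup names.
    diverse = ["M90", "M91", "M20a", "M49a", "B6a1", "G2b1a", "F1a", "D4"]
    isolated = ["F1a", "F1a", "F1a", "F1a", "F1a", "C4", "C4", "G2b1a"]
    print(f"diverse sample:  {haplogroup_diversity(diverse):.3f}")
    print(f"isolated sample: {haplogroup_diversity(isolated):.3f}")
```

A population like the Bamar, with many haplogroups each carried by only a few people, would score close to 1, while one with large concentrations in a few haplogroups, as described for the Karen, would score noticeably lower.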

Interestingly, the authors think that rather than being a receiver, Myanmar was a major source of population to its neighbors:
Migration analyses of Myanmar and four Southeast Asian regions displayed a vivid exchange of genetic material between the countries and demonstrated a strong outwards migration of Myanmar to all analyzed neighboring regions (for details see Additional file 4: Table S4).

This influence is most intense towards Laos, Thailand and South China, while things are more balanced regarding Vietnam.