Main

The peopling of Europe was marked by human population expansions and contractions associated with major climatic events. Numerous studies indicate a dramatic population contraction in Palaeolithic Europe during the Last Glacial Maximum (LGM, ~26.6–19 cal kyr bp (refs. 1,2,3,4)). Human presence in the archaeological record is documented predominantly by artifacts, mainly stone tools assigned to so-called technocomplexes, rather than by skeletal remains, which are rare in the Palaeolithic record.

With the onset of the LGM, a population decline is observed in central Europe and human populations associated with the Gravettian technologies (33–25 cal kyr bp) retreated to southern latitudes, to regions in today’s Italy and central/southeastern Europe1. In southwestern Europe, a singular Upper Palaeolithic (UP) technocomplex, the Solutrean, emerged in regions of today’s southern France and Iberia by ~24–19 cal kyr bp (refs. 5,6,7), which coincides in time with the intensive cold peak of the Heinrich 2 Event and following the LGM8. The Solutrean is defined by a suite of new lithic technologies and implements, with regionally distinct lithic point types9,10,11 interpreted as an adaptation in response to the hard climatic conditions1 and more generally as a breakdown of the Gravettian technologies. Some scholars have explained the cultural discontinuity by migratory processes, with putative origins in North Africa on the basis of parallels with Aterian lithic assemblages12,13,14. However, the prevailing consensus sees the Solutrean lithic tradition rooted in western European Late Gravettian technologies15,16,17, which had undergone cultural drift due to isolation from other groups and the disruption of extended pan-European networks, adaptation to harsh climatic conditions1 and demographic pressure18. Further support for a local development of the Solutrean was seen in the synchronous origin of the new lithic traditions that would culminate in the Solutrean in the French and Iberian territories16,19.

On the Iberian peninsula, the archaeological record of the Solutrean documents a dense peripheral dispersion on both the Atlantic and Mediterranean sides of the peninsula7,20, with occasional occupation of the inner plateau21. Differences in Solutrean traditions between the Mediterranean/southern Portugal and Cantabrian/Pyrenean regions have been considered a consequence of territorialism that followed population contractions and limited space22,23. In contrast to the preceding Gravettian and the subsequent Magdalenian, when northern Iberia was more densely occupied, the number of sites associated with the Solutrean is roughly equal in both regions, although the sites are dispersed more widely in the south7,20, which suggests a network of interconnected groups within a limited perimeter.

The available genome-wide data from archaeological contexts older than the LGM in western Europe are scarce and do not yet allow a detailed study of the genomic transformation of the UP human groups of this part of the continent (Supplementary Table 1.1). The oldest genome-wide data published so far come from central and eastern Europe and date to ~45–40 cal kyr bp, when the Initial Upper Palaeolithic (IUP) technologies prevailed; the genotyped individuals show a wide variety of ancestry profiles and levels of Neanderthal admixture (for example, Bacho Kiro_IUP from Bulgaria24, Oase125, Muieri from Romania26 and Zlatý kůň from the Czech Republic27). Conversely, the oldest genome-wide data available from western Europe come from an Aurignacian-associated individual Goyet Q116-1 in today’s Belgium26. Gravettian-associated individuals from pre-LGM central and southern Europe form a genetic cluster (share more ancestry within the group than with individuals outside the group) (Fig. 1), which was named after the oldest individual, here Věstonice cluster26,28, irrespective of the Gravettian industries they were associated with archaeologically. However, so far, no genome-wide data have been published from western European Gravettian-associated individuals (Fig. 1). Pre-LGM Gravettian-associated groups from central Europe (Věstonice cluster) differ genetically from post-LGM Magdalenian-associated groups from both central and western Europe (which also form a genetic cluster coined Goyet Q2) (refs. 26,29), whereas the latter had received genetic ancestry first found in an Aurignacian-associated individual Goyet Q116-1 from northwestern Europe. This genomic discontinuity between central European Gravettian-associated individuals and western-central European Magdalenian-associated individuals has been explained by the population contractions during the LGM26 and is supported by mitochondrial studies, which noted the disappearance of, for example, mitochondrial DNA haplogroup M during the LGM30.

Fig. 1: Chronological and geographical overview of newly reported and relevant published individuals.

a, Geographical distribution of Pleistocene individuals with genome-wide data (>20,000 SNPs covered in the 1,240,000 SNP panel; coloured symbols, consistent with individuals and symbols on the y axis of b). b, Chronological distribution of Pleistocene individuals with genome data. The grey bar indicates the extent of the LGM (*Zlatý kůň is dated genetically to ~45 kyr bp) (ref. 27). c, Genetic overview of the western and central Europe UP and their correspondence with technocomplexes (where possible). Arrows with dashed lines show gaps in the genetic record. See Supplementary Table 1.1 for a detailed description of the individuals.

Following the Bølling/Allerød warming interstadial (14 cal kyr bp), the Goyet Q2 cluster was replaced by the Villabruna cluster in central Europe, named for its oldest Epigravettian-associated individual from northern Italy26, but which also includes most of the Epipalaeolithic- and Mesolithic-associated groups from central and western Europe, all of which are also known as western hunter-gatherers (WHG)31. In this genetic landscape, Iberian hunter-gatherers (HG) stood out as they retained higher proportions of the Goyet Q2-like ancestry during the Epipalaeolithic and Mesolithic periods and thus are often considered separate29.

Individuals from western Europe who directly date to the LGM period are essential to address the genetic discontinuity between pre-LGM and post-LGM groups described by ref. 26. To investigate the role of southern European refugia during the LGM, we generated genome-wide data from several Solutrean-associated human remains from Cueva del Malalmuerzo (Moclín, Granada, Spain) (Fig. 1). Cueva del Malalmuerzo is well known for its rock art paintings that are stylistically attributed to the Solutrean. Although Solutrean industries have been found in the cave32, so far there are no in situ stratigraphic layers directly associated with this technocomplex. The latest archaeological investigations of the cave uncovered several human remains in a small area, which corresponded to an old archaeological profile from previous excavations.

We sampled additional prehistoric human remains from various cave and rock shelter sites in Andalucia, Spain (Supplementary Information 1), with long occupation histories to establish a time transect in southern Iberia from the LGM to the Neolithic periods. After applying quality filters and radiocarbon dating, we were able to analyse one Solutrean-associated individual from Cueva del Malalmuerzo, two Early Neolithic (EN) individuals from Cueva de Ardales and Las Aguilillas and two Chalcolithic (CA) individuals from Cueva de Ardales and Los Caserones. Individual ADS007 from Cueva de Ardales did not provide enough collagen for radiocarbon dating but yielded enough genome-wide information to perform genetic analyses (Supplementary Table 1.2 and Supplementary Information 1). We present the contextualized genomic results in chronological order.

Results and discussion

To generate genome-wide data with maximum coverage, several DNA extracts and single-stranded non-uracil-DNA-glycosylase (UDG) libraries from each sample (Supplementary Tables 1.2 and 1.3) were prepared following established protocols33. Our final dataset ranged from 0.51× to 8.7× average coverage on targeted 1,240,000 single-nucleotide polymorphism (SNP) sites (Supplementary Table 1.2). After 1,240,000 SNP capture, the merged libraries underwent a second round of quality control, applying a minimum SNP cutoff for robust ancient DNA authentication and contamination estimation (Supplementary Information 2, Supplementary Figs. 1–3 and Supplementary Tables 1.4–1.9).

The genomic make-up of Solutrean hunter-gatherers of Cueva del Malalmuerzo

Two human teeth were recovered during an archaeological survey of Cueva del Malalmuerzo (MLZ): MLZ003 (arch ID, MALM16 SUP2.1; tooth 34) and MLZ005 (arch ID, MALM16 Sector A 8.2; tooth 33). Samples MLZ003 and MLZ005 were found to be contemporaneous and radiocarbon dated to a period when the Solutrean technocomplex prevailed (MLZ003, 23,016–22,625 cal yr bp; MLZ005, 22,979–22,570 cal yr bp), concordant with the stylistic rock art found in the cave34 (Supplementary Table 1.2) and thus present the oldest genomic data from UP human remains in Iberia. We found that both teeth belong to the same individual and thus merged the data (MLZ003005 or MLZ henceforth) for downstream population genetics analyses (Supplementary Information 2, Supplementary Fig. 2a,b and Supplementary Table 1.6). The final average coverage on targeted SNP sites was 0.41×, which corresponds to 226,914 autosomal SNPs in the 1,240,000 panel (Supplementary Table 1.2).

MLZ carries mitochondrial DNA-haplogroup U2'3'4'7'8'9 (Supplementary Tables 1.10 and 1.11). The oldest individual carrying the derived mtDNA-haplogroup U2 is Kostenki14 (~38 cal kyr bp, Russia)35,36, while the more basal mtDNA-haplogroup U2'3'4'7'8'9 is restricted to individuals in southwestern Europe, with Paglicci108 being the oldest Palaeolithic individual (~27.8 cal kyr bp, Italy)26, followed by Rigney (~15.5 cal kyr bp, France)26, Oriente_C (~14 cal kyr bp, Sicily)37, Grotta dell’Uzzo (~10 cal kyr bp, Sicily)38 and Balma Guilanyà (~13 cal kyr bp, Spain)29. The geographic distribution of U2'3'4'7'8'9 is consistent with an early spread of human groups into western Europe, and the haplogroup was suggested to have survived the LGM in the Iberian and Apennine refugia30.

MLZ carried Y chromosome haplogroup C1 (Supplementary Table 1.2), which was also found in individuals from Bacho Kiro IUP (~45 cal kyr bp, Bulgaria). The more basal Y-haplogroup C was found in Paglicci_133 (~33 cal kyr bp, Italy), Cioclovina_1 (~32 cal kyr bp, Romania) and Kostenki_12 (C, ~32 cal kyr bp, Russia)26 and the derived Y-haplogroup C1a2 was found in Goyet Q116-1 (~35 cal kyr bp, Belgium)26 and Sunghir (~34 cal kyr bp, Russia)39.

To characterize the genomic profile of MLZ, we estimated genetic similarities among all published Palaeolithic and Mesolithic HGs including the new data using f3-outgroup statistics of the form f3(HG1, HG2; Mbuti). In the resulting heatmap (Supplementary Information 3 and Supplementary Fig. 4), MLZ clusters with the later Magdalenian-associated individuals from the Goyet Q2 cluster and some Epipalaeolithic and Mesolithic HGs from Iberia. These results suggest a genetic ancestry that is similar, or related to, the one found to be characteristic for Magdalenian-associated individuals26 and present in an admixed form in Iberian HGs29,40.
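As a rough illustration of the outgroup-f3 statistic used here, the sketch below computes f3(HG1, HG2; Mbuti) as the mean over SNPs of the product of allele-frequency differences between the outgroup and each of the two test groups. It is a minimal Python sketch, not the pipeline used in this study: the function names (`outgroup_f3`, `pairwise_f3`) and the input layout (one array of allele frequencies per group) are hypothetical, and the finite-sample bias correction applied by standard software is omitted.

```python
import numpy as np

def outgroup_f3(freq_a, freq_b, freq_out):
    """Naive outgroup f3(A, B; O): mean over SNPs of (o - a) * (o - b).

    Assumption: inputs are per-SNP allele frequencies in [0, 1], with NaN
    marking SNPs that are not covered in a group. No bias correction.
    """
    a, b, o = (np.asarray(x, dtype=float) for x in (freq_a, freq_b, freq_out))
    keep = ~(np.isnan(a) | np.isnan(b) | np.isnan(o))  # SNPs covered in all three
    if not keep.any():
        raise ValueError("no SNPs shared by all three groups")
    return float(np.mean((o[keep] - a[keep]) * (o[keep] - b[keep])))

def pairwise_f3(freqs, outgroup):
    """Symmetric matrix of outgroup-f3 values for every pair of groups.

    `freqs` maps a group name to its allele-frequency array (hypothetical layout).
    """
    names = [name for name in freqs if name != outgroup]
    mat = np.zeros((len(names), len(names)))
    for i, x in enumerate(names):
        for j, y in enumerate(names):
            mat[i, j] = outgroup_f3(freqs[x], freqs[y], freqs[outgroup])
    return names, mat
```

Higher values indicate more genetic drift shared by the two groups since their split from the outgroup, which is the quantity displayed in the heatmap.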

Multidimensional scaling (MDS) of the transformed pairwise-distance f3-matrix (1 − f3, Fig. 2a) shows that MLZ falls outside the genetic variation of the preceding central European Gravettian-associated individuals (Věstonice cluster26). Interestingly, MLZ falls between the Aurignacian-associated Goyet Q116-1 and the Magdalenian-associated individuals from the Goyet Q2 cluster, to the exclusion of El Mirón, who falls within the Iberian HG cline that bridges WHG- and Goyet Q2-like ancestries29. We then calculated f4-statistics of the form f4(GoyetQ2 cluster, GoyetQ116-1; MLZ, Mbuti) to test whether Goyet Q116-1 and Magdalenian-associated individuals are cladal with respect to MLZ (Fig. 2b and Supplementary Table 2.1). We find that MLZ shares more genetic drift with Magdalenian-associated individuals than with Goyet Q116-1. However, when testing whether Magdalenian-associated individuals and MLZ are symmetrically related to Goyet Q116-1 using the contrasting f4(MLZ, GoyetQ2 cluster; GoyetQ116-1, Mbuti), we observe an excess of shared drift between MLZ and Goyet Q116-1, for example when HohleFels49, Goyet Q2 and El Mirón are used as proxies for Magdalenian-associated ancestry (Fig. 2c and Supplementary Table 2.2). These results suggest that MLZ represents a lineage that is genetically intermediate between Goyet Q116-1 and individuals from the Goyet Q2 cluster. In line with the chronology, Goyet Q116-1 is more genetically similar to MLZ than to the Goyet Q2 cluster, whereas MLZ is genetically more similar to the Goyet Q2 cluster than to the preceding Goyet Q116-1 (Fig. 2b,c). Identifying MLZ as a member of a lineage that contributed genetically to Magdalenian-associated individuals is consistent with the archaeological record that postulates an emergence of the Magdalenian technocomplex in regions of northern Iberia and southern France41,42.
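The MDS step described above can be sketched as follows. The text does not state which MDS variant or software was used, so this minimal Python sketch assumes classical (Torgerson) MDS; the function names are hypothetical, and setting the diagonal of the 1 − f3 matrix to zero is likewise an assumption made so that the matrix behaves like a distance matrix.

```python
import numpy as np

def f3_to_distance(f3_matrix):
    """Turn a symmetric outgroup-f3 matrix into distances using 1 - f3."""
    dist = 1.0 - np.asarray(f3_matrix, dtype=float)
    np.fill_diagonal(dist, 0.0)  # assumption: zero self-distance
    return dist

def classical_mds(dist, n_components=2):
    """Classical (Torgerson) MDS via eigendecomposition of the
    double-centred squared-distance matrix."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    centring = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centring @ (d ** 2) @ centring
    evals, evecs = np.linalg.eigh(gram)               # ascending eigenvalues
    order = np.argsort(evals)[::-1][:n_components]    # keep the largest ones
    scale = np.sqrt(np.clip(evals[order], 0.0, None)) # drop negative eigenvalues
    return evecs[:, order] * scale
```

Each row of the returned array gives the coordinates of one individual or group; the first two columns correspond to the two axes of a plot such as Fig. 2a.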

Fig. 2: Genetic affinity of the MLZ individual and genetic structure among HGs.

a, MDS plot of the pairwise f3-matrix of the form f3(HG1, HG2; Mbuti) (Supplementary Fig. 1) transformed into distances using 1 − f3. The main genetic clusters mentioned in the paper are highlighted here. b, The f4-statistics of the form f4(GoyetQ2 cluster, GoyetQ116-1; MLZ, Mbuti) show a significant affinity between Magdalenian-associated individuals and MLZ when compared to Goyet Q116-1 (Supplementary Table 2.1). c, The f4-statistics of the form f4(MLZ, GoyetQ2 cluster; GoyetQ116-1, Mbuti) show that MLZ and Magdalenian-associated individuals are not symmetrically related to Goyet Q116-1 and that MLZ shares more genetic drift with Goyet Q116-1 (Supplementary Table 2.2). d, The f4-symmetry tests of the form f4(MLZ, GoyetQ2 cluster; Věstonice16, Mbuti), where Věstonice 16 is symmetrically related to MLZ and Magdalenian-associated individuals. e, The f4-symmetry tests of the form f4(MLZ, GoyetQ116-1; Věstonice cluster, Mbuti), including other central European, Gravettian-associated individuals, who are symmetrically related to MLZ and Goyet Q116-1. Both tests (d and e) do not deviate from 0 and thus show that there is no excess of shared drift between MLZ/GoyetQ116-1 and central European, Gravettian-associated individuals (Supplementary Tables 2.3 and 2.4). For all f4-statistics, error bars indicate ±3 s.e. and were calculated using a weighted block jackknife83 across all autosomes on the 1,240,000 panel (nSNPs = 1,150,639) and a block size of 5 megabases (Mb); | Z | > 3 points with thicker outline.

We then explored whether MLZ and Aurignacian-associated Goyet Q116-1 or the later Goyet Q2 cluster were symmetrically related with respect to the Věstonice cluster using f4-statistics of the form f4(MLZ, GoyetQ2 cluster/GoyetQ116-1; Věstonice cluster, Mbuti) (Fig. 2d,e and Supplementary Tables 2.3 and 2.4) but observed no excess shared drift with Věstonice cluster individuals. This means that the genetic discontinuity between the pre-LGM Věstonice cluster and the post-LGM GoyetQ2 cluster, as reported by ref. 26, was not driven by the harsh climatic change, as the differentiation is already visible in MLZ, who directly dates to the height of the LGM. This implies that at least two genetically distinct groups must have existed in Europe when the Gravettian technocomplex prevailed: one in western Europe, represented by Goyet Q116-1, and a second in central (and perhaps eastern) Europe, described as the Věstonice cluster26. Our results match technological studies which suggest that the Solutrean was rooted in western Gravettian technologies15,16,19 and the resemblance of the rock art associated with the Gravettian and Solutrean in western Europe43. By contrast, this result renders a monocentric central European origin of the Gravettian unlikely44.
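The f4 symmetry tests above, together with the weighted block jackknife and the | Z | > 3 threshold named in the figure captions, can be sketched as follows. This is a minimal Python sketch under stated assumptions rather than the software used in this study: the function name `f4_block_jackknife` is hypothetical, inputs are per-SNP allele frequencies without missing values, and `block_ids` is assumed to be a precomputed array assigning every SNP to a genomic block (for example, 5 Mb windows).

```python
import numpy as np

def f4_block_jackknife(a, b, c, d, block_ids):
    """f4(A, B; C, D) = mean over SNPs of (a - b) * (c - d), with a
    weighted delete-one-block jackknife standard error and Z-score."""
    a, b, c, d = (np.asarray(x, dtype=float) for x in (a, b, c, d))
    block_ids = np.asarray(block_ids)
    per_snp = (a - b) * (c - d)
    n = per_snp.size
    theta = per_snp.mean()                      # estimate from all SNPs

    blocks = np.unique(block_ids)
    g = blocks.size                             # number of blocks (needs g >= 2)
    m = np.array([(block_ids == k).sum() for k in blocks])             # SNPs per block
    block_sums = np.array([per_snp[block_ids == k].sum() for k in blocks])
    theta_minus = (per_snp.sum() - block_sums) / (n - m)               # leave-one-block-out

    h = n / m
    theta_jack = g * theta - np.sum((1.0 - m / n) * theta_minus)       # jackknife estimate
    pseudo = h * theta - (h - 1.0) * theta_minus                       # pseudo-values
    se = np.sqrt(np.mean((pseudo - theta_jack) ** 2 / (h - 1.0)))
    z = theta / se if se > 0 else float("nan")
    return float(theta), float(se), float(z)
```

Under this convention, a statistic with | Z | < 3 is treated as consistent with zero, that is, the first two groups are symmetrically related to the third; a significantly positive value indicates excess shared drift between A and C (or B and D), and a significantly negative value indicates excess shared drift between B and C (or A and D).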

Given that the Solutrean is restricted to southern France and Iberia and assuming that southwestern Europe was a geographical refugium for UP populations during the LGM, population continuity through time is a parsimonious explanation. However, given the lack of pre-LGM genetic data from Iberia, the presence of Věstonice-like ancestry in Iberia before the LGM cannot be ruled out. Věstonice-like ancestry had reached the Italian Peninsula, where it was later replaced by Epigravettian-associated Villabruna-like ancestry26, and a similar replacement scenario is also possible for Iberia.

Signals of deep ancestry

Recent studies have shown that IUP individuals from Bacho Kiro (45 cal kyr bp, Bulgaria), Tianyuan (40 cal kyr bp, China) and Goyet Q116-1 (35 cal kyr bp, Belgium) carried ancestry from an IUP population which had inhabited Eurasia before the split of West Eurasian and East Asian populations24 (Supplementary Information 5). Using f4-statistics of the form f4(MLZ, Kostenki14; test, Mbuti) and Kostenki14 as the baseline for European Palaeolithic ancestry26 we show that MLZ shares excess genetic drift with Bacho Kiro IUP, Goyet Q116-1 and Tianyuan (Fig. 3a, Supplementary Fig. 5a and Supplementary Table 2.5).

Fig. 3: Attraction of MLZ to old UP individuals.

a, The f4-statistics of the form f4(MLZ, Kostenki14; test, Mbuti) show significant positive affinity of MLZ to IUP Bacho Kiro, Tianyuan and Aurignacian Goyet Q116-1 (Supplementary Table 2.5). b, The f4-statistics of the form f4(test, Kostenki14; Tianyuan, Mbuti) show significant positive affinity of MLZ and Goyet Q116-1 to Tianyuan (Supplementary Table 2.6). For all f4-statistics, error bars indicate ± 3 s.e. and were calculated using a weighted block jackknife83 across all autosomes on the 1,240,000 panel (nSNPs = 1,150,639) and a block size of 5 Mb; | Z | > 3 points with thicker outline.

Our results confirm that part of the IUP ancestry present in Bacho Kiro and Tianyuan also survived in southern Iberia, in MLZ, until 23 cal kyr bp, ~12,000 yr later than the Aurignacian-associated Goyet Q116-1, the youngest previously known individual with traces of this ancestry. Initially, this IUP ancestry was attributed to East Asians as it is present in higher proportion in the Tianyuan individual, who shares more alleles with present-day East Asians than with present-day or ancient Europeans26,45. The same type of ancestry was also observed in Goyet Q116-1 (ref. 26), who is more closely related to modern and ancient Europeans but still shares excess affinity to Tianyuan. Others45 have postulated an early pan-Eurasian population, which predated the split time of Europeans and Asians, as opposed to a back migration from Tianyuan-related Asian groups into Europe after the split. The oldest genomic data available from Bacho Kiro cave support the hypothesis of the existence of an Early Eurasian Bacho Kiro-like population that contributed to Ust’Ishim, Tianyuan, Goyet Q116-1 and now also MLZ, but to the exclusion of other UP populations, including pre-Gravettian-associated Sunghir and Kostenki14, central European Gravettian or Magdalenian-associated individuals (Goyet Q2 cluster)24. Using f4-statistics of the form f4(test, Kostenki14; Tianyuan, Mbuti), we observe that Goyet Q116-1 and MLZ share more genetic ancestry with Tianyuan than Bacho Kiro IUP does (Fig. 3b and Supplementary Table 2.6). We also observe excess shared ancestry between Tianyuan and MLZ/Goyet Q116-1 when we replace Kostenki14 with Bacho Kiro IUP in f4(MLZ/GoyetQ116-1, Bacho Kiro IUP; Tianyuan, Mbuti), which also returns positive f4-statistics in both tests (MLZ, Z = 2.011; Goyet Q116-1, Z = 2.244) (Supplementary Table 2.7). This trend was already observed for Goyet Q116-1 and was explained by a higher shared ancestry between the latter and Tianyuan24. The Tianyuan-related ancestry present in MLZ might be fully inherited from Goyet Q116-1, as both are symmetrically related to Tianyuan and Bacho Kiro IUP individuals, as shown by f4-statistics of the form f4(MLZ, GoyetQ116-1; Tianyuan, Mbuti) (Z = 0.282) and f4(MLZ, GoyetQ116-1; Bacho Kiro IUP, Mbuti) (Z = 0.705) (Supplementary Table 2.7). We also observe a subtle attraction between Ust’Ishim and MLZ, in the form of overall negative values of f4(test, Ust’Ishim; MLZ, Mbuti), which is consistent with an IUP Bacho Kiro-like group contributing ancestry to Ust’Ishim, Tianyuan, Goyet Q116-1 and, more distantly, to MLZ (Supplementary Fig. 5b, Supplementary Table 2.8 and Supplementary Information 3). Similar levels of Tianyuan and Bacho Kiro IUP attraction between MLZ and Goyet Q116-1, and its persistence over time in MLZ, suggest that this type of early Eurasian ancestry survived in a diluted form in western-most Europe. Ultimately, this observation posits a connection between Aurignacian- and Solutrean-associated individuals in western Europe, the IUP in eastern Europe and Tianyuan in the East, and indicates that this genetic legacy persisted in Iberia for ~20,000 yr more (MLZ, ~23 cal kyr bp), while in central (and presumably eastern) Europe it was superseded and already no longer traceable in pre-Gravettian/Gravettian-associated individuals (~30 cal kyr bp). These results suggest genetic continuity from a population broadly associated with the Aurignacian and represented by Goyet Q116-1 to a population associated with the Solutrean and represented by MLZ in western Europe.

In the case of MLZ, we infer that this type of ancestry must have been brought to southern Iberia by individuals associated with the Aurignacian (sensu lato) as the archaeological record only provides evidence of Early UP industries (for example, Châtelperronian) in northern Iberia which is broadly attributed to Late Neanderthals and not modern humans46. Entering the peninsula from southern France, archaeological remains securely assigned to the Proto or Early Aurignacian are only found in northern Iberia47,48,49. For the sites Bajondillo near Málaga50 and Lapa do Picareiro in central Portugal51 the presence of an Early Aurignacian technocomplex has also been reported but challenged by several scholars49,52.

Signals of admixture in MLZ

We also tested whether MLZ carried UP HG-like ancestries that differ from the Aurignacian-associated individual Goyet Q116-1. First, we continued with f4-statistics of the form f4(MLZ, Kostenki14; test, Mbuti) to explore the genetic relationship of MLZ to other ancient individuals. We find a subtle, but non-significant, signal of shared drift between MLZ and individuals that carried Villabruna-like ancestry (Z = 2.972), which is absent in Goyet Q116-1 (Z = 1.783) (Supplementary Table 2.10). We could confirm that the amount of Villabruna-like ancestry in MLZ is less than in Magdalenian-associated individuals from the Goyet Q2 cluster by calculating f4(MLZ/GoyetQ2 cluster, GoyetQ116-1; Villabruna, Mbuti) (Supplementary Table 2.9). Here, we obtain significantly positive f4-statistics for all Magdalenian-associated individuals, which indicates a contribution of Villabruna-like ancestry to Goyet Q2 cluster individuals. Testing MLZ resulted in a non-significant Z-score (1.628). However, taking these results into account, a slightly higher amount of Villabruna-like ancestry in MLZ than in Goyet Q116-1 cannot be ruled out (Supplementary Table 2.9).

We also observe consistently negative f4-statistics using f4(MLZ, GoyetQ2 cluster; Villabruna, Mbuti) with Z-scores ranging from −9.568 (El Mirón) to −0.091 (Hohle Fels), which suggests excess affinity to Villabruna-like ancestry in the Goyet-Q2 cluster individuals when compared to MLZ, even though the f4-statistics are not always significant (| Z | > 3) (Supplementary Table 2.9). For the Iberian Peninsula, we find that MLZ shares less drift with Villabruna-like individuals than the later Magdalenian-associated El Mirón, which suggests that the incoming Villabruna-like ancestry did not arrive in Iberia during the time when the Solutrean technocomplex prevailed but probably later; or, alternatively, that it had not yet reached the southern part of the peninsula.

Interestingly, MLZ shows a significantly positive attraction to Natufians (Z = 3.541) using f4(MLZ, Kostenki14; Natufian, Mbuti) (Supplementary Table 2.10). When comparing the different affinities to Villabruna and Natufian in an extension of the comparable f4-test settings, we observe the following pattern: all post-LGM groups/individuals show significant attraction to Villabruna and Natufian-like ancestry, whereby Villabruna constitutes the type of ancestry which results overall in higher f4-statistics (Fig. 4a and Supplementary Table 2.10). This pattern is clearest in well-covered WHG and Magdalenian-associated individuals but a similar trend is seen in the low-coverage individuals from the two groups. Others26 have already described an increased affinity of Near Eastern populations to the Villabruna/WHG cluster after 14 cal kyr bp and, consequently, suggested a contribution from ancient Near East groups to the Villabruna cluster in a southeastern European refugium before that time, that is, during the LGM or earlier. Villabruna-like ancestry was also detected in the non-basal Eurasian part of the ancestry of Anatolian HG and Natufians, which suggested bidirectionality, that is, a contribution of Villabruna-like ancestry to ancient Near Easterners before ~15 cal kyr bp.

Fig. 4: Testing for the presence of Near Eastern ancestry in MLZ.

a, Differential genetic affinity of Pleistocene European HG and Iberian Holocene HG with Natufian and Villabruna-like ancestry calculated using f4(test, Kostenki14; Natufian, Mbuti) shown as green circles, and f4(test, Kostenki14; Villabruna, Mbuti) shown as blue circles. Bold outlines indicate significant f4-statistics | Z | > 3 (Supplementary Table 2.10). Results show that MLZ is the only Pleistocene HG that indicated a significant and equal attraction to both Natufian and Villabruna-like ancestries. b, The f4-statistics of the form f4(MLZ, GoyetQ116-1; Villabruna/Natufian/Morocco Iberomaurusian, Chimp/Mbuti), used to identify the group that shared most genetic drift with the MLZ lineage (Supplementary Table 2.14). For all f4-statistics, error bars indicate ± 3 s.e. and were calculated using a weighted block jackknife83 across all autosomes on the 1,240,000 panel (nSNPs = 1,150,639) and a block size of 5 Mb; | Z | > 3 points with thicker outline. ka, thousand years ago.

Here, we confirm a significant genetic affinity to Villabruna-like ancestry in pre-LGM Gravettian-associated individuals from central Europe but not (or to a much lesser extent) to Natufian-like ancestry. Of note, MLZ is one of the oldest individuals who shows a positive attraction to both Natufians (Z = 3.541) and Villabruna (Z = 2.972). By contrast, post-LGM populations share substantial ancestry with both Villabruna and Natufian but more with Villabruna than Natufians, in agreement with ref. 26 (Fig. 4a and Supplementary Table 2.10).

Others53 have shown that Natufians can be modelled using Villabruna-like ancestry and ‘Basal Eurasian’ ancestry, which constitutes an inferred population that diverged very early from all non-African populations after the split from African populations31. Our results show that MLZ shared an excess of Near Eastern ancestry present in Natufians, which is not explained by Villabruna-like ancestry itself (the oldest WHG with Near Eastern affinity). By contrast, we found neither an indication of Basal Eurasian ancestry in MLZ using other tests (Supplementary Information 6, Supplementary Fig. 6 and Supplementary Table 2.12) nor a high percentage of Neanderthal ancestry (Supplementary Information 7, Supplementary Fig. 7 and Supplementary Table 2.13).

Taken together, we tentatively conclude that the Villabruna-like ancestry present in Natufians and MLZ differs from the one represented by Villabruna. The putative contributing lineage is nonetheless related to Villabruna-like ancestry but carries a higher proportion of Near Eastern ancestry and is present in an admixed form in Natufians.

Finally, we attempted to model these genetic events by performing a reconstruction of the phylogenetic position of MLZ individual using qpGraph (Supplementary Information 8). We found statistical support for a model where MLZ represents a mixture of a population that shared a common ancestor with Goyet Q116-1 (84%) and a population that is ancestral to Villabruna and the clade of all WHG (16%). Magdalenian-associated ancestry represented by El Mirón was found to be a mixture of ancestry similar to MLZ and ancestry similar to Villabruna (Supplementary Figs. 8 and 9).

Testing for North African ancestry

Given the southernmost location of MLZ in southwest Europe, only ~300 km across from the North African coast and the confirmation of Near Eastern ancestry, we explored a potential trans-Gibraltar connection using f4(MLZ, GoyetQ116-1; test, Chimp), where we compare MLZ and Goyet Q116-1 directly with Morocco Iberomaurusian, Natufian and Villabruna as test populations (Fig. 4b and Supplementary Table 2.14). All resulting f4-statistics are positive, indicating a higher affinity to Near Eastern ancestry in MLZ when compared to Goyet Q116-1. Among the three tests, Natufian (Z = 2.74) and Morocco Iberomaurusian (Z = 2.46) share more genetic drift with MLZ than Goyet Q116-1. We observe a non-significant shift when comparing Morocco Iberomaurusian with Villabruna or Natufian, which suggests the absence of a direct contribution from Morocco Iberomaurusian-like ancestry (who also carry Sub-Saharan-like ancestry) to MLZ (Fig. 4b and Supplementary Table 2.14). However, the Near Eastern-like ancestry represented by Natufians, which is also the major component of Morocco Iberomaurusian, could have spread on both sides of the Mediterranean, where it (later) mixed with Sub-Saharan ancestry in Africa (as seen in Morocco Iberomaurusian) and with Villabruna-like ancestry on the European side of the Mediterranean.

The genomic legacy of Solutrean-associated HGs in later periods

We next explored whether the genetic legacy of MLZ was still detectable in Holocene HGs from Iberia. Traces of Goyet Q2-like ancestry were shown to be present at higher proportions in southern Iberia than in the North, where in turn the proportion of WHG/Villabruna-like ancestry was higher29.

Applying the same f4(test, Kostenki14; Tianyuan, Mbuti) as in Fig. 3, we observed a positive deviation from zero when the Mesolithic individual Moita do Sebastião was tested (Z = 2.783), which suggests a subtle affinity to East Asian Tianyuan similar to the one found in MLZ and which, consequently, argues for a persistence of this ancestry in southern Portugal since the UP (Fig. 5a, Supplementary Table 2.6 and Supplementary Information 5). Moita do Sebastião also shows the highest affinity to Tianyuan when all European Mesolithic HGs were tested in an outgroup-f3(Mesolithic, Tianyuan; Mbuti) (Fig. 5b, Supplementary Table 2.15 and Supplementary Information 5). Currently, the strongest affinity to Tianyuan in Holocene European HGs was reported for Eastern European HGs (EHG). This is because the ancestry found in Mal’ta and Afontova Gora individuals (Ancient North Eurasian ancestry) received ancestry from UP East Asian/Southeast Asian populations54, who then contributed substantially to EHG55. However, observing early Asian ancestry in Mesolithic Portugal and EHGs from Russia, at geographically opposite corners of west Eurasia, but not in central Europe, rules out the possibility that ancestry similar to Tianyuan was transmitted through the EHG–WHG admixture cline observed in Mesolithic Europe56 (Fig. 5b). On the contrary, this result supports the idea of genetic continuity from (at least) the LGM to the Mesolithic in southern Iberia, while other pre- and post-LGM population expansions diluted much of the subtle signal in most other parts of Europe.

Fig. 5: Traces of deep IUP ancestry in Holocene HGs.

a, The f4-statistics of the form f4(test, Kostenki14; Tianyuan, Mbuti) show affinity of Moita do Sebastião (non-significant) and EHGs (significant) to Tianyuan (Supplementary Table 2.6). Error bars indicate ± 3 s.e. and were calculated using a weighted block jackknife83 across all autosomes on the 1,240,000 panel (nSNPs = 1,150,639) and a block size of 5 Mb; | Z | > 3 points with thicker outline. b, The f3-outgroup statistics of the form f3(Tianyuan, test; Mbuti) measuring the shared genetic drift between the test population and Tianyuan, highlighting Moita do Sebastião as the Mesolithic Iberian HG with highest shared genetic drift with Tianyuan, indicative of genetic continuity from the Solutrean period in Southern Iberia. Similar f3-outgroup statistics results were obtained for EHGs (Supplementary Table 2.15).

We also looked for evidence of gene flow from North Africa to southern Iberia. We tested for Morocco Iberomaurusian-like ancestry in HGs using f4-statistics of the form f4(test, Kostenki14; Morocco Iberomaurusian, Mbuti) (Supplementary Fig. 10a and Supplementary Table 2.16). We find that all populations with WHG (or Near Eastern) ancestry returned positive f4-statistics, which attests to the shared Near Eastern ancestry common to HG groups on both sides of the western Mediterranean and suggests genetic continuity in southern Iberia. However, when we tested specifically for the Sub-Saharan component of Morocco Iberomaurusian ancestry in MLZ and Moita do Sebastião using f4(Morocco Iberomaurusian, Natufian; test, Chimp), we obtained negative results, indicative of excess affinity to the Near Eastern but not to Sub-Saharan ancestry (Supplementary Fig. 10b, Supplementary Table 2.17 and Supplementary Information 9).

With the expansion of farming practices during the Early Neolithic (EN) period, a new form of ancestry reached Iberia55,57. These farming groups also carried a small proportion of HG-like ancestry due to local admixture processes along the routes of expansion56. In Iberia, we showed that it is possible to track the dual ancestry contributions of Villabruna/WHG-like ancestry and Goyet Q2-like ancestry in Neolithic and Copper Age (CA) individuals associated with farming practices29,58. To explore the potential genetic legacy of MLZ-like ancestry in later time periods, with a particular focus on southern Iberia, we report and co-analyse two new EN individuals from Cueva de Ardales and Necrópolis de las Aguilillas, two CA individuals from Necrópolis de los Caserones and Cueva de Ardales, and one individual from Cueva de Ardales that post-dated the CA period (Supplementary Table 1.2), together with published data from Iberia29,40,58,59,60. EN individuals from Cueva de Ardales (ADS005) and Las Aguilillas (AGS001) cluster in principal component analysis (PCA) space with other Neolithic individuals from Iberia and France. With the exception of Murciélagos, the southern Neolithic forms a separate subcluster within the Iberian EN cluster, with a slight shift upwards on PC2 and left on PC1 towards Iberian HGs (Supplementary Fig. 11a and Supplementary Information 10). Consequently, we grouped all individuals as Southern_Iberia_EN (excluding Murciélagos). After ruling out contributions of North African ancestry in Southern_Iberia_EN (Supplementary Fig. 11b,c, Supplementary Table 2.18 and Supplementary Information 10), we used qpAdm to model distal sources of genetic ancestry. We tested several combinations of two- and three-way models (Anatolia Neolithic + WHG or Anatolia Neolithic + WHG + either Iran_N, MLZ or Jordan PPNB) with the aim of characterizing the potential additional source(s) of ancestry in Southern_Iberia_EN.
We focussed on the different HG ancestries present in the region (for example, MLZ or Goyet Q2-like components) and potentially different Neolithic ancestries, such as Jordan_PPNB or Iran_N-like ancestries, which have been described in some EN groups from the Mediterranean61. We successfully modelled the HG ancestry in Southern_Iberia_EN with WHG and MLZ-like ancestry and show that the HG component is larger than in Northern_Iberia_EN individuals, in agreement with previously published data29,58. Models with temporally and geographically proximal sources were also supported. However, we cannot distinguish further between MLZ, El Mirón or Moita do Sebastião-like HG ancestries, illustrating the limits of data resolution (Supplementary Fig. 11d and Supplementary Table 2.19). The higher amount of HG ancestry and the Solutrean/Magdalenian-associated genetic legacy suggest a much closer genetic interaction between HGs and farmers in southern Iberia, perhaps as a result of an earlier spread of farmers (longer co-existence) or of more and stronger admixture pulses than in northern Iberia.

Chalcolithic individuals from Necrópolis de los Caserones (CRS002) and Cueva de Ardales (ADS008) cluster with other Southern Iberian CA populations (Supplementary Fig. 11a). The position of ADS007 from Cueva de Ardales in PCA space suggests the presence of ‘steppe-related ancestry’, which was confirmed by several of the tests applied. Additionally, a non-local status can be suggested for this individual (Supplementary Information 10 and Supplementary Tables 2.20–2.22).

Conclusions

Genome-wide data from the first Solutrean-associated individual MLZ from Andalucia revealed traces of ancestry from an IUP population that predates the genetic split between European and Asian populations but is still traceable in southern Iberia ~23,000 years ago. This genetic ancestry was also found in the Aurignacian-associated individual Goyet Q116-1 and in Bacho Kiro and Tianyuan.

We can also show that pre-LGM (Věstonice-like ancestry) and post-LGM (Goyet Q2-like ancestry) groups were separated during the LGM as we find no substantial traces of Věstonice-like ancestry contribution in southern Iberia. This suggests a scenario in which Gravettian-associated individuals in western Europe were genetically distinct from those in central Europe. It is also possible that Věstonice-like ancestry in southern Iberia had been replaced when populations retreated further south during the height of the LGM, or had not initially reached the southernmost parts of the Iberian Peninsula. Individual MLZ also carried ancestry that is shared with Near Eastern Natufian-associated individuals, confirming the presence of this ancestry in Europe before the LGM. The MLZ lineage contributed substantially to post-LGM Magdalenian-associated individuals, which attests to genetic continuity in western Europe that spans the LGM. While more complex scenarios are possible, the observed genetic continuity suggests that the Iberian Peninsula, as a ‘southern genetic refugium’, could have sustained a stable population before, during and after the LGM, with no evidence for significant population turnover events but followed by an early and substantial contribution of Villabruna-like HG ancestry soon after. However, this changed profoundly with the arrival of EN farmers, who brought new ancestry from western Anatolia and the Near East. Southern Iberia in particular retained a higher proportion of HG ancestry related to Solutrean- and Magdalenian-associated individuals than other regions of the peninsula. We refer readers to ref. 62, which reports new genomic data from 116 HG individuals from Palaeolithic to Mesolithic Europe.

Methods

Direct AMS 14C

We performed radiocarbon dating on the same skeletal element used for aDNA analysis following refs. 63,64,65,66. Collagen extraction was performed in the Department of Human Evolution at the Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany, and 14C measurements were made at the Curt-Engelhorn-Zentrum Archäometrie gGmbH, Mannheim. Calibrations were performed using OxCal v.4.4 (ref. 67) and the IntCal 20 curve68 (Supplementary Table 1.2).

aDNA analysis

We extracted and prepared DNA for next-generation sequencing in two different dedicated aDNA facilities (Jena and Leipzig). All 16 human samples were teeth. When possible, after cleaning, we split the crown and root with a hand saw and drilled both from the inside. Protocols used for aDNA extraction and non-UDG-treated, single-stranded library preparation are described in ref. 33. Following the quality assessment (percentage of human DNA, characteristic aDNA damage, percentage of complexity/clonality) of the initial shotgun sequencing data from several DNA libraries from each of the 16 individuals sampled, 8 libraries with >0.08% endogenous DNA were subsequently enriched for ~1,240,000 SNPs69, 26 libraries from 12 individuals were enriched for the complete mitogenome25,70 and 18 libraries from four males for mappable regions of the non-recombining parts of the Y chromosome71 using independent DNA–DNA hybridization capture techniques. Following DNA capture, libraries were sequenced for 20–40 million reads using single-end configuration (1 × 75 base pair (bp) reads) on an Illumina HiSeq4000 platform. The success rate of 50% is explained by the challenging climatic conditions and by the fact that we could not select petrous bones, which have been shown to yield a high amount of endogenous human DNA (refs. 72,73), but were limited to the few scattered skeletal elements.

Sequence data were demultiplexed on the basis of specific pairs of indexes and processed with the EAGER (1.92.59) pipeline (Supplementary Table 1.3). The pipeline includes adaptor removal (v.2.3.1)74, mapping against the Human Reference Genome hs37d5 with BWA (v.0.7.12)75 using aln and samse commands (-l 16500, -n0.01, -q30) and removing duplicates with DeDup (v.0.12.2)76. Genotypes were called using pileupcaller (https://github.com/stschiff/sequenceTools) and the flag -singlestrandedmode, which entirely removed the aDNA damage.

aDNA authentication and quality controls

We first used MapDamage (v.2.0.6)77 to determine the deamination rate at both ends of the sequencing reads. We observed damage patterns consistent with non-UDG treatment in most cases (Supplementary Fig. 1a). We used sex determination via the rescaled X ratio versus Y ratio scatter plot as a quality control for contamination from the opposite sex of the individual analysed (Supplementary Fig. 1b,c). For a non-contaminated library we expect an X ratio of ~1 and a Y ratio of ~0 for females and X and Y ratios of ~0.5 for males. We also applied PMD filtering tools78 to evaluate the deviation of the PMD-filtered from the non-PMD-filtered version of each individual on a PCA (Supplementary Fig. 2), and replicated all the tests shown in this work using the PMD-filtered version. We then applied a quantitative method to estimate nuclear contamination in males using ANGSD79, which estimates heterozygosity at polymorphic sites on the X chromosome in males (Supplementary Table 1.8). The last quantitative method, applied to both genetic males and females, was the estimation of mitochondrial contamination using ContamMix80, which quantifies the heterozygosity in the mitochondrial reads of each individual against a comparative dataset of 311 global mitogenomes (Supplementary Table 1.9).
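The sex-based quality control described above can be illustrated with a minimal sketch. It assumes that the rescaled X and Y ratios have already been computed for a library and simply assigns the nearest of the two expected profiles (X ≈ 1, Y ≈ 0 for females; X ≈ 0.5, Y ≈ 0.5 for males). The function name, the Euclidean distance measure and the tolerance of 0.15 are illustrative assumptions and are not taken from the pipeline used in this study.

```python
import math

# Expected (X ratio, Y ratio) for a non-contaminated library, as stated in the text.
EXPECTED = {"female": (1.0, 0.0), "male": (0.5, 0.5)}


def assign_sex(x_ratio, y_ratio, tolerance=0.15):
    """Assign the nearest expected sex profile and flag large deviations.

    `tolerance` is a hypothetical cut-off chosen for illustration only.
    Returns (assigned_sex, distance_to_expectation, needs_inspection).
    """
    best_sex, best_distance = None, float("inf")
    for sex, (exp_x, exp_y) in EXPECTED.items():
        distance = math.hypot(x_ratio - exp_x, y_ratio - exp_y)
        if distance < best_distance:
            best_sex, best_distance = sex, distance
    return best_sex, best_distance, best_distance > tolerance


# Made-up ratios for three hypothetical libraries.
for name, x, y in [("lib_A", 0.98, 0.02), ("lib_B", 0.52, 0.47), ("lib_C", 0.75, 0.25)]:
    sex, dist, flagged = assign_sex(x, y)
    print(name, sex, round(dist, 3), "inspect" if flagged else "ok")
```

A library that falls between the two expected profiles, such as the hypothetical lib_C, is flagged for inspection rather than assigned with confidence, which mirrors the use of the scatter plot as a screen for contamination from the opposite sex.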

Biological relatedness

We calculated the pairwise-mismatch rate (PMR)81 between ancient individuals based on pseudo-haploid 1,240,000 SNP capture data. To estimate the degree of relatedness among MLZ libraries, we calculated the baseline PMR for identical samples/twins using several libraries from our MLZ individuals and we compared the mean value with the merged MLZ003 versus the merged MLZ005 PMR value. On the basis of the PMR results we concluded that samples MLZ003 and MLZ005 were from the same individual (Supplementary Table 1.6). We replicated the process separately for new EN and CA individuals from Cueva de Ardales and Necrópolis de las Aguilillas.
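As an illustration of the quantity involved, the following minimal sketch computes a PMR between two pseudo-haploid genotype vectors, counting only SNPs covered in both. The data representation (one allele coded 0 or 1 per SNP, None where a SNP is not covered) and the function name are assumptions made for this example; it is not the implementation of ref. 81.

```python
def pairwise_mismatch_rate(geno_a, geno_b):
    """Return (PMR, number of overlapping SNPs) for two pseudo-haploid vectors.

    Each vector holds one sampled allele per SNP (0 or 1), or None where the
    SNP is not covered in that library or individual.
    """
    overlap = 0
    mismatches = 0
    for allele_a, allele_b in zip(geno_a, geno_b):
        if allele_a is None or allele_b is None:
            continue  # SNP missing in at least one of the two samples
        overlap += 1
        if allele_a != allele_b:
            mismatches += 1
    if overlap == 0:
        return float("nan"), 0
    return mismatches / overlap, overlap


# Toy pseudo-haploid calls (made-up values, not data from this study).
library_1 = [0, 1, None, 1, 0, 1, 0, None]
library_2 = [0, 0, 1, 1, None, 1, 1, 0]
pmr, n_overlap = pairwise_mismatch_rate(library_1, library_2)
print(f"PMR = {pmr:.3f} over {n_overlap} overlapping SNPs")
```

Returning the number of overlapping SNPs alongside the rate makes it straightforward to discard comparisons supported by too few sites.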

To further validate the finding that MLZ003 and MLZ005 are the same individual, given that we lacked a robust estimate of the background relatedness, we performed the following test: we calculated the PMR for all pairs of libraries for MLZ003 and MLZ005, which yielded three categories of PMRs: comparisons between MLZ003 libraries (MLZ003/MLZ003), comparisons between MLZ005 libraries (MLZ005/MLZ005) and comparisons between MLZ003 and MLZ005 libraries (MLZ003/MLZ005). Leveraging the fact that the ranks of the PMR values will be invariant to normalization for background relatedness, we used non-parametric, pairwise Wilcoxon rank sum tests to evaluate whether there was a significant difference in distribution for the PMR values from the three categories. We found no such difference (adjusted P > 0.15), even when we restricted the PMR values to pairs with >500 overlapping SNPs (P = 0.1974). Since we find no statistically significant difference between the distributions of the PMR values for combinations of libraries of MLZ003/MLZ003, MLZ003/MLZ005 and MLZ005/MLZ005, we have no evidence to suggest that MLZ003 and MLZ005 are not the same individual.
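The logic of this test can be sketched as follows. The example implements a two-sided Wilcoxon rank sum test with a tie-corrected normal approximation and applies it to each pair of the three PMR categories. Several choices here are assumptions made for illustration: the normal approximation (statistical packages may use an exact test for small samples), the Bonferroni adjustment (the adjustment method is not stated above) and the PMR values themselves, which are invented.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided rank sum test; returns (U statistic of x, approximate P)."""
    pooled = list(x) + list(y)
    n1, n2, n = len(x), len(y), len(pooled)
    ranks = average_ranks(pooled)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(var_u)
    return u, math.erfc(abs(z) / math.sqrt(2))


# Invented PMR values per library-pair category (not data from this study).
pmr_by_category = {
    "MLZ003/MLZ003": [0.120, 0.131, 0.118, 0.127],
    "MLZ005/MLZ005": [0.125, 0.119, 0.133, 0.122],
    "MLZ003/MLZ005": [0.123, 0.129, 0.121, 0.126, 0.117],
}
pairs = list(combinations(pmr_by_category, 2))
adjusted_p = {}
for cat_a, cat_b in pairs:
    _, p = rank_sum_test(pmr_by_category[cat_a], pmr_by_category[cat_b])
    adjusted_p[(cat_a, cat_b)] = min(1.0, p * len(pairs))  # Bonferroni (assumed)
    print(cat_a, "vs", cat_b, "adjusted P =", round(adjusted_p[(cat_a, cat_b)], 3))
```

Because the test uses only ranks, rescaling all PMR values by a common baseline leaves the result unchanged, which is the property exploited in the text.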

Datasets

We merged the newly reported ancient data with data from the Allen Ancient DNA Resource (v.37.2; https://reich.hms.harvard.edu) plus the newly published data from refs. 24,27,28,38. We generated two datasets, the 1,240,000 genotype set used for all the statistics presented in the paper and one with the intersected SNPs from the Human Origins panel (~600,000 SNPs) used to perform PCA analyses. In total, 179 Palaeolithic and Mesolithic individuals, all covered at >20,000 SNPs, were used in this study.

Population genetic analysis

Principal component analysis

We used the smartpca software from the EIGENSOFT package (v.6.0.1)82 with the present-day groups (global PCA), present-day west Eurasian groups (West Eurasian PCA) and present-day west Eurasian, North African and Sub-Saharan groups (West Eurasians–North Africans–Subsaharans). The PCA scaffold was calculated from the modern individuals, and the ancient individuals were projected onto it using the parameters ‘lsqproject:YES’ and ‘shrinkmode:YES’. We projected both the genotyped PMD-filtered version and the non-PMD-filtered version to evaluate potential signs of contamination78. A global PCA was used to infer East Asian ancestry and, by adding North African and Sub-Saharan populations to the West Eurasian PCA, we could infer potential African ancestry.

F-statistics

F-statistics were calculated with qpDstat from ADMIXTOOLS (https://github.com/DReichLab). The f3-outgroup statistics were used to calculate the affinity matrix with all the possible HG pairwise combinations. The f4-statistics were used to test for cladality and admixture. Standard errors were calculated with the default block jackknife and all plots display 3 s.e.
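To make the quantities concrete, the following minimal sketch computes an f4-statistic as the mean over SNPs of (pA − pB)(pC − pD) and derives its standard error with a weighted block jackknife, in which one genomic block is left out at a time and blocks are weighted by their SNP counts. It is an illustration under stated assumptions and not the qpDstat implementation: the function names, the input format (per-SNP allele frequencies with a block label) and the toy frequencies are all hypothetical.

```python
import math


def weighted_block_jackknife(block_sums, block_sizes):
    """Estimate a per-SNP mean and its s.e. by leaving out one block at a time.

    block_sums[j] is the sum of per-SNP values in block j and block_sizes[j]
    is the number of SNPs in that block.
    """
    g = len(block_sums)
    if g < 2:
        raise ValueError("at least two blocks are required")
    n = sum(block_sizes)
    total = sum(block_sums)
    theta = total / n
    # Leave-one-block-out estimates
    loo = [(total - s) / (n - m) for s, m in zip(block_sums, block_sizes)]
    theta_jack = g * theta - sum((1 - m / n) * t for t, m in zip(loo, block_sizes))
    variance = 0.0
    for t, m in zip(loo, block_sizes):
        h = n / m
        pseudo = h * theta - (h - 1) * t
        variance += (pseudo - theta_jack) ** 2 / (h - 1)
    variance /= g
    return theta, math.sqrt(variance)


def f4_statistic(freqs, block_ids):
    """f4(A, B; C, D) from per-SNP frequencies (pA, pB, pC, pD) and block labels."""
    sums, sizes = {}, {}
    for (pa, pb, pc, pd), block in zip(freqs, block_ids):
        sums[block] = sums.get(block, 0.0) + (pa - pb) * (pc - pd)
        sizes[block] = sizes.get(block, 0) + 1
    blocks = sorted(sums)
    estimate, se = weighted_block_jackknife(
        [sums[b] for b in blocks], [sizes[b] for b in blocks]
    )
    z = estimate / se if se > 0 else float("nan")
    return estimate, se, z


# Toy pseudo-haploid frequencies for six SNPs in three blocks (made-up values).
toy_freqs = [(1, 0, 1, 0), (0, 0, 1, 0), (1, 0, 0, 0),
             (1, 1, 1, 0), (0, 1, 1, 0), (1, 0, 1, 0)]
toy_blocks = [0, 0, 1, 1, 2, 2]
f4, se, z = f4_statistic(toy_freqs, toy_blocks)
print(f"f4 = {f4:.4f}, s.e. = {se:.4f}, Z = {z:.2f}")
```

Under this sign convention, a positive f4(A, B; C, D) indicates that A shares more alleles with C than B does (or, equivalently, B with D), which is how the positive values of f4(MLZ, GoyetQ116-1; test, Chimp) are read in the main text.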

Multidimensional scaling analysis

We applied MDS using the R function cmdscale. Euclidean distances were computed from the genetic distances, calculated from the f3-outgroup matrix in the form 1 − f3 for all possible pairwise HG combinations, following ref. 26.
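The transformation can be sketched in a few lines. The example converts a symmetric outgroup-f3 matrix into distances of the form 1 − f3 and then performs classical (metric) MDS by double-centring the squared distances and extracting the leading eigenvectors with power iteration. This is a simplified stand-in for cmdscale, written without external libraries; the function names, the choice to set the diagonal to zero, the fixed starting vector and the toy f3 values are assumptions for illustration, and the sketch presumes distances that are close to Euclidean.

```python
import math


def f3_to_distance(f3):
    """Turn a symmetric f3 matrix into distances 1 - f3 (diagonal set to 0)."""
    n = len(f3)
    return [[0.0 if i == j else 1.0 - f3[i][j] for j in range(n)] for i in range(n)]


def classical_mds(dist, k=2, iters=500):
    """Classical MDS: coordinates in k dimensions from a distance matrix."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Double centring: B = -1/2 * J * D^2 * J (dist is assumed symmetric)
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * k for _ in range(n)]
    start_norm = math.sqrt(sum((i + 1) ** 2 for i in range(n)))
    for dim in range(k):
        v = [(i + 1) / start_norm for i in range(n)]  # fixed, non-constant start
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break  # nothing left to extract in this dimension
            v = [x / norm for x in w]
        eigval = sum(v[i] * b[i][j] * v[j] for i in range(n) for j in range(n))
        if eigval > 1e-12:
            for i in range(n):
                coords[i][dim] = math.sqrt(eigval) * v[i]
        # Deflate so that the next pass finds the next eigenvector
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j]
    return coords


# Toy outgroup-f3 values for three hypothetical groups (not data from this study).
toy_f3 = [[None, 0.30, 0.20],
          [0.30, None, 0.22],
          [0.20, 0.22, None]]
toy_dist = f3_to_distance(toy_f3)
for label, point in zip(["HG_A", "HG_B", "HG_C"], classical_mds(toy_dist)):
    print(label, [round(c, 3) for c in point])
```

Groups that share more drift (higher f3) end up closer together in the resulting plot; the orientation and sign of the axes are arbitrary.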

qpGraph

We used qpGraph from ADMIXTOOLS (https://github.com/DReichLab) to construct the phylogeny of our Palaeolithic individual from Cueva del Malalmuerzo. We built the hypothetical topology of the tree based on the previous statistical analysis (f3- and f4-statistics) and used qpGraph to clarify the order of the genetic events that were inferred previously. The trees were built by order of complexity and the ones with the difference between the observed and fitted f-statistics | Z | > 3 were rejected. We also excluded models with 0% ancestry stream estimates. We used the options ‘outpop: NULL’ rather than specifying an outgroup population, ‘useallsnps: YES’, ‘lambdascale: 1’ and ‘diag: 0.0001’ as used in ref. 54.

qpWave and qpAdm

To estimate admixture proportions we used the qpWave and qpAdm programs from the ADMIXTOOLS v.5.1 package (https://github.com/DReichLab), with the ‘allsnps: YES’ option. With qpWave, we evaluated the number of sources needed to model our target population. With qpAdm, we quantified the proportion of genetic ancestry contributed by each source in the target population. We applied qpWave and qpAdm to the post-Palaeolithic populations co-analysed in this study.

Mitochondrial haplogroup assignment

We extracted reads from the mitocapture that mapped exclusively to the mitochondrial reference and built consensus sequences using sites which had been covered by a minimum of two reads and had a minimum allele frequency of 0.1. Consensus sequences were uploaded to HaploGrep2 v.2.1.1 (available via https://haplogrep.uibk.ac.at/) for an automated mitochondrial haplogroup assignment based on phylotree (mtDNA tree build 17, available via http://www.phylotree.org/) (Supplementary Table 1.11).

Y-haplogroup assignment

We genotyped the Y chromosome reads using a Y-SNP list from the ISOGG (International Society of Genetic Genealogy v.15.73) dataset included in the 1,240,000 panel and the in-house Y-capture probes, using the procedure described in ref. 71 (Supplementary Table 1.2).

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.