Lager yeast history has intrigued scientists for decades. The recent isolation of S. eubayanus, the lager yeast ancestor, represents an unprecedented opportunity to extend our knowledge on yeast phylogeography and the origins of the S. pastorianus lager hybrid. However, the genetic, phenotypic and evolutionary history of this species remains poorly known. Our work demonstrates that S. eubayanus isolates from Patagonia have the greatest genetic diversity, comprising the largest number of lineages within a single geographic region and experienced ancestral and recent admixture between lineages, likely suggesting co-occurrence in Patagonia. Importantly, some isolates exhibited significant phenotypic differences for traits such as high temperature and ethanol tolerance, together with fermentation performance, demonstrating their potential in the brewing industry for the generation of new styles of lager beers. Furthermore, our results support the idea of colonization from peripheral glacial refugia from the South, as responsible for the high genetic diversity observed in southern Chilean Patagonia. Our results allow hypothesizing a successful physiological adjustment of the species to the local conditions in Patagonia, explaining its wide distribution in the southern hemisphere.
The identification of admixed and/or hybrid individuals in nature together with the geographic distribution of ecologically relevant traits is crucial for the understanding of how new lineages diversify in a given climate/region/environment . Unfortunately, studies addressing admixture in fungi are still uncommon. In this context, the utilization of population‐level sampling and whole‐genome sequences involving large sets of individuals allows studying genetic correlations by exploring patterns of genomic variation across lineages and identifying signatures of gene flow, revealing potential mechanisms of physiological adjustment to the environment . The monophyletic Saccharomyces genus is an ideal model to investigate genomic variation and admixture, since within the clade isolates occupy a broad range of environments. Currently, the clade is composed of eight species , including the partially domesticated S. cerevisiae and other non-domesticated species, such as S. eubayanus . Given the economic importance of this clade, as well as the wealth of genomic information that has been produced in the past decade, particularly for the model organism S. cerevisiae [5–10], natural populations of Saccharomyces are excellent models for understanding genome evolution in the wild.
Having been the first sequenced eukaryote, studies in S. cerevisiae have gained great depth by the wealth of additional full genome data coming from diverse isolates, providing exceptional new insights into the genomic processes that drive phenotypic traits and genome evolution between isolates [5, 6, 11]. Since the recent possibility to rapidly and cost-effectively sequence full genomes, other Saccharomyces genomes have been fully obtained . Saccharomyces species harbour different genetic distributions, population histories and unique phenotypic properties . Phylogenomics analyses in S. uvarum suggested that a Patagonian sub-population gave rise to the Holarctic population through a recent bottleneck . Similarly, recurrent hybridization events between lineages were reported in S. paradoxus , demonstrating how recurring hybridization events in nature contribute to genomic, phenotypic and potentially the species diversity . In this context, admixture in yeast has been associated to transient habitats, like fruits, where insects can carry live yeast spores and favour a wider geographic distribution and genetic admixture compared to less nutrient rich barks, where admixture could occur over longer periods [15, 16].
In nature, several Saccharomyces inter-species hybrids have been found. An example of this includes the workhorse of the modern brewing industry, S. pastorianus, a hybrid between S. cerevisiae and the cold-tolerant S. eubayanus [17, 18]. Despite the industrial importance of S. pastorianus, much of the natural history of this hybrid remains obscure, largely because the S. eubayanus parental species was only recently isolated . S. eubayanus was originally isolated from Nothofagus trees in the Argentinian Patagonia [19, 20] and since then it has been isolated in New Zealand , North America [22, 23] and East Asia . However, the evolutionary origin of S. eubayanus is still controversial. While this species has been isolated from South American Nothofagus trees recurrently  and only a handful of isolates have been recovered from trees in China and North America [24, 25], a subset of the strains from China have been reported as the earliest diverging lineage, suggesting an Asian origin of the species . However, these findings have been challenged . Molecular profiling indicates that S. eubayanus is composed of three lineages, besides the early diverging lineage of West China. These populations include a ‘Holarctic’ (HOL) cluster (a group of related strains from Tibet and North America) and two Patagonian populations denominated: ‘Patagonia A’ (PA) and ‘Patagonia B’ (PB) . Whole genome sequence comparison among wild S. eubayanus strains indicates that, thus far, the Holarctic lineage is the closest relative of the lager yeast [20, 25]. Interestingly, multi-locus sequence comparisons have indicated that the nucleotide diversity of S. eubayanus Patagonian populations is higher than that of the West China and the Holarctic (North American) lineages. However, until now only a handful of S. eubayanus genomes per population have been fully sequenced, which prevents depicting a detailed population genomics portrait of this species.
Here, we describe the isolation of 160 S. eubayanus strains from bark samples obtained from Nothofagus trees in Chile (Fig 1) and we provide annotated genomes together with phenotypic characterization for 82 selected strains. We investigated genetic structure, admixture and nucleotide diversity of this set of strains, while also re-analyzed 23 previously published genomes. Overall, we provide evidence of historical admixture events in Patagonia that significantly expands the formerly known genetic history of the species. Moreover, phenotypic clustering correlated well with genetic distances, where individuals from northern sites showed greater fermentation performance and high-temperature tolerance than isolates from southern sites. The genomic data presented here broadens our knowledge of the genetics, ecology, and evolution of wild yeast strains.
In order to determine the distribution of S. eubayanus along the south western side of the Andes Mountains, we sampled ten Chilean national parks and reserves between 2017 and 2018, spanning 2,090 km from Altos de Lircay National Park in central Chile (VII Maule Region, Chile) to Karukinka Natural Park in southern Chile (XII Magallanes Region, Chile) (Fig 1A). From these sites, we obtained 553 bark samples from trees belonging to N. pumilio, N. Antarctica, N. dombeyi and A. Araucana species. Raffinose and ethanol media enrichment  allowed us to recover yeast colonies in 77% of the samples. Potential Saccharomyces strains were identified by sequencing the ITS and/or GSY1 and RPI1 RFLP [22, 25]. From these, 160 S. eubayanus strains were identified from different individual trees (S1 Table, representing 28.9% of the samples), and in parallel, another set of 179 S. uvarum isolates were recovered (representing 37.9% of the samples), together with dozens of non-Saccharomyces species belonging to the Lachancea, Kregervanrija, Kazachstania and Hanseniaspora genera. In general, we observed a pattern between yeasts and hosts, where N. pumilio and N. antarctica contained mostly S. eubayanus strains, while all but one isolate derived from N. dombeyi and A. araucana samples were identified as S. uvarum (S1 Table). The frequency of yeast isolates was higher towards southern regions, while a lower fraction of yeast colonies were obtained from Tierra del Fuego. FACS analysis revealed that all samples were diploids, except for CL609.1 and CL1005.1 which were found to be haploid and tetraploid, respectively (S1 Fig). Indeed, all strains were able to sporulate, with the exception of CL609.1. Overall, our results demonstrate the high frequency of the S. eubayanus species above latitude 33° in the western side of the Andes Mountains.
To investigate the genomic variation and population structure of the S. eubayanus isolates, we sequenced the genomes of 82 strains, randomly selected from the ten sampling sites. Furthermore, we combined our dataset with all previously published genome from 23 strains from North America (9 strains), Argentina (10 strains) , China (1 strain) , New Zealand (1 strain)  and two lager genomes , allowing us to incorporate genomes obtained in other geographical regions (S2A Table and S2B Table). Overall, no strain isolated in this study represents a close relative to the type strain (obtained from Argentina), suggesting that the Andes Mountains represents a natural barrier between S. eubayanus populations (S2B Table). On average, across the 82 genomes we obtained 39,024 SNPs per strain relative to the reference genome, and a SNP was found on average every ~300 bp. In parallel, we found on average 1,606 insertions and 1,677 deletions per isolate relative to the type strain (S2A Table). The phylogeny obtained suggests that Chilean strains displayed different ancestry and mostly fell into three major lineages: PB-1, PB-2 & PB-3 (Fig 2A). Furthermore, several strains fell outside the major PB lineages and might represent admixed strains (Fig 2A).
In order to study the genetic structure of S. eubayanus, we used STRUCTURE, fineSTRUCTURE clustering and principal component analysis (PCA). All analyses revealed a high degree of differentiation across the populations (Fig 2). STRUCTURE analysis indicated an optimum k = 5 groups (ΔK5 = 2,652, Fig 2B, S3 Table), whereas fineSTRUCTURE was able to depict several subpopulations within the PB-1 lineage, and to a lower extent also in PB-3 and PB-2, respectively (S2 Fig), in agreement with our PCA analysis (Fig 2C). Moreover, three groups of strains showed a mixture of alleles inherited from multiple populations (SoAm 1–3), that together with the North American (NoAm), Holarctic and PA lineages shape the genetic structure of S. eubayanus (Fig 2D and 2E).
In order to detect evidence of recombination in the different lineages, we estimated linkage disequilibrium (LD) using only non-admixed strains. Estimates of LD based on r2 values differed between the three PB lineages (Fig 3A). Lineages showed a 50% LD decay of 2.9 kb, 29.1 kb and 22.5 kb in the PB-1, PB-2 and PB-3 populations, respectively, demonstrating a population-specific LD decay and greater LD levels in the PB-1 population. Moreover, the PB-1 level of recombination was similar to what is described in domesticated S. cerevisiae populations [5, 7]. This could be explained by greater inbreeding or outbreeding rates in the lineage. Indeed, Fis values (Wright’s inbreeding coefficient) were significantly higher (p-value < 0.001, paired Student t-test) in PB-1 (average Fis = 0.9482 CI = 0.9462–0.9503), compared to PB-2 and PB-3 (average Fis = 0.9017 (CI = 0.8974–0.906 & Fis = 0.8856 CI = 0.8795–0.8916, respectively) (S4A Table). Alternatively, lower LD levels for PB-2 and PB-3, could be due to the hidden population substructure within these two lineages compared to PB-1, impacting the overall comparison.
Next, we calculated nucleotide diversity (π), genetic differentiation (FST), and neutrality test statistics, such as Tajima’s D (S4A Table). In general, we observed a trend for higher genome-wide nucleotide diversity (π) in southern localities, and belonging to PB-1, compared to their northern counterparts belonging to PB-2 and PB-3 (Fig 3B). In this context, PB-3 isolates from Choshuenco and Villarrica, located further north had the lowest levels of genetic diversity. These results demonstrate a negative and significant (p-value = 0.047, Spearman correlation coefficient rs = 0.632) correlation between latitude and genetic diversity.
Tajima’s D scores differed between lineages. Specifically, Tajima’s D values for PB-2 & PB-3 were positive while this metric was near zero for PB-1, suggesting undetected population substructure in the former case and no population decline for the latter (S4A Table, S3A Fig). We then assessed the degree of genetic differentiation and nucleotide diversity between the ten sampling sites per lineage (S3B Fig). We found moderate to high significant FST values ranging from 0.16 to 0.88 (S4B Table). A mantel test showed a significant correlation between the geographic and the genetic distances (isolation by distance, p-value < 0.05) among sampling sites of the PB-1 lineage. The number of pairwise comparisons was insufficient to test for IBD for lineages PB-2 and PB-3. The positive IBD for PB-1 indicates limited effective dispersal within this lineage (S3C Fig).
To determine how recombination events influenced the genomes of the isolates showing recent genetic admixture and the level of genetic exchange between populations, we explored their genome compositions and genetic origins. First, we generated similarity plots using 100 SNPs blocks and determined the closest genetic origin for each of the three groups of admixed strains from each population. This initial assessment of admixture allowed us to determine that strains isolated from Karukinka (SoAm-3) likely originated from the same admixture event between PB-1 and PB-3 lineages, as 92% of their bins were assigned to the same lineage (Fig 4A ). A common origin is also predicted for the North American admixed strains (NoAm) as previously reported [22, 23, 25], where a PB-1 origin contributes to the genome of all isolates.
To infer the most likely parental subpopulations, genetic contribution to the admixed genomes and putative date of admixture, each admixed individual/subpopulation was analysed using GLOBETROTTER. In particular, GLOBETROTTER indicated that the admixed population of Karukinka (SoAm-3) also originated from an admixture event between PB1-Karukinka and PB3-Puyehue individuals, which contributed to 43% and 57% of the current genomes, respectively. Similarly, a pairwise nucleotide divergence (Dxy) analysis largely resembled the results obtained using cluster analysis (Fig 4B). Finally, to verify recent admixture events between lineages, a TreeMix analysis recapitulated independent events between subpopulations while fitting 8 migration events, consistent with the admixture analysis previously performed and depicting hybridization involving in all cases the PB-1 lineage (Fig 4C). Altogether, these results suggest recent outcrossing and admixture between S. eubayanus populations, where the PB-1 branch contributes to all the admixed genomes analysed in our study.
To identify population founders and historical hybridization events we applied explicit tests of admixture using TreeMix and f4-statistics together with admixture models using admixturegraph. In this analysis, none of the strains coming from recent admixture events were considered, and only strains belonging to the different lineages were used. The TreeMix topology without migrations (S4A Fig and S4B Fig) resembles our phylogenetic tree and explained 99.61% of the variance of the data. Adding, one, to six migration events increased the variance explained by the model from 99.88% to 99.97%, respectively. An Ad hoc analysis of the second order rate of change in the log-likelihood as migration events were added indicated two or four migration edges as the most likely models (S5B Table). The model that included four migration events, indicated gene flow between PB-1 –HOL and PB-2 and HOL ancestral lineages and the PA ancestor together with PB-2 (Fig 5A and 5B). Interestingly, gene flow between S. cerevisiae and the HOL lineage was found, suggesting additional signatures of ancestral hybridizations between both species, or a close sister species to S. cerevisiae (S4C Fig and S4D Fig).
Next, we evaluated gene flow among S. eubayanus populations by calculating f4 statistics. As Treemix results suggested gene flow between S. eubayanus populations with different Saccharomyces sp, we did not use S. uvarum or S. cerevisae as outgroups. Instead, we calculated f4 statistics using only S. eubayanus populations and we tested whether f4 values agreed with S. eubayanus admixture models based on the phylogenetic tree topology, this is (PA,(HOL,(PB2,(PB1,PB3). We found a large discrepancy between calculated f4 values with the expected f4 values of a non-admixture model, suggesting the presence of admixture events between S. eubayanus populations (Fig 5C and 5D).
Next, we incorporated the three admixture events suggested by Treemix which improved the fit to f4 values (Model 2, S4E Fig, S5C Table), yet by adding one additional admixture event (PB-1—PA) we obtained Model 3 which had the lowest minimal error (0.000045, Fig 5E and 5F). In Model 3 ancestral populations of PB-1 and PB-2 had independent events of admixture with both, PA and HOL ancestral populations. Alternative models that could explain gene flow of the Patagonia B populations to either PA or HOL using fewer admixture events did not show a better fit (S5D Table). Altogether, our results demonstrate extensive gene flow and admixture between lineages. Furthermore, the current lineages do not correspond to the outcome of recent admixture events between any of the other lineages, in agreement with migration edges inferred by Treemix.
To compare the genome content, we generated de novo assemblies and constructed the pangenome across all isolates (S5E Table). We identified 5,497 non redundant pangenomic ORFs in the species. Out of these, 5,233 ORFs are core systematically present in all of the isolates, while 264 are dispensable (S5F Table), being only found in subsets of strains. A PCA analysis of the presence/absence profile of core genes was used to visualize potential overlap between genome content similarities and SNPs distance (S5A Fig). In partial concordance with the phylogenetic tree, the branch harbouring the Chinese isolate, which also contained two North American isolates, represented the most divergent lineage, suggesting a recent migration event between Asia and America.
Overall, we were able to identify nine ORFs present in a group of nine closely related isolates belonging to the PB-2 lineage, representing a candidate lateral gene transfer (LGT) event from other specie(s). Interestingly, six of the strains were isolated from the same location, and these genes were assembled within a region of 7.3 kb on a single contig for each of the six isolates (S. eubayanus Region A, S5B Fig). The sequences of these contigs could not be found in the NCBI non-redundant database, with the exception of a 7 kb region, upstream to the group of ORFs, which matches the subtelomeric end of chromosome I. Without a match in another species, we were not able to confidently assign these genes to an LGT event. However, their subtelomeric localisation, together with their assembly on a single continuous region in a subset of isolates represent hallmarks of LGT identified in other Saccharomyces species [5, 27]. Subsequently, SMART (Simple Modular Architecture Research Tool) was used to identify known PFAM protein domains in these regions. We found several putative proteins, with domains such as: arginase domain, a membrane transport protein domain, a Fungal specific transcription factor domain, a Gal4-like dimerization domain and a transmembrane one (S5C Fig and S5G Table). In addition to these high-confidence hits, SMART identified the potential presence of a glucosidase domain and a homing endonuclease domain in four and two ORFs, respectively (S5C Fig and S5G Table).
Furthermore, we also identified 64 private ORFs in the Chinese/North American group of strains, where at least 27 (S5H Table) correspond to orthologs of other genes found in the lager S. eubayanus genome (showing an average sequence divergence of 12.4%). These features closely resemble those introgressions found from S. paradoxus into specific S. cerevisiae lineages . However, no potential donor species was found and hence we cannot completely rule out different evolutionary origins, such incomplete lineage sorting and/or balancing selection.
S. eubayanus phenotypic diversity was assessed in a set of 89 isolates from North America (N = 5), and South America: Altos de Lircay (N = 9), Nahuelbuta (N = 10), Villarrica (N = 10), Choshuenco (N = 10), Puyehue (N = 6), Osorno Volcano (N = 6), Coyhaique (N = 9), Torres del Paine (N = 9), Magallanes (N = 5), Karukinka (N = 9), Argentina (N = 1) (Fig 1A, S6A Table). A clustered heat map of the phenotypic correlations between yeast isolates across all traits for μmax (corresponding to the maximum specific growth rate S6B Table, Fig 6A) showed that North American Holarctic strains clustered separately with low μmax and maxOD scores (maximum OD growth recorded) across traits, while the CBS12357T strain clustered together with PB-1 strains from Torres del Paine. This clustering was further supported by the principal component analysis, both of which produced the main groups split by localities, rather than lineages, where only PB-2 isolates grouped together (S6 Fig). Specifically, only for certain phenotypes did we find differences among localities (S6C Table).
Subsequently, given the importance of S. eubayanus in lager brewing, we then evaluated the fermentation performance of the same set of strains previously phenotyped and used the S. pastorianus W34/70 control strain from lager as positive fermentation control. The conditions included 12°P wort at 12°C in 50 mL (micro-fermentations batches) and CO2 loss was recorded every day and metabolite consumption was estimated at the end of the fermentation process. In all cases, the lager control showed better fermentation performance compared to all S. eubayanus isolates (q-value < 0.05, S7A Table), yet several strains showed high CO2 release levels, demonstrating the beer fermentation potential of some strains. Overall, we observed that isolates obtained at lower latitudes (Central region) released significantly greater CO2 levels than individuals obtained at higher latitudes (extreme South, S7B Table, Pearson r = 0.566, p-value < 0.001, Fig 6B). Furthermore, we found that fermentation performance was directly correlated with maltose sugar consumption (Pearson r = 0.52, p-value < 6 x 10−8, S7C Table, Fig 6C), and ethanol production (Pearson r = 0.56, p-value < 2 x 10−6).
Our results, and those of other authors (see [19, 22]), strongly suggest that S. eubayanus preferentially associates with Nothofagus trees, particularly N. pumilio , the most cold-adapted member of this tree genus [28, 29]. Under this scenario, the biogeographical history of S. eubayanus should logically and strongly correlate with the history of Nothofagus dispersal across the globe. Interestingly, the isolation frequency of S. eubayanus was correlated with latitude. PB-1 is located at higher latitudes (and lower altitudes) and the isolation frequency of this lineage was lower than that of the other two major lineages, PB-2 and PB-3, which were easily recovered from the environment and were specifically associated with high altitude N. pumilio trees (Fig 1 ). The lower frequency of isolates found in samples from Tierra del Fuego could be due to the extreme environmental conditions found in this part of the continent, where average temperatures are below 5°C throughout most of the year . However, the widespread distribution and higher abundance of N. pumilio in Tierra del Fuego compared to northern areas may facilitate the survival, range distribution, and habitat colonization of S. eubayanus , thus increasing its population size and genetic diversity. The isolation frequency of S. eubayanus in Patagonia was three times higher than that reported in China and North America, where a similar number of bark samples and sampling sites were surveyed [23–25]. Our results are also in agreement with those reported for S. eubayanus in Argentina, where high isolation rates in N. pumilio trees were also found . The genus Nothofagus originated during the late Cretaceous-Early Tertiary interchange (ca. 135 MYA); between Southeast Asia and Australia, from a “fagalean” complex in Southeast Asia [28, 32]. At the time, Antarctica was in a more northern position connecting South America, Tasmania, Australia and New Zealand, and had a warm-humid climate (”mesothermal conditions”, sensu ). Subsequently, a step-wise wave of dispersion-colonization of Nothofagus in a westward direction reached South America. This scenario is consistent with a colonization of the S. eubayanus and S. uvarum ancestor across the South Hemisphere [23, 32].
The phylogenomic analysis presented here of wild S. eubayanus (Fig 2) reveals a higher genetic diversity for the PB South American lineages than that reported for S. eubayanus in the Northern Hemisphere [23, 25]. Indeed, our results resemble those obtained in the sister species for S. uvarum, where similar sequence divergence among S. uvarum populations was described , whereas in Chinese S. cerevisiae , a much higher genetic diversity is observed compared to other regions of the world . Interestingly, based on the levels of genetic diversity and heterozygosity, our results support the idea that PB-1 S. eubayanus from Tierra del Fuego is the most represented population in Patagonia, and likely within the species (Fig 2D). This lineage shows high levels of hybridization/introgression into northern populations. The higher nucleotide diversity of PB-1 was supported by a trend of greater genetic diversity among individuals sampled from Coyhaique, Torres del Paine, Magallanes and Karukinka sites (Fig 3B), where the dispersal of N. pumilio is also greater than in more northern regions. However, since the three different lineages were geographically segregated, it is not possible to attribute increased genetic diversity exclusively to latitude or to the nature of PB-1, as all samples isolated at latitudes higher than 45°S were consistently assigned to PB-1. Nevertheless, our dense sampling procedure, together with other findings, suggest that other lineages are not present in the extreme south, therefore, we cannot determine whether the observed increment of genetic diversity with latitude is a general pattern [20, 23]. A subgroup of PB-1 isolates is found at lower latitudes, which means that PB-1, PB-2 and PB-3 are sympatric in northern regions. Furthermore, we found discrepancies between our genetic diversity estimates and previous reports , probably because only non-admixed isolates were used in our study, while previous datasets have utilized admixed genomes, inflating the estimated genetic diversity.
This scenario is also consistent with the idea that S. eubayanus cryotolerance evolved recently, as N. pumilio (its preferred environment) became secondarily adapted to cold during the orogenesis of the Andes [28, 29]. Also, it is now accepted that the distribution of Nothofagus (and subsequently S. eubayanus ) was already established in Patagonia before the onset of the last glacial maximum (ca 20,000 years ago), when ice sheets covered most landmasses south of 43º. After this glacial period, there was massive floristic recolonization and ecological succession [29, 34]. Our results are consistent with this second scenario (colonization from peripheral glacial refugia from the South) since we found lower genetic diversity in populations located in central Chile (Altos de Lircay & Nahuelbuta) compared to populations found in southern Chile (Coyhaique, Torres del Paine, Magallanes and Karukinka). Interestingly, a different pattern is observed in the eastern side of the Andes (Argentina), where a lower number of peripheral glacial refugia occurred and most of the diversity would originate from Valleys refugia in northern sites . Indeed, in Argentina there is greater S. eubayanus genetic diversity north of 43°, where different populations congregate in a single geographic location .
Our STRUCTURE analyses provide evidence of contact and admixture between PB-1 and the other two PB lineages (Fig 2B). Overall, all of the admixed strains from Chile, New Zealand and North America contained regions from the PB-1 lineage, that together with our population genetics analysis and admixture model between populations, suggest that PB-1 could be the most widespread lineage reported to date, including admixture with the Holarctic lineage. Indeed, higher Fis values are found in the PB-1 cluster, compared to PB-2 and PB-3, suggesting that meiosis and inbreeding are more frequent in PB-1. Our admixture analysis demonstrates recurrent hybridization between ancestral lineages, supporting a scenario in which gene flow among Patagonian and Holarctic ancestral populations took place, and that is consistent with the co-occurrence of these populations in the past (Fig 5). Similar evidence of admixture in other yeast species, such as S. cerevisiae, S. uvarum, and S. paradoxus was previously reported [13–15]. Interestingly, we also detected evidence of ancestral gene flow between S. cerevisiae (or a close sister species) and the S. eubayanus HOL lineage. This demonstrates a pervasive contact between Saccharomyces populations and constant admixture, likely facilitated by their dispersal by insect’s in nutrient-rich periods of the year .
In summary, our results provide compelling evidence of the successful colonization and distribution of S. eubayanus in Patagonia, in the cold conditions of the southern hemisphere, representing most of the current extensive genetic diversity found in this species. The majority of the S. eubayanus strains collected around the world belong to the Patagonian cluster. This Out-of-Patagonia colonization has extended to the Northern Hemisphere, including a recently-found subset in China, and Oceania. Finally, our data concerning S. eubayanus, together with previous evidence in S. uvarum , leads us to propose that the ancestor of both species adapted to the environmental conditions in the Southern Hemisphere. However, the currently available data is insufficient to draw further conclusions regarding this claim, warranting future studies and especially experimental evidence for inferring local adaptation (e.g., fitness and common garden experiments, see ).
Bark samples from ‘lenga’ (Nothofagus pumilio), coigüe (N.dombeyi) and ‘ñirre’ (N. Antarctica) and Araucaria araucana were obtained aseptically from ten sampling sites in Chile (collection date, Fig 1 ): Sampling was performed as previously described . Isolated colonies were stored in glycerol 20% v/v and stored at -80°C in the Molecular Genetics Laboratory yeast collection at Universidad de Santiago de Chile. Details on the localities and the yeast isolation procedures are available in S1 Text.
We amplified and sequenced the internal transcribed spacer region (ITS) to identify colonies to the genus level . Saccharomyces species identification was conducted using the polymorphic marker GSY1 and RIP1 through amplification and enzyme restriction (see details in ). In many cases, species identification was confirmed by Sanger-sequencing of the ITS region, which was attained using a BLASTN against the Genbank database under 100% identity as threshold. DNA content was analysed as previously described .
One isolate per tree was considered for DNA sequencing. DNA was obtained using a Qiagen Genomic-tip 20/G kit (Qiagen, Hilden, Germany). The library prep reaction used was a 100x miniaturized version of the Illumina Nextera method. Samples were sequenced on a NextSeq 500/550 High Output Kit v2.5 (300 Cycles) flow cell. Reads were processed with fastp 0.19.4 (-l 37–3) [38, 39]. Reads were aligned against the Saccharomyces eubayanus CBS12357T reference genome  using BWA-mem .
Mapping files were tagged for duplicates using Picard tools 2.18.14 (http://broadinstitute.github.io/picard/). Variant calling and filtering was done with GATK version 184.108.40.206 . For all datasets, we only considered SNPs that had no missing data using vcftools option–max-missing 1. Furthermore, prior to making a dataset without missing SNPs, for gene flow analyses we excluded individuals that had more than 5% of missing sites, which included the lager strains, admixed strain yHKS212, and PB strains from Argentina. The effect of each variant was assessed and annotated with SnpEff version 4.3t , using an updated version of S. eubayanus gene annotations 
To perform phylogeny on our SNP dataset, a VCF file containing 606,656 bialllelic SNPs was converted to phylip format and used as input for IQ-TREE  to generate a maximum likelihood phylogeny with the ultrafast bootstrap option and ascertain bias correction (-st DNA -o 1105.1_Nahuelbuta -m GTR+ASC -nt 8 -bb 1000). The number of parsimony informative sites was 156,051. Trees were visualized in the iTOL website (http://itol.embl.de). For STRUCTURE analysis, a thinned VCF file was generated with vcftools 0.1.15 (—thin 1000), containing 9,885 similarly-spaced SNPs, while including only S. eubayanus strains. Structure was run five times (K ranging from 3 to 7), with 10,000 burn-in and 100,000 replications for each run. Optimal K values were obtained using structure-selector (http://lmme.qdio.ac.cn/StructureSelector/)  according to the Evanno method . The resulting plots were obtained using CLUMPAK  and visualized using structure plot (http://omicsspeaks.com/strplot2/) . We performed clustering analyses of the same samples by using SMARTPCA without outlier removal . For fineSTRUCTURE analysis , a non-thinned VCF file was phased using BEAGLE 3.0.4 . We used a constant recombination rate between consecutive SNPs based on S. cerevisae average recombination rate (0.4 cM/kbp, ). Chromosomal painting was performed with Chromopainter V2, and its output was further analysed with fineSTRUCTURE (-x 100000 -y 100000 -z 1000).
Historical and recent admixture between populations of S. eubayanus was tested with Treemix  and ADMIXTOOLS (S2 Text ) . For Treemix we analysed only S. eubayanus individuals that did not show any sign of recent admixture according to STRUCTURE results, plus the S. uvarum and a S. cerevisiae individual were kept as outgroups. In addition, we pruned out SNPs that were in linkage disequilibrium using PLINK (—indep-pairwise 50 10 0.2). We dissected the five S. eubayanus populations into subpopulations according to geographical locality and clusters obtained with fineSTRUCTURE as criteria. Treemix was first run ten times for each value of m (migration events) ranging from 1 to 6 (-noss–k 500) and two optimal m values (2 and 4) were estimated using the optM R package (https://cran.r-project.org/web/packages/OptM/index.html) (S3 Table ). Treemix was subsequently run 100 times (2 and 4 migrations, -noss–k 500) after which a consensus tree and bootstrap values were obtained using the BITE R package . We calculated f4 statistics between PA, PB-1, PB-2, PB-3, and HOL populations using the r package admixr . Admixture graph fitting of the calculated f4 statistics was done using the R package admixturegraph . Seven models were tested which were ranked according to their minimal error values (S5D Table).
We estimated π and Tajima’s D using the R packages PopGenome 2.6.0 . Values of Fst were calculated with StAMPP 1.5.1 Weir and Cockerham's unbiased estimator [60, 61] to obtain 95% confidence intervals by performing 5,000 bootstraps. LD decay was estimated by calculating R2 values using vcftools (--geno-r2 -ld-window -bp 100000), which were imported into R to calculate a regression according to , for which the half decay was estimated (Ldmax/2).
The variants of the S. eubayanus strains showing recent genetic admixture were split to bins of 100 SNPs and each bin was assigned to target populations using adegenet’s hyb.pred function . GLOBETROTTER was run using the inputs generated for fineSTRUCTURE analyses (NULL IND = 0).
The R package hierfstat  was used to calculate Fis , Hs, and Ho by using the basic.stats function. To perform a Mantel test, first the Nei’s genetic distances between subpopulations (considering localities) was calculated with the R package StAMPP .
Isolates were assembled with Spades using k from 21 to 67. To detect the non-reference material we used the custom pipeline based on the method described in . Potential lateral transferred ORFs were identified by blast search against an in-house database of 57 yeasts ORFeomes and to the currently available genomes from the yeast1000+ genome project (https://y1000plus.wei.wisc.edu/). SMART (Simple Modular Architecture Research Tool), used in GENOMIC mode, was used to identify known PFAM protein domains and homologies  using all the optional features: Outlier homologues, PFAM domains, signal peptides and internal repeats.
The microcultivation phenotyping assay of the S. eubayanus strains was performed as previously described . Briefly, isolates were pre-cultivated in 200 μL of YNB medium supplemented with glucose 2% for 48h at 25°C. For the experimental assay, strains were inoculated to an optical density (OD) of 0.03–0.1 (wavelenght 630 nm) in 200 μL of media and incubated without agitation at 25°C for 24 h (YNB control) and 48 h for other conditions in a Tecan Sunrise absorbance microplate reader. OD was measured every 20 minutes using a 630 nm filter. Each experiment was performed in quadruplicate. Maximum growth rate, lag time and OD max for each strain were calculated using GrowthRates software with default parameters . Fermentations were conducted using a 12°P high-gravity wort at 12°C in 50 mL (micro-fermentations) using a Munton's Connoisseurs Pilsner Lager kit (Muntons plc, England). 50 mL of fresh wort were inoculated to a final concentration of 15 × 106 viable cells/mL and fermentations were maintained for 14 days and weighed every day to calculate the CO2 output. At the end of the fermentation, metabolites were determined using HPLC.
We would like to thank Valentina Abarca, Wladimir Mardones and Antonio Molina for technical help and Yessica Pérez, Antonia Nespolo, Natalia Hassan and Verónica Briceño for helping us in the sampling field trips recognising Nothofagus trees. We also thank Gilles Fischer for constructive feedback on the data analysis and the manuscript. We thank CONAF and WCS Chile for allowing us sampling yeasts across the country. Finally, we thank Ginkgobioworks for generating the sequence data, and Michael Handford (Universidad de Chile) for language support.