Abstract
MacInnis, Martin J., and Rupert, Jim L. 'ome on the range: altitude adaptation, positive selection, and Tibetan genomics. High Alt. Med. & Biol. 12:133–139, 2011. In 2010, a number of papers were published describing data from genome-wide studies designed to identify genes and genetic variants that contribute (or contributed) to human adaptation to altitude in the Himalaya. The results were exciting, intriguing, and controversial. Several genes, most notably EGLN1 and EPAS1, were identified as strong candidates for a role in evolutionary adaptation to high altitude, and the time course over which this adaptation occurred was calculated by one team to be remarkably brief. Overall, the data suggest that, at least in the ancestors of the modern Tibetans, there was a powerful selective pressure favoring variants in genes central to the molecular response to hypoxia. The most obvious manifestation of this selection seems to be the Tibetans' well-known blunted erythropoietic response to hypoxemia. This article briefly reviews recent developments in ‘omic’ analysis of Tibetan highland natives, with a focus both on the answers found and the questions raised.
Background: Selection and Time
Neutral selection likely contributes to most of the genetic variation between populations, especially in cases where the population is small (or is descended from a small number of founders) and isolated (physically or culturally). Unlike directional selection, which acts on specific genetic loci (or linked regions), neutral selection can affect the entire genome. Strong and recent positive selection at one locus can lead to selective sweeps, whereby alleles linked to the positively selected locus become more prevalent (genetic hitchhiking), appearing as a positively selected haplotype dominating the population (contrast this with neutral evolution, where linkage disequilibrium would decay over time, resulting in greater haplotype diversity).
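The decay of linkage under neutrality can be made concrete with a small calculation. The sketch below is illustrative only: it assumes the textbook expectation that the disequilibrium coefficient D shrinks by a factor of (1 - r) per generation of random mating, where r is the recombination fraction between two loci. The function name and parameters are hypothetical and are not taken from any of the studies reviewed here.

```python
def ld_after_generations(d0: float, recomb_rate: float, generations: int) -> float:
    """Expected linkage disequilibrium D after a number of generations.

    Assumes random mating, no selection, and a constant recombination
    fraction, so that D_t = D_0 * (1 - r) ** t.
    """
    if not 0.0 <= recomb_rate <= 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5]")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return d0 * (1.0 - recomb_rate) ** generations


# Tightly linked loci (small r) retain disequilibrium far longer than
# loosely linked loci, which is why a recent sweep leaves a long haplotype.
tight = ld_after_generations(0.2, 0.001, 100)
loose = ld_after_generations(0.2, 0.01, 100)
```

Under these assumptions, a long haplotype surrounding a selected variant implies that too few generations have passed for recombination to break it down, which is the logic behind treating extended haplotypes as a signature of recent selection.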
The critical importance of selecting correct control populations for genetic studies is owing in part to neutral selection. Drawing conclusions about the significance of the enrichment of an allele in one population compared with another must take into consideration the background differences between the two populations. For this reason, the ideal controls are populations that only recently diverged, are relatively isolated from the population of interest, and do not share the environmental conditions of interest.
Recent Investigations of the Tibetan High Altitude Genome
The Tibetan plateau lies between 3200 and 4300 m with barometric pressures ranging between ∼450 and 525 mmHg. The principal environmental stress faced by populations living on the plateau is the reduced availability of oxygen. At 4300 m, the P
HapMap populations: CEU: CEPH Northern/Western European ancestry; YRI: Yoruba, Ibadan, Nigeria; CHB: Han Chinese, Beijing; JPT: Japanese, Tokyo.
Highland Tibetans and lowland Han Chinese are thought to share a common ancestry, although the time at which the two populations diverged is a topic of both research and debate (Aldenderfer, 2011). Each of the six studies compared Tibetan and lowland Asian populations (Chinese and occasionally Japanese). By using populations closely related to the Tibetans as controls, the researchers tried to minimize the number of genetic differences between the two populations that were unrelated to the Tibetans' high altitude environment. In other words, because the Tibetans separated from the Chinese relatively recently, any differences observed in the genetic structure of the two populations are more likely owing to natural selection that occurred in their respective environments. By comparing the frequency of individual variants between populations or the patterns of regional linkage disequilibrium and haplotype diversity between populations, each study attempted to identify genetic signatures of selection in the Tibetans. Several of the studies supplemented their genetic findings with tests of association with phenotypes characteristic of the high altitude Tibetans, that is, hematological characteristics. While the techniques used by the studies differ slightly (see Table 1), their unanimous identification of EPAS1 and near-unanimous identification of EGLN1 as playing a role in high altitude adaptation strongly supports a central role of variants in these genes in the success of Tibetans (Table 2). A brief summary of these six papers follows.
To determine if there were any genetic variants that could account for the Tibetans' known high altitude phenotype, Beall and colleagues (2010) used a genome-wide allele differentiation scan (GWADS) to compare genotypes at over 500,000 single nucleotide polymorphisms (SNPs) between Han Chinese (using data from the HapMap database; <http://hapmap.ncbi.nlm.nih.gov/>) and Tibetans (living between 3200 and 3500 m in Yunnan province). Of the half-million SNPs tested, alleles at 8 were found to be significantly more common in the Tibetans, all of which were clustered on chromosome 2 near EPAS1,¹ the gene that encodes hypoxia inducible factor 2 alpha (HIF2α). Many of the variants were linked (i.e., on the same haplotype), suggesting that they were moving between generations together, so selection for one could have enriched all of them in the population (i.e., a selective sweep). A subsequent screen of 103 polymorphisms in EPAS1 (including the 8 initially identified SNPs) identified 31 SNPs with a significant association between the major (i.e., most common) allele in Tibetans and lower hemoglobin concentrations, thus establishing a potential phenotype: the blunted erythropoietic response to altitude characteristic of Tibetans. An additional replication identifying associations between 32 EPAS1 variants and hemoglobin concentration in Tibetans added support to this finding.
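As a rough illustration of what testing a single SNP for allele-frequency differentiation involves, the sketch below applies a 2×2 chi-square test to allele counts from two populations and compares the p-value with a Bonferroni threshold for 500,000 tests. This is an assumption made for illustration: the statistical procedure and significance threshold actually used in the GWADS are not described in this review, and the function name and example counts are hypothetical.

```python
import math
from typing import Tuple


def allele_chi_square(a1: int, a2: int, b1: int, b2: int) -> Tuple[float, float]:
    """Chi-square test (1 df) on a 2x2 table of allele counts.

    a1, a2: counts of allele 1 and allele 2 in population A
    b1, b2: counts of allele 1 and allele 2 in population B
    Returns (chi-square statistic, p-value).
    """
    n = a1 + a2 + b1 + b2
    row_a, row_b = a1 + a2, b1 + b2
    col_1, col_2 = a1 + b1, a2 + b2
    if min(row_a, row_b, col_1, col_2) == 0:
        raise ValueError("each population and each allele must be observed")
    chi2 = n * (a1 * b2 - a2 * b1) ** 2 / (row_a * row_b * col_1 * col_2)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: allele 1 is common in population A, rare in B.
N_TESTS = 500_000                      # approximate number of SNPs scanned
BONFERRONI_ALPHA = 0.05 / N_TESTS      # assumed genome-wide threshold
chi2, p = allele_chi_square(90, 10, 10, 90)
is_genome_wide_significant = p < BONFERRONI_ALPHA
```

The stringent threshold reflects the multiple-testing burden of a genome-wide scan: only very large frequency differences, such as those reported near EPAS1, survive correction across half a million SNPs.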
Similarly, Simonson and colleagues (2010) compared Tibetan genome data with that of lowland Asian populations (Chinese and Japanese), but focused on variants in a subset of candidate genes that they predicted to be involved in altitude adaptation. Rather than look for individual variants, the researchers looked for chromosomal regions in or near the candidate genes in which there was limited variation, which is consistent with a selective sweep. Positive regions were then scanned for the 247 a priori candidate genes. These genes were selected from Gene Ontology and Panther databases because of involvement in oxygen homeostasis pathways or other pathways suspected to play a role in high altitude adaptation (e.g., nitric oxide metabolic processes and vasodilation). Ten genes, six of which are related to the HIF pathway, including EPAS1, the gene identified by the Beall group, were located to these regions, indicating that at least one variant in each has undergone recent and strong positive selection, presumably because of the high altitude environment. In addition to corroborating the Beall genetic results, variants in two of the genes (PPARA and EGLN1) were shown to associate with hemoglobin concentration, again supporting selection for a high altitude hematological phenotype.
Yi and colleagues (2010) took a parallel strategy to the previous studies and compared the sequences of the exome, the exonic regions of the genome (i.e., the DNA sequences that are incorporated into the final protein-encoding (or self-functional) RNAs), between Chinese (again using the HapMap database) and two independent cohorts of Tibetans, with a Danish outgroup. Using population branch statistics (PBS; pairwise comparisons of allele frequencies within genes), EPAS1 was identified as the strongest candidate for natural selection and, furthermore, an EPAS1 variant much more common in Tibetans was associated with hemoglobin concentration and erythrocyte count, consistent with selection for variants contributing to differences in hematology at altitude. From this study, a total of 34 genes involved in the response to hypoxia (based on the Gene Ontology database) had greater PBS values than the genome-wide average.
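The population branch statistic can be sketched in a few lines. The version below uses the commonly cited form, in which each pairwise FST is converted to a branch length T = -log(1 - FST) and the Tibetan-specific branch is (T_TH + T_TO - T_HO)/2, with H denoting Han and O the outgroup. This review does not give the formula, so the code should be read as an assumption about the general method rather than a reproduction of the published analysis; the function names are hypothetical.

```python
import math


def fst_to_branch_length(fst: float) -> float:
    """Convert a pairwise FST into a divergence-time-like branch length."""
    if not 0.0 <= fst < 1.0:
        raise ValueError("FST must lie in [0, 1)")
    return -math.log(1.0 - fst)


def population_branch_statistic(fst_tib_han: float,
                                fst_tib_out: float,
                                fst_han_out: float) -> float:
    """Length of the branch leading to Tibetans in a three-population tree.

    A large value means allele frequencies changed mostly on the Tibetan
    lineage, as expected for a locus under selection in Tibetans only.
    """
    t_th = fst_to_branch_length(fst_tib_han)
    t_to = fst_to_branch_length(fst_tib_out)
    t_ho = fst_to_branch_length(fst_han_out)
    return (t_th + t_to - t_ho) / 2.0
```

The outgroup is what distinguishes change on the Tibetan lineage from change on the Han lineage: if Tibetans differ strongly from both Han and the outgroup while Han and the outgroup remain similar, the statistic is large.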
Additional support for a role of EPAS1 in Tibetan adaptation came from the work of Bigham and colleagues (2010), who performed high-density (almost 1 million SNP) scans of genomes from natives of three regions of Tibet (ranging in altitude between 3000 and 4400 m). The data were analyzed in 1-megabase blocks, and 1 of the 14 blocks showing significant variation from the corresponding region in lowland East Asians (again, data from the HapMap) corresponded to the same region of chromosome 2 that contains EPAS1. A more-targeted examination of genes in the HIF and RAS pathways and the globin family identified EGLN1 as being positively selected for in the Tibetan population.
Further evidence for a role of EPAS1 and EGLN1 in Tibetan adaptation was provided by a study by Peng and colleagues (2010), who SNP-typed 50 individuals from seven Tibetan communities using the Affymetrix chip 6.0 array that was used by Bigham and colleagues (2010) and compared them to the HapMap (CHB; Chinese, Beijing) data. EPAS1 was one of many genes that underwent a selective sweep in Tibetans; and because it was identified by previous studies, Peng and colleagues (2010) sequenced the EPAS1 gene in the same 50 Tibetans, showing that frequencies of 69 of 125 HapMap SNPs were significantly different between Tibetans and CHB/JPT (Japanese, Tokyo) populations. In addition, three polymorphisms from each of EPAS1, EGLN1, and PPARA were genotyped in 1334 Tibetans. Allele frequencies at all three EPAS1 SNPs, one of three EGLN1 SNPs, but none of the PPARA SNPs were significantly different between Tibetans and CHB/JPT, providing evidence for strong selection in EPAS1 and weak selection in EGLN1.
Finally, Xu and colleagues (2010) compared genome-wide allele frequencies of Tibetans and Han Chinese and identified 6 EGLN1 and 25 EPAS1 SNPs among the top 0.0001% of SNPs with an FST >0.30. Each gene was located in a region shown to have greater than expected linkage disequilibrium and reduced haplotype diversity, consistent with a dominant haplotype and evidence of a selective sweep. The dominant haplotypes of these two genes were much more common in Tibetans than in other populations throughout the world. Of interest, the frequency of the dominant haplotype of EPAS1 and EGLN1 in East Asians correlated positively with altitude.
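To give a sense of what an FST above 0.30 means in terms of allele frequencies, the following sketch computes a simple two-population FST for a biallelic SNP from expected heterozygosities. It assumes equal weighting of the two populations and applies no sample-size correction; the estimator actually used by Xu and colleagues is not specified in this review, and the function name is hypothetical.

```python
def two_population_fst(p1: float, p2: float) -> float:
    """Wright's FST for one biallelic SNP in two equally weighted populations.

    p1, p2: frequency of the same allele in population 1 and population 2.
    FST = (H_T - H_S) / H_T, where H_T is the expected heterozygosity of the
    pooled population and H_S is the mean within-population heterozygosity.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # SNP is monomorphic in both populations
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    return (h_total - h_within) / h_total
```

With this simplified estimator, frequencies of 0.1 and 0.9 give an FST of 0.64, whereas frequencies of 0.4 and 0.6 give only 0.04, so an FST above 0.30 between two closely related populations indicates a pronounced frequency difference at that SNP.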
None of the variants in EPAS1 or EGLN1 was fixed in the Tibetans. This could be owing to insufficient selection (some combination of conferred fitness and time under selection), admixture from other populations reintroducing the lowland allele, or balancing selection, which would occur if there were advantages to both variants and selection established a new environment-specific equilibrium. The last possibility warrants consideration when searching for phenotypes associated with the alleles. The commonality of one allele in lowlanders may mean that it confers an advantage in nonhypoxic conditions.
EPAS1 and EGLN1
EPAS1 encodes endothelial PAS domain protein 1 (EPAS1), which is also known as HIF2α. EPAS1, which is induced by low oxygen concentrations, is a transcription factor that induces the expression of genes regulated by oxygen. As a member of the HIF family of transcription factors, EPAS1 is one of three alternate α subunits that dimerize with a β subunit. This heterodimer recognizes and binds the hypoxia response element (HRE) in target gene promoters, inducing transcription. The β subunit is constitutively expressed, but the expression of the α subunit depends on oxygen concentration. Under normoxic conditions, EPAS1 is presumably hydroxylated at specific proline residues by EGLN1, EGLN2, and/or EGLN3. Proline hydroxylation increases the interaction between EPAS1 and the von Hippel–Lindau protein (VHL), which increases ubiquitination and degradation. Hypoxic conditions, on the other hand, impair hydroxylation, ubiquitination, and degradation, stabilizing EPAS1. Thus, the protein products of EPAS1 and EGLN1 are part of the pathway that allows cells to regulate gene expression in response to oxygen concentration. Their relatively upstream position in the molecular oxygen-sensing pathway is strong support for their potential role in genetic adaptation to altitude: seemingly minor changes in the activity or expression of these proteins may have widespread (pleiotropic) effects on the phenotype. Multiple phenotypes can result from mutations in regulatory systems that mediate the expression of multiple genes. Whether this is the case for EPAS1 is unknown, as hemoglobin concentration, oxygen saturation, and erythrocyte count were the only phenotypes tested. Successful altitude adaptation in the Tibetans may be owing to the additive effect of a number of small advantages, each a manifestation of the EPAS1 variant in its own respective pathway.
This is in keeping with models of evolution that postulate that many adaptive changes are polygenic in origin (i.e., result from a number of minor changes in a variety of genes).
Other highland genomes
The Tibetan plateau is not the only highland region colonized by humans, and there is evidence that a number of different strategies have evolved in human populations to deal with environmental hypoxia. There is a strong consensus in the recent genomic studies of Tibetans that variants in EPAS1 and EGLN1 play(ed) a role in adaptation in that population, raising the question of whether the same genetic variants are active in other highlanders. In addition to their Tibetan cohort, Bigham and colleagues (2010) also interrogated the genomes of two South American indigenous high altitude populations: the Quechua and Aymara. The Andean altiplano, which ranges between 3200 and 4500 m, has been occupied for at least 12,000 yr, and native Andeans are thought to be well adapted to altitude, albeit in ways different from their Himalayan counterparts. Bigham and colleagues (2010) identified more chromosomal regions showing evidence of positive selection in the Andeans than in the Tibetans (37 vs. 14), with none of the regions shared. These data suggest that the genetic basis of altitude adaptation is dissimilar in the two populations; however, analysis at the single-gene level in selected pathways (HIF, RAS, and globin) detected evidence for selection of EGLN1 in the Andeans. Given that the populations likely diverged at least 20,000 yr ago and that maintenance of a shared high altitude genotype during the migration of the Andeans' forebears across Beringia and south across the North American plains (or along a now-drowned western coast) seems unlikely, EGLN1 variants may have been selected for independently in the two populations upon their initial advance into the high plateaus. Further evidence to support this point stems from their finding that each population has a single but unique dominant haplotype in the region surrounding EGLN1.
The dissimilarity between the specific variants ostensibly selected for in the two populations argues that the functional variant(s) has yet to be identified and, because the two populations seem to differ phenotypically, the issue of whether EGLN1 contributes to both Andean and Tibetan adaptive strategies in some way or whether both strategies manifest in the New World is a question worth addressing. If there are multiple ways to deal with environmental hypoxia, there is no reason to expect that only one would be present in a population. Whether Andeans with low hematocrits are more similar to Tibetans than to other Andeans at the putative adaptive loci (or the corollary, that Tibetans with high hematocrits are more Andean-like) could be a fruitful line of investigation.
HIF1A, the missing gene
None of the studies identified any variant in HIF1A, the gene that encodes hypoxia inducible factor 1α (HIF1α), as having been a target of selection (Tissot van Patot and Gassmann, 2011). HIF1α is thought to be the central mediator of the hypoxia response and, as such, would be anticipated to contribute to an altitude-adapted genotype. One explanation may be that HIF1α controls core responses that have limited latitude for variation and thus remains (relatively) functionally immutable, whereas HIF2α serves the role of the adaptable regulator. The concept that gene duplication provides redundant copies of a gene on which selection can act is a central tenet of molecular evolutionary theory (Wills, 2011). Both HIF1α and HIF2α are in the same family and have similar function. Selection acting only on HIF2α provides some clues to the favored phenotype(s). The hematological data presented in Beall and colleagues (2010) suggest that lower hematocrit may be part of the adapted phenotype. High hematocrit increases blood viscosity, which in turn forces the heart to work harder to maintain cardiac output, thereby increasing the risk of cardiac failure. Most populations maintain higher hematocrit under hypoxic conditions, which, if excessive, can have pathological consequences, such as the development of chronic mountain sickness (CMS), a condition largely unknown in Tibetans but relatively common in Andeans. A blunted altitude-induced erythropoiesis might be beneficial to highlanders, but it seems to be an insufficient advantage to drive selection at the rate it appears to have occurred in the Himalayas (Aldenderfer, 2011). Polycythemias (including CMS) can be fatal, but hematocrit values observed in Andeans, while high, do not tend to reach levels that are associated with pathological consequences, making hematocrit an unlikely cause for the substantial differential mortality needed to drive rapid selection in this population.
Although CMS is undoubtedly a clinical problem in the Andes, the relatively late onset of the condition would likely limit its selective effect. We are not aware of data demonstrating an effect of moderately high hematocrits on reproductive success, which could exert strong selective pressures in utero. Even if there is not strong selection for a blunted hematopoietic response in Andeans, selection for hematocrit in the Tibetans is not precluded; the Andeans could have some compensatory mechanism to offset the hematological consequences of elevated hematocrit. HIF1α is expressed in all cell types, but EPAS1 (HIF2α) expression is specific to certain cell types, including placental and embryonic–foetal tissue. There is substantial literature (both historical and scientific) that implicates hypoxia in reproductive health problems. Differential fecundity is a powerful driver of selection, and in their paper, Beall and colleagues (2010) mention the potential effect of EPAS1 variants in utero as a possible altitude phenotype; EPAS1 is known to regulate gene expression in the placenta. Again, the model needs to account for the success of the Andeans; however, there is no reason to believe that there is only one potential winning strategy or, for that matter, that there are not multiple strategies within a population. Further studies on the intrapopulation and interpopulation similarities and differences between Andeans and the Tibetans could be guided by results of the genomic papers that were published this year.
Conclusions
Reproducibility is the cornerstone of scientific surety. The six studies of the Tibetan genome that were published in the latter half of 2010 identified genes that are involved in the molecular response to hypoxia as being candidates for the high altitude genotype. Modifications upstream in complex pathways can have multiple physiological and anatomical ramifications, so the high altitude phenotype may be a complex, interrelated suite of minor adaptations that act in an additive way to blunt the effect of environmental hypoxia. The strong association between the EPAS1 genotype and hemoglobin suggests that a blunted hematopoietic response to hypoxia may be one component of this phenotype. The role of the same gene in reproduction and development may mean that there is a uterine phenotype as well, perhaps in both mother and embryo. Resolving the duration over which selection for these characteristics could have occurred would do much to illuminate the selective pressure favoring their acquisition, and parallel studies in other successful highland populations could provide insights into the alternate genes and mechanisms on which selection has acted to allow humans to successfully colonize the high places on the planet.
Footnotes
Disclosures
Mr. MacInnis is the recipient of a UBC Four Year Fellowship and a UBC Affiliated Fellowship. Dr. Rupert has no conflicts of interest or financial ties to disclose.
¹ By convention, human gene abbreviations are in italicized capitals (see <http://www.genenames.org/guidelines.html>).
