Abstract
Domestication has fascinated researchers since Charles Darwin, who was the first to observe that domestication-related traits are similar across a variety of taxonomically distant domesticated species yet very different among taxonomically close wild species. The genetic basis of domestication remains to be understood, and at least three different hypotheses attempt to explain the genetic mechanisms underlying the domestication phenomenon. This commentary briefly reviews these hypotheses and the existing compelling evidence that mobile genetic elements contribute to the phenomenon of domestication.
Domestication
Domestication of plants and animals was the most important factor in the emergence of agriculture and in the subsequent rise of agricultural human civilization, which is still the major type of human society on earth. Loss of biodiversity in general, and of livestock biodiversity in particular (FAO, 2007), along with global soil depletion (Guo et al., 2010; Hazell and Wood, 2008), accompanies the rise and progression of agricultural civilization. A thorough understanding of these phenomena requires the development of novel methods and approaches in agricultural sciences. This motivation could explain the increase in the total volume of research concerning domestication processes. In her recent review, Melinda Zeder (2015) pointed out that in 2013 alone, 811 papers that considered domestication-related questions were published in 350 different journals, including 42 papers in Proceedings of the National Academy of Sciences (PNAS). Despite the ever-growing interest in the domestication process, which was probably ignited by Charles Darwin (1951), who defined domestication as accelerated evolution under artificial selection, a rigorous definition of domestication has not yet been introduced and the genetic mechanisms behind the process are yet to be discovered. As a working definition of the term ‘domestication’, the following was suggested: … Domestication is a sustained multigenerational, mutualistic relationship in which one organism assumes a significant degree of influence over the reproduction and care of another organism in order to secure a more predictable supply of a resource of interest, and through which the partner organism gains advantage over individuals that remain outside this relationship, thereby benefitting and often increasing the fitness of both the domesticator and the target domesticate … (Zeder, 2015)
Domestication syndrome
A typical characteristic of domestication is the so-called ‘domestication syndrome’, a set of phenotypic traits such as increased docility and tameness, coat color changes, reduction in tooth size, changes in craniofacial morphology, reduced auditory, visual, and olfactory sensitivity, reduction in total brain size, neoteny (retention of juvenile traits), decrease in sexual dimorphism (feminization), more frequent estrus cycles, and several others (see Wilkins et al., 2014, for the full list). Properties that are negatively correlated with domestication were also described (Diamond, 2002). Notably, these domestication-related traits are similar across a variety of taxonomically distant domesticated species, yet, in contrast, are very different for taxonomically close wild species. Detailed studies of these traits in mammals were done by Bogolubskij (1959), Paterson et al. (1995), and Tang et al. (2010). The causes, etiology, and mechanisms behind the domestication syndrome are also largely unknown. Not all of these experimentally observed features are present in all domesticated species, but many of them are present in each one to some extent. It was suggested that the domestication syndrome results predominantly from a mild neural crest cell deficit during embryonic development (Wilkins et al., 2014; Wright, 2015).
It should be noted that the search for the genetic basis of domestication mostly relies on species-specific gene ensembles that differ between domesticated and closely related wild animals. Examples include, in pigs, genes associated with eating behavior and dental changes (De Simoni Gouveia et al., 2014; Evin et al., 2015; Moon et al., 2015); in horses, genes associated with lipid metabolic process, neurological system process, muscle contraction, ion transport, and some others (Metzger et al., 2014); and in cattle, genes associated with eye morphology, coat color, stature, polledness, ecologic adaptation, and some others (Porto-Neto et al., 2013; Ramey et al., 2013). Apparently, in most of the aforementioned cases, the phenotypic features and corresponding gene ensembles are related to artificially selected traits that increase the species’ agricultural value. Despite intense studies in this area, the increase in copy number of immune system genes is still the only confirmed universal feature of domestication found in genomes of many domesticated species (Ghosh et al., 2014; Liu et al., 2010; Revay et al., 2015).
Comparisons of domesticated and closely related wild species
Dimitry K Belyaev, working in Novosibirsk in what was then the Soviet Union, initiated the biggest and most ambitious domestication study project. After many years of research, he found that the universal feature of domestication is increased docility of animals toward humans. The experimental protocol involved intensive selective breeding of foxes solely for increasing docility and tameness, and after almost 50 years of selection, it led to the appearance of foxes with phenotypic traits typical of the domestication syndrome (e.g. reduced tooth size, floppy ears, shorter snout). Comparative gene expression analysis has shown that genes influencing the hypothalamic–pituitary–adrenal system were involved in these processes (Wilkins et al., 2014). Several other behavior-related genes were shown to be involved in domestication processes, as discussed in Albert et al. (2012) and Singh et al. (2017). In these studies, comparative gene expression analysis in the brain for the pairs dog and wolf, pig and wild boar, and domesticated and wild rabbits has shown some differences in gene expression; however, these differences were specific to each compared pair. As a result, the authors have suggested that similar behavioral changes in domesticated animals could be caused by different ensembles of metabolic genes and pathways.
In our studies, we demonstrated that the universal difference between domesticated and closely related wild species is that the first group has an increase in polymorphism of enzymes for exogenous substrate metabolism (which connects the animal’s metabolome with environmental substrates), and the second group has an increase in polymorphism of enzymes for cellular energy metabolism, such as glycolysis, pentose phosphate pathway, and Krebs cycle (Glazko et al., 2015). The first type of polymorphism allows adaptation to a wide spectrum of substrate specificities, and in the second type, intracellular energy supplies are optimized for a narrow spectrum of substrates. Recently, we implemented a comparative polymorphism analysis of 30 different protein group loci in domesticated and closely related wild representatives of two animal orders – Artiodactyla (even-toed ungulates) and Perissodactyla (odd-toed ungulates), comprising wild zoo species living and breeding freely in the Askania-Nova biosphere reserve (sanctuary) and several livestock breeds (cattle, swine, sheep, and horse) from different Russian and Ukrainian farms (26 breeds and within-breed groups). In total, 12 different animal species were considered. Population genetic characteristics of differentiation in 18 soybean species (Glycine max) of different countries, as well as in three populations of wild soybean species collected in various regions of Far East, Russia (Soja Glycine ussuriensis Moench – presumably an ancestor of cultivated soybean species), were also studied. On average, the polymorphism level (proportion of polymorphic loci) was slightly higher in domesticated species of plants and animals compared with their wild counterparts. In domesticated animal species, the level of polymorphism varied between 0.036 (swine) and 0.171 (cattle) and in wild animals between 0.017 (Equus quagga boehmi) and 0.135 (Taurotragus oryx).
Also, these groups were readily distinguished based on the involvement of different genetic and biochemical systems in the observed polymorphism. Proportions of polymorphic loci (averaged over the total number of species considered) were 0.179 in domesticated Bovidae and 0.629 in wild animals for intracellular energy metabolism enzymes, 0.464 and 0.193 for exogenous substrate metabolism enzymes, and 0.357 and 0.178 for protein transporters, respectively.
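The measure used above, the proportion of polymorphic loci per functional protein group, can be illustrated with a minimal Python sketch. The function names, group labels, and toy records below are hypothetical and are not the study's data or pipeline; the sketch only shows the arithmetic for a single sample, before any averaging across species.

```python
from collections import defaultdict


def polymorphic_proportion(flags):
    """Return the share of loci flagged as polymorphic (0.0 for an empty list)."""
    if not flags:
        return 0.0
    return sum(1 for is_poly in flags if is_poly) / len(flags)


def proportion_by_group(records):
    """Group (functional_group, is_polymorphic) pairs and compute the share per group."""
    grouped = defaultdict(list)
    for group, is_poly in records:
        grouped[group].append(is_poly)
    return {group: polymorphic_proportion(flags) for group, flags in grouped.items()}


# Hypothetical toy records for one sample; NOT measurements from the study.
toy_records = [
    ("energy_metabolism", True),
    ("energy_metabolism", False),
    ("energy_metabolism", False),
    ("energy_metabolism", False),
    ("exogenous_substrate", True),
    ("exogenous_substrate", True),
    ("exogenous_substrate", False),
    ("exogenous_substrate", False),
]

toy_result = proportion_by_group(toy_records)
print(toy_result)
```

Averaging such per-species proportions over all species in a group (domesticated or wild) would give group-level values of the kind reported in the text.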
Sub-genome of domestication
Thus, domesticated and closely related wild species are readily distinguished based on the contribution of different functional protein groups to the total degree of polymorphism. This is in agreement with the hypothesis that speciation involves reorganization of cellular energy supply mechanisms and that artificial selection does not result in the appearance of new species, except through artificial interspecific hybridization. Therefore, one can expect that natural selection supports speciation by favoring polymorphism of enzymes for cellular energy metabolism, while artificial selection supports new forms that readily employ a plethora of different exogenous substrates. Presumably, extensive phenotypic plasticity in domesticated species correlates with different metabolic rates of different exogenous substrates. Our data point to the existence of a ‘sub-genome’ whose variability is essential for the extensive phenotypic plasticity characteristic of domesticated species, and they also agree with the balanced polymorphism theory. Apparently, any kind of directional selection can be implemented given the variability of this sub-genome.
However, the most intriguing and important observation we made was that the scale of genetic variability was comparable in domesticated and closely related wild species. Moreover, in some cases, genetic differentiation between different breeds was larger than between closely related wild species. Given the widely accepted view that inbreeding, which decreases genetic variability, is more frequent in domesticated than in closely related wild species, these data were quite unexpected. However, similar data were also obtained by other investigators (Wiener and Wilkinson, 2011). In the context of phenotypic variability, it should be noted that the number of breeds with widely different phenotypic characteristics in five traditional agricultural species (goat, sheep, cattle, horse, and swine – more than 5000 breeds in total) is comparable with the number of extant mammalian species (around 5000 species) (FAO, 2007).
The accumulated evidence describing genotypic and phenotypic genetic variability suggests that the only way to clarify the genetic bases of domestication is to find out the source of the characteristic genetic variability that differentiates between domesticated and closely related wild species.
It should be noted that there were many attempts to domesticate a wide variety of plants and animals throughout human history; however, agrarian civilization rests upon only a few plant and animal species: livestock among animals, rice and grain among plants, as indicated by J Diamond (2002). In that work, J Diamond presented species properties that interfere with domestication. It is logical to assume that there must also be properties favoring domestication, presumably related to increased genetic variability that allows the balance between artificial and natural selection to create the spectacular diversity that readily distinguishes domesticated animals from their closely related wild counterparts.
Current hypotheses of domestication mechanisms
Currently, there are three major hypotheses explaining domestication mechanisms: the hypothesis of neural crest cell deficits during embryonic development, the hypothesis of a general regulatory network that resulted from selection for tameness alone, and the hypothesis of pleiotropy, that is, selection of major genes affecting different processes and contributing to the domestication syndrome (Wilkins et al., 2014; Wright, 2015). However, these hypotheses fail to explain the high genetic variability of domesticated species. It was suggested that this high genetic variability can be the result of frequent interbreeding between domesticated and closely related wild species (Larson and Fuller, 2015). But, in this case, it is still unclear why this hybridization led to different rates and ‘depths’ of domestication and what distinguishes domesticated animals with many breeds (e.g. livestock animals) from half-domesticated animals with almost no phenotypic variability, such as camels and yaks. In addition, one of the obvious differences between species with many breeds and half-domesticated animals is the ability of the former to colonize different territories, which defines the range of geographic distribution, reflects the ability to adapt to different eco-geographical conditions, and demonstrates high plasticity toward not only artificial but also natural selection.
Role of mobile elements in domestication
In our previous studies, we demonstrated that the genomes of domesticated and closely related wild animals differ in the frequencies of DNA fragments flanked by short inverted decanucleotide repeats of microsatellites. The data we collected led us to assume that retroviral infections can, to some extent, induce the high population and genetic variability of domesticated species (Glazko et al., 2015; Glazko and Glazko, 2013). The progeny of exogenous retroviruses significantly influences the endogenous retroviral families that constitute a large part of the dispersed repeats in mammalian genomes (Garcia-Etxebarria and Jugo, 2010; Glazko et al., 2012; Van der Kuyl, 2011).
There are several databases describing endogenous retroviral families in genomes of domesticated mammals (Garcia-Etxebarria et al., 2014). Examples of horizontally transferred retrotransposons that are simultaneously present in many taxonomically distant species are accumulating (Oliveira et al., 2012; Walsh et al., 2013), and the significance of horizontal retrotransposon transfer in mammalian evolution is widely discussed (Chalopin et al., 2015). There is also some evidence that viruses and mobile elements drive evolutionary transitions (Koonin, 2016).
By now, it is well known that retrotransposons can contribute to microsatellite sequences (Ahmed and Liang, 2012; Behura and Severson, 2013; Sharma et al., 2013). In our own research, it was shown that the frequency of recombination products in genomic DNA fragments in genomes of horse and cattle, flanked by inverted repeats of microsatellites’ partial loci, is highest for retrotransposons (Bardukov et al., 2014; Glazko et al., 2015).
In recent years, a new concept of phenotypic variability has emerged. This concept considers phenotype as a result of interaction between ‘genetic messages’ (nucleotide sequences) and other factors influencing the release of genetic information (conditions of living and breeding, microbiomes, pollutants, and pathogens). It is assumed that the endophenotype is formed through interactions of different levels of genetic makeup, that is, transcriptome, proteome, and metabolome, with nonlinear connections between all levels (e.g., a single change in a transcriptome may result in many changes in a metabolome and vice versa), and each level, in turn, is influenced by environmental factors (Ibeagha-Awemu and Zhao, 2015; Te Pas et al., 2017).
The epigenome is formed by DNA methylation, histone modifications, chromatin remodeling, and other agents of epigenetic marks, such as non-coding RNAs, for example microRNAs.
It is already known that there are genes and gene networks that are regulated significantly differently in highly productive cattle breeds compared with their ancestral forms, mostly because of the different microRNA regulatory targets in more than 1600 genes participating in metabolic pathways and immune response. Expression profiles of microRNA regulating genes, involved in immune response at different stages of cattle’s lactation, were recently obtained (Do et al., 2017; Edwards et al., 2015).
Mobile genetic elements are a source and vehicle of microRNA distribution in a genome (Piriyapongsa and Jordan, 2007; Qin et al., 2015; Roberts et al., 2014; Smalheiser and Torvik, 2005). Similarly, mobile genetic elements have been proposed to be the major contributor to the regulation and diversification of vertebrate long non-coding RNAs (Hadjiargyrou and Delihas, 2013; Kannan et al., 2015; Kapusta et al., 2013). Interestingly, a recent publication showed that the potential targets of two lncRNAs, differentially expressed between the domestic silkworm (Bombyx mori) and its wild counterpart (Bombyx mandarina), were mainly enriched in the processes of silk protein translation (Zhou et al., 2017). Given that under long-term artificial selection the domestic silkworm has increased its silk yield tremendously compared with its wild counterpart, this finding implies that those lncRNAs were involved in the domestication process through the post-transcriptional regulation of silk protein (Zhou et al., 2017). It is gradually becoming clearer that DNA motifs regulating different metabolic pathways can shed light on the connection between the phenotypic variability of domesticated animals, their environment, and the variety of their retroviruses. One could expect that the high phenotypic variability of domesticated animals is caused by a unique ability to accumulate mobile genetic elements, the products of exogenous retroviruses, and by the primary involvement of these elements in genomic rearrangements.
It is well known that approximately 50% of the nucleotide sequence of the cattle genome is represented by dispersed repeats, mostly retrotransposons (Elsik et al., 2009). Using the RepeatMasker database of mobile genetic elements and Integrated Genome Browser software, we analyzed the organization of mobile genetic elements and their recombination products in a 13,436,028-bp fragment of the first cattle chromosome (Glazko et al., 2017). In this region, SINE/tRNA-Core-RTE, LINE/RTE-BovB, LINE/L1, and LTR/ERV were the most frequent elements. Their mutual localization was nonrandom. The most frequent associations consisted of two SINE and LINE family members, SINE/tRNA-Core-RTE and LTR/ERV, namely, (LTR/ERVK)/(LINE/RTE-BovB) and (LTR/ERVK)/(LINE/L1). The last variant serves as a basis for three-member clusters – (LINE/RTE-BovB)/(BTLTR1)/(LINE/RTE-BovB) and (LINE/L1)/(BTLTR1J)/(LINE/L1) – while other retrotransposons did not form such clusters. The density of those clusters was higher toward the distal part of the first chromosome.
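As an illustration of how three-member clusters of this kind could be located in a table of repeat annotations, the following Python sketch scans position-sorted repeats for a LINE, LTR, LINE arrangement. It is a sketch under stated assumptions, not the procedure used in the study: the `Repeat` record, the `max_gap` adjacency threshold, the element names used for the flanking LINEs, and all coordinates are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Repeat(NamedTuple):
    start: int
    end: int
    family: str  # repeat class/family label, e.g. "LINE/RTE-BovB"
    name: str    # element name, e.g. "BTLTR1"


def find_three_member_clusters(
    repeats: List[Repeat], flank_family: str, core_name: str, max_gap: int = 50
) -> List[Tuple[int, int]]:
    """Return (start, end) spans where flank/core/flank elements are consecutive.

    max_gap is a hypothetical tolerance (in bp) between neighbouring elements.
    """
    ordered = sorted(repeats, key=lambda r: r.start)
    clusters = []
    for left, core, right in zip(ordered, ordered[1:], ordered[2:]):
        if (
            left.family == flank_family
            and right.family == flank_family
            and core.name == core_name
            and core.start - left.end <= max_gap
            and right.start - core.end <= max_gap
        ):
            clusters.append((left.start, right.end))
    return clusters


# Hypothetical annotations; coordinates and flank names are placeholders.
toy_repeats = [
    Repeat(100, 400, "LINE/RTE-BovB", "BovB_example"),
    Repeat(401, 700, "LTR/ERVK", "BTLTR1"),
    Repeat(705, 1000, "LINE/RTE-BovB", "BovB_example"),
    Repeat(5000, 5300, "LINE/L1", "L1_example"),
    Repeat(5301, 5600, "LTR/ERVK", "BTLTR1J"),
    Repeat(9000, 9300, "LINE/L1", "L1_example"),  # too far away to count
]

clusters_bovb = find_three_member_clusters(toy_repeats, "LINE/RTE-BovB", "BTLTR1")
clusters_l1 = find_three_member_clusters(toy_repeats, "LINE/L1", "BTLTR1J")
print(clusters_bovb, clusters_l1)
```

In the toy input, only the BovB-flanked triple satisfies the adjacency tolerance; the L1-flanked triple is rejected because its right-hand element lies far beyond `max_gap`.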
Using Integrated Genome Browser software, we studied the co-localization of structural genes with the three-member products of recombination between LINE and LTR ERV. It appeared that 34 of these products were located inside 12 genes (and the rest in intergenic regions), with 10 and 12 copies located in two genes, grik1 and app, respectively, both of which function in central nervous system pathways in mammals. The fact that in both genes the three-member construct (LINE/RTE-BovB)/(BTLTR1)/(LINE/RTE-BovB) was represented by nine copies, while only one and three copies of the construct (LINE/L1)/(BTLTR1J)/(LINE/L1) were found inside grik1 and app, respectively, suggests that those genes are ancient targets of mobile genetic element insertions. The patterns of LINE and LTR ERV recombination product distribution over the first chromosome and their specific location inside structural genes suggest they may contain special functional elements.
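The co-localization step described above amounts to asking, for each construct, whether its interval lies inside a gene interval. A minimal Python sketch of that counting follows; the study itself used Integrated Genome Browser, so the function, the gene labels, and the coordinates here are hypothetical illustrations only.

```python
from collections import Counter


def count_constructs_per_gene(constructs, genes):
    """Count constructs fully contained in each gene interval.

    constructs: list of (start, end) tuples.
    genes: dict mapping gene label -> (start, end).
    Constructs that fall inside no gene are counted under "intergenic".
    """
    counts = Counter()
    for c_start, c_end in constructs:
        hit = False
        for gene, (g_start, g_end) in genes.items():
            if g_start <= c_start and c_end <= g_end:
                counts[gene] += 1
                hit = True
        if not hit:
            counts["intergenic"] += 1
    return counts


# Hypothetical gene intervals and construct positions (placeholders).
toy_genes = {"gene_a": (1000, 5000), "gene_b": (8000, 12000)}
toy_constructs = [(1200, 1800), (3000, 3600), (6000, 6500), (8500, 9100)]

toy_counts = count_constructs_per_gene(toy_constructs, toy_genes)
print(dict(toy_counts))
```

For the small inputs considered here a nested loop is sufficient; an interval tree or a sorted sweep would be the usual choice for genome-wide annotation sets.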
Because microRNAs can contribute to epigenetic variability and parts of them can be formed out of mobile elements, we searched for consensus elements, homologous to known microRNAs in these recombination products (Skobel et al., 2017).
The search for functional elements in the construct RTE-BovB/BTLTR1/RTE-BovB, located in 12 structural genes (kcne2, gart, tmem50b, il10rb, ifnar2, urb1, grik1, usp16, ltn1, cyyr1, app, jam2) in the 13,436,028-bp fragment of the first cattle chromosome, showed that there was indeed a consensus sequence with several functional elements. All RTE-BovB/BTLTR1/RTE-BovB constructs found there were between 399 and 3016 bp in length and had a high percentage of homology (no less than 78.32%). They were all located in introns of the aforementioned genes. These genes belong to an evolutionarily conserved group of genes present in cattle chromosome 1, human chromosome 21, mouse chromosome 16, opossum chromosome 4 (excluding cyyr1), and rabbit chromosome 14.
The 266-bp consensus sequence found in the construct was a subsequence of LINE BovB (98.5% homology) and included sequences homologous to 100 microRNAs of 47 different species (plants and animals), including Rhesus lymphocryptovirus microRNA rlcv-miR-rL1-5-5p. Of these microRNAs, 44 (44%) were from the family miR-30; 21 microRNAs were similar to miR-30a-5p, and 13 microRNAs were similar to miR-30e-5p (both also found in cattle as bta-miR-30a-5p and bta-miR-30e-5p). The family miR-30 regulates many biological processes, for example, muscle mass development, milk production, and stress and immune response in cattle, and was shown to be involved in the development of several psychiatric diseases and various cancers.
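The family breakdown reported above (for example, 44 of 100 hits belonging to miR-30) is a tally over microRNA names. A minimal Python sketch of such a tally is given below, assuming names follow the common `species-miR-number` pattern; the regular expression, the function names, and the three-item input list are illustrative assumptions and do not reproduce the study's hit list.

```python
import re
from collections import Counter

# Assumed naming pattern: "<species>-miR-<number><suffix>", e.g. "bta-miR-30a-5p".
FAMILY_PATTERN = re.compile(r"-(miR-\d+)")


def mirna_family(name):
    """Return the family label (e.g. "miR-30"), or "other" if the name does not fit."""
    match = FAMILY_PATTERN.search(name)
    return match.group(1) if match else "other"


def family_shares(names):
    """Return (counts, shares) of microRNA families in a list of names."""
    counts = Counter(mirna_family(n) for n in names)
    total = sum(counts.values())
    shares = {family: count / total for family, count in counts.items()}
    return counts, shares


# Toy input built from names mentioned in the text; NOT the full set of 100 hits.
toy_names = ["bta-miR-30a-5p", "bta-miR-30e-5p", "rlcv-miR-rL1-5-5p"]

toy_family_counts, toy_family_shares = family_shares(toy_names)
print(toy_family_counts)
```

Names that do not follow the assumed pattern, such as the viral rlcv-miR-rL1-5-5p, fall into the "other" bucket rather than being assigned a family.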
In this work, for the first time it was shown that the long terminal repeat of cattle-specific endogenous retrovirus BTLTR1 has a fragment with high homology to BovB that is well known for its participation in horizontal genetic material transfer.
Over the same fragment of chromosome 1, we found that the construct (LINE/L1)/(BTLTR1J)/(LINE/L1) was about 3.5 times less frequent than RTE-BovB/BTLTR1/RTE-BovB (110 vs 382 occurrences). The construct (LINE/L1)/(BTLTR1J)/(LINE/L1) was co-located with only two genes, the same genes where RTE-BovB/BTLTR1/RTE-BovB was also found – Glutamate Ionotropic Receptor Kainate Type Subunit 1 (grik1) and Amyloid Beta Precursor Protein (app).
Summary
In summary, we found that the constructs RTE-BovB/BTLTR1/RTE-BovB, located in 12 structural genes in the 13,436,028-bp segment of the first cattle chromosome, have important regulatory sequences influencing productivity characteristics in cattle breeds, which can explain their conserved presence in the genome and once again confirms the hypothesis that mobile elements are involved in genome evolution.
In addition, we note that the evolutionarily conserved linkage of the genes with RTE-BovB/BTLTR1/RTE-BovB recombination products and their involvement in many different biological processes are in agreement with the ‘pleiotropic’ domestication hypothesis, which holds that there are loose clusters of multiple major genes throughout the genome and that in each of these clusters, linked genes surround a pleiotropic core influencing different domestication syndrome traits (Wright, 2015). Our studies add to this hypothesis that those loose gene clusters, surrounding a pleiotropic core favoring domestication, also include parts of mobile genetic elements with regulatory motifs that influence a variety of metabolic processes, thereby providing a basis for the high plasticity of domesticated animals toward not only artificial but also natural selection.
Footnotes
Funding
Support has been provided in part by the NIH IDeA Networks of Biomedical Research Excellence (INBRE) grant P20GM103429 and by Center for Translational Pediatric Research (CTPR) NIH Center of Biomedical Research Excellence award P20GM121293. None of the funding bodies had a role in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript.
