Abstract
The Holocene Epoch (11,700 years ago to the present) marks the development of the present-day boreal ecosystems in interior Alaska. The composition and genetics of soil microbes have the potential to alter how nutrient cycling and vegetation respond to warming and cooling events, but very little is known about how boreal soils have varied over time. Here, we use DNA sequencing on both modern soils and well-preserved paleosols developed during several episodes of the Holocene to extract information on soil bacteria, archaea, and fungi present in interior Alaska during the past 8000 years (8 ka). Community composition of bacteria and fungi in the ancient paleosols was different from modern soils, with a higher relative abundance of Proteobacteria, Chloroflexi, and Verrucomicrobia in the modern soils. The most dramatic shift in interior Alaska’s soil microbiome occurred ca. 7 ka, when species diversity was lowered and functional diversity became higher after 7 ka. This suggests that function was truly low, the early Holocene ecosystems were functionally redundant, and/or that true functional diversity was not captured due to a lack of genetic resolution in existing sequence databases. The cause of this observed shift cannot be directly answered in this work; however, these data suggest that ca. 7 ka was a critical period when microbial taxa and function shifted dramatically. This is important for understanding how the soil microbiome responds to climate changes and impacts the succession of vegetation.
Introduction
Parts of interior Alaska have a unique Holocene sedimentary record because of historical and ongoing deposition of glacial-derived silt deposited by wind, or loess, across the landscape(Muhs et al., 2003). During glacial advances, windy periods, and/or times of cold and dry climate, loess deposition was enhanced north of the Alaska Range, which was fueled by southerly katabatic winds that reworked silt exposed on the floodplains of pro-glacial streams. During warm, wet and/or calm episodes of the Holocene, loess aggradation slowed and soil development proceeded. The stratigraphic result of these processes in many upland exposures is alternating loess-paleosol sequences laid down over the last 14,000 years (Muhs and Budahn, 2006; Muhs et al., 2003). Sedimentary profiles that expose multiple paleosols can therefore provide ages on these paleoclimatic transitions and millennia of biogeochemical information on past environments. Paleosols are unique as they acquire new chemical and biological inputs which serve as indicators of vegetation and climate during their formation (Höfle et al., 2000). Through time, older paleosols are buried and retain these indicators, which can serve as a record of local temporal environmental conditions (Gaglioti et al., 2018). A critical and often overlooked component of the ecosystem includes microbial communities detected in the paleosols.
Despite indications of major vegetative shifts in this region during the Holocene (Bigelow et al., 2003; Gaglioti et al., 2018), our understanding of bacteria, archaea, and fungi present in paleosols is limited. Recently, microbial communities have been characterized from paleosols in Iran (Frindte et al., 2020) and the western Alps (Dandare et al., 2019; Young et al., 2019). Microorganisms serve a primary function in the ecosystem by cycling carbon and nitrogen (Naeem et al., 2000; Zak et al., 2003) that are required for higher order organisms to grow and proliferate. Alterations to nutrient cycling as a result of the changing climate and in turn microbial communities are particularly important in boreal systems (Mekonnen et al., 2018; Melvin et al., 2015). Developing an understanding of ancient microbial ecosystem inputs and comparing these with modern processes would provide insight into how the biochemical processes we know today have behaved over long time periods with varying climates. For example, plant exudates feed rhizosphere microbes and protect against other microbial competitors and, in turn, rhizosphere microbes protect plants from pathogenic bacteria and fungi, and provide critical nutrients (Bais et al., 2006; Grayston et al., 1997). Therefore, shifts in microbial diversity and function suggest long-term changes in soil nutrient cycling and vegetation processes in a warmer world. It is also important to note that microorganisms are stable in time and space (Baldrian et al., 2012) and there exists a tipping point when the microbial communities shift to an alternate stable state (Allison and Martiny, 2008; Saidi-Mehrabad et al., 2020).
Interior Alaska and Beringia have unique field sites to study the dynamics and effects of the soil microbiome because it was largely unglaciated during the Quaternary period and several, dramatic ecological transitions have occurred there since the Holocene began (Fritz et al., 2012; Hopkins, 1982). Primarily reconstructed from fossil pollen assemblages, key ecozones that covered this vast region since 11.7 ka include shrub tundra, deciduous woodland, white spruce forest, and finally black spruce-dominated forests. Changes in climate and forest migration lags were likely the main cause of these shifts, but relatively little is known about how changes in boreal forest soils may have responded to or affected vegetation patterns in this region. In addition to its rich Holocene history, Alaska is now on the front lines of climate change. Therefore, knowledge of how past ecosystems shifted during a changing climate may help explain how current ecosystem function may change under an amplified warming scenario (Chapman and Walsh, 2007; Douglas et al., 2014).
Given how microbial activity is known to affect ecosystem-level processes and the limited information available on microbial populations from paleosols representing the Holocene, we sought to characterize and compare microbial community composition in Alaskan sites representing modern soils to a series of paleosols representing periods of soil development occurring there in the past 8000 years. These sites are particularly interesting because there is evidence of early humans in interior Alaska. We used prior studies on vegetation to better understand major shifts in above-ground ecology near the study sites. The soil microorganisms in our analyses included both living and dead cells present in the soils.
Methods
Sample collection
Soil samples were collected aseptically from modern (surface soils) and ancient (paleosol horizons representing 8000–400 calBP) soils in archeological sites located in Alaska in October of 2015 that were age dated in 2015 and 2016. Soils representing modern soils were sampled from two sites, AK-1 (63°54′31″N and 145°33′24″W) and AK-2 (63°54′50″N and 145°35′18″W), located southeast of Delta Junction on an ancient glacial moraine (Figure 1). All personnel were wearing masks, gloves, and suits to reduce contamination (Figure 1b). Soil was collected by removing the first cm of soil exposed to the atmosphere and aseptically collecting the soil beneath using a shovel cleaned with alcohol, DNA AWAY™ solution, and RNase AWAY® solution. The soil was immediately placed into a sterile whirlpak bag. Three replicate surface soil samples were collected from AK-1 and AK-2. The third site, AK-3 (63°49′00″N and 145°56′46″W), was located on a 50 m high bluff overlooking the Delta River to the southwest. The bluff was composed of windblown silt and sand punctuated by a series of paleosols ranging from approximately 400 to 8000 calBP (Figure 1; Potter et al., 2018). The vegetation immediately above the bluff was composed of grasses (Leymus sp. and Festuca sp.), forbs (Dryas sp., Artemisia sp., and Potentilla sp.) and sedges (Carex sp.) with a spruce (Picea glauca) and poplar (Populus basamifera) overstory. Triplicate soil samples were collected horizontally from six discrete areas within the bluff representing a range of ages within the late-Holocene. At each area, the soil exposed to the atmosphere was removed. Using a stainless steel trowel cleaned with isopropanol, DNA AWAY™ solution (Molecular Bioproducts, Inc., San Diego, CA), and RNase AWAY® solution (Molecular Bioproducts, Inc., San Diego, CA), the soil behind was collected and immediately placed into a sterile whirlpak bag. Following collection, the soil samples were immediately placed on ice and stored separately in the dark at 4°C or −80°C for soil properties and DNA extraction, respectively. Soils were then shipped and stored at the same temperatures at the Cold Regions Research and Engineering Laboratory (CRREL) in Hanover, NH, until processing.

(a) A map of Alaska and a location map showing the three samples sites (marked by white circles) in this study (Imagery DigitalGlobe 2018, Map Data 2018 Google near Fort Greely, Alaska. https://www.google.com/maps/ [January, 2018]), (b) images from each site location, and (c) image of bluff at site AK-3.
Soil properties
Soil gravimetric water content (GWC), pH, and organic matter (OM) content were measured using standard procedures (Schulte and Hoskins, 2009; Sims and Eckert, 2009; Wolf and Beegle, 2009). In brief, for GWC the wet mass of a subsample of each sample was obtained prior to drying at 105°C for 24 h to obtain the dry mass, with the difference in mass being the water content of that sample. Soil pH values were obtained by adding approximately 10 g of soil to make a 1:1 soil:water slurry of each sample in a clean vial, manually mixing well for approximately 10 s, allowing the soil particles to settle for approximately 10 s and measuring the pHw using a pH probe (Hanna Instruments) connected to a Mettler Todelo meter (Seven Easy). Following this, pHs was obtained by adding enough of a concentrated calcium chloride solution to obtain a 0.1 M salt concentration in the vial and measuring the pH again. Soil organic matter content was obtained by heating a known mass of oven dried soil to 360°C for 2 h, cooling in a desiccator, and measuring the mass again, with the difference in mass being the carbon driven off by heat. Significant differences between soil properties were assessed through analysis of variance using Tukey’s All Pairs in JMP (Statistical Analysis Software). The assumptions of the analysis of variance were tested in R (R Core Team, 2014) using the vegan package (Oksanen et al., 2013). When the assumptions of variance (Levene’s) and normality (Shapiro-Wilks) could not be met, the Kruskal-Wallis test was used.
DNA extraction and sequencing
Genomic DNA was extracted from each triplicate soil sample using the PowerSoil® DNA Isolation Kit (MoBio) and quantified using a Qubit 3.0 Fluorometer (Thermo Fisher Scientific) using the HSDNA kit. For each DNA extraction, extraction blanks served as negative controls. We followed two sequencing approaches: targeted and shotgun sequencing. For the targeted sequencing, the V4 region of the 16S rRNA gene (with primer pair 515F-806R; Caporaso et al., 2010, 2011) for prokaryotes and the internal transcribed spacer (ITS) region (with primer pair ITS1f-ITS2, EMP.ITSkabir) for fungi were amplified from triplicate DNA samples as described in Smith and Peay (2014). Amplicons were sequenced with a MiSeq sequencer (Illumina) at the Environmental Sample Preparation and Sequencing Facility at Argonne National Laboratory using the paired end protocol. The PCR run included a denaturation step at 94°C for 3 min, thirty-five 45 s cycles at 94°C, a 60 s cycle at 50°C, a 90 s cycle at 72°C, and a final extension for 10 min at 72°C. The samples were then quantified using PicoGreen (Invitrogen) and a plate reader (Infinite® 200 PRO, Tecan), pooled into a single tube, cleaned using AMPure XP Beads (Beckman Coulter), quantified using a Qubit fluorometer (Thermo Fisher Scientific), diluted down to 2 nM, denatured, and diluted to a concentration of 6.75 pM with a 10% PhiX spike. All amplicons were sequenced on a Miseq as a 2 × 151 bp paired end run (Caporaso et al., 2010, 2011). During preliminary analysis, the negative controls submitted for sequencing yielded low abundance, poor quality reads, and therefore, could not be sequenced.
Only the paleosols were further processed using a shotgun sequencing approach. For shotgun sequencing, DNA from the triplicate soil samples was combined and sequenced on a HiSeq 2500 sequencer (Illumina) to obtain 2 × 100 bp reads with a PE Rapid Kit v2 chemistry at the Environmental Sample Preparation and Sequencing Facility at Argonne National Laboratory. In brief, size selection was performed using the Blue Pippin Prep (Sage Science, Inc.) and the 3% cassette. The cassette was inspected, loaded, and QC with a continuity test before introduction of sample. Libraries were introduced to the cassette in 30 µl with 10 µl of dye and ran on the Blue Pippin for 30–56 min. Concentration was checked using a DNA HS assay on the Qubit Fluorometer and molarity of each sample was calculated. As further PCR was unnecessary based on the molarity being greater than 2 nM, that step was skipped. Using the Qubit Fluorometer, the concentration (ng/µl) of the libraries was determined and the Agilent 2100 Bioanalyzer was used to determine the library insert size and quality. Sequences were then run on an Illumina HiSeq 2500.
Community analysis
Bacterial and fungal sequence reads were analyzed using Quantitative Insights into Microbial Ecology (QIIME; Caporaso et al., 2010). For bacterial sequences, forward and reverse reads were joined together using fastq-join (Aronesty, 2011). Only the forward reads were used for fungal sequences because the majority of reads did not join successfully. Matching sequences that did not overlap at a minimum of six base pairs with a maximum of 8% difference between reads were discarded and the index reads were filtered to account for the remaining joined reads. Joined bacterial reads and forward fungal reads were then quality filtered and demultiplexed into their respective sample libraries. Following demultiplexing, the filtered sequences were processed through the QIIME Open Reference OTU Picking Pipeline within which all sequences were compared to the reference database (Kõljalg et al., 2013; McDonald et al., 2012) (QIIME internal Greengenes database for bacterial sequences and UNITE fungal database 12/11 release) at 97% confidence using the uclust program to cluster them, assigned OTUs, and then a high confidence representative sequence was selected. Sequences that failed to cluster at the previous step were then clustered de novo and representatives were selected and reclassified according to cluster matches. All successful clusters were then drafted into an OTU table, assigned taxonomy down to the genus level with uclust (Edgar, 2010) for bacteria or BLAST (Altschul et al., 1990) for fungi using the representative sequences as a guide with a minimum of 51% consensus, and a phylogenetic tree was constructed for the bacterial dataset using Fasttree (Price et al., 2010). Relative abundances of microbial taxa were recalculated to account for only the top phyla or class (constituting 98% of the community) excluding unknown or unassigned OTUs.
Principal coordinates were derived from 1000 randomly subsampled sequences from the bacterial sequences and the weighted Unifrac metric was used (Lozupone et al., 2011). Rarefaction curves were generated to check for coverage (Supplemental Figures S1 and S2, available online). Principal coordinates for fungal sequences were derived from 16,000 randomly subsampled sequences and the Bray Curtis metric was used (Bray and Curtis, 1957). The betadisper function was used to determine beta dispersion (Anderson, 2006) and the adonis function was used to identify significant differences between modern and paleosol samples with a PERMANOVA (Anderson, 2001).
Indicator bacterial and fungal genera were identified by indicator species analysis using the indicspecies package in R (Cáceres and Legendre, 2009; Dufrene and Legendre, 1997; R Core Team, 2014). This package determines the relationship between species abundance values within a set of sites and the groupings of those sites. Groups were manually assigned based on sample group and Monte Carlo permutation tests with 9999 permutations were used to test the indicator values.
Shotgun sequences were provided as demultiplexed paired-end FASTQ files. Adapters were trimmed and then shotgun metagenomic forward and reverse reads were joined together using fastq-join as described above. Joined reads were then assessed for quality such as degraded edges, high abundance of adapters, and low quality regions using FastQC (Andrews, 2010). Exact duplicate and low quality sequences were then filtered from the file to reduce PCR bias and the remaining sequences were passed to DIAMOND (Buchfink et al., 2015). DIAMOND was run in blastx mode against the National Center for Biotechnology Information’s Non-redundant protein (NCBI nr) Database (www.ncbi.nlm.nih.gov; downloaded February 2017) with an e-value cutoff of 0.001. DIAMOND output files were then passed to MEGAN6 (Beier et al., 2017) for visualization, taxonomic, and functional assignment. All samples were normalized to the minimum number of sequences for the entire sample set (18,680,958). Data in Figures 4 to 8 reflect the log10 transformed abundance of those sequence counts. Within MEGAN6, taxonomy was assigned using the protein to accession mapping file based on the last 2016 version of the NCBI nr protein database with the default lowest common ancestor parameters. Functional roles were assigned using NCBI nr refseq mapping identities to the SEED classification (Overbeek et al., 2014). Phylogenetic and metabolic trees were constructed using the count data generated from the previous steps and the Interactive Tree of Life software (Letunic and Bork, 2021).
Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request. The sequence datasets generated during the current study have been submitted to the National Center for Biotechnology Information repository (NCBI; https://www.ncbi.nlm.nih.gov/) Sequence Read Archive (SRA). The bacterial 16S, fungal ITS, and shotgun metagenome sequences can all be found under the bioproject number PRJNA438924.
Results
Soil properties
Major differences in soil properties were observed between the modern soils and the paleosols. The gravimetric water content (GWC) of the modern soils, AK-1 and AK-2, were significantly higher than in the paleosol samples at AK-3 (p < 0.0001; Table 1). The modern soils also harbored a significantly (p < 0.0002) higher organic matter (OM) content than the paleosols (Table 1). The soil pH of the modern soils was significantly lower than in paleosols (p < 0.0001; Table 1). Generally, soil pH increased steadily with depth, with the pH of older paleosols (e.g. Paleosols 7 and 6) more basic from 6.4 to 8.7 than the younger paleosols (e.g. Paleosols 3 and 2) at pH of 3.0.
Soil properties.
Values listed as average of three replicates and the standard error. GWC indicates gravimetric water content, OM indicates organic matter content, and pHw and pHs indicate pH measured using a water or dilute salt solution, respectively.
Sequencing output
We used amplicon and shotgun sequencing to assess microbial diversity. Specifically, bacterial and fungal amplicons were analyzed to determine differences in taxonomy of microbial communities present in modern soils and Holocene paleosols. On average, the number of sequences obtained for bacteria and fungi across all samples were 2.35E05 ± 3.76E04 and 1.17E5 ± 1.11E04, respectively. We used a shotgun sequencing approach for a deeper understanding of taxonomic and functional differences among Holocene paleosols and obtained an average of 5.59E07 ± 9.87E06 sequences. Numbers of sequences, impact of quality filtering, and types of database hits can be found in Supplemental Table S1, available online. Indicator species analysis (Cáceres and Legendre, 2009; Dufrene and Legendre, 1997)was used to determine if particular taxa were structuring the microbial communities (Supplemental Table S2, available online). It was not used on the shotgun sequencing dataset because they were composite samples.
Differences in community composition between modern soils and paleosols
Community composition in modern soils and paleosols of varying ages were compared using principal coordinates analysis (PCoA) of the amplicon sequencing datasets from each sample (Figure 2). Three samples (one replicate from Paleosol 4 and two replicates from Paleosol 7) were removed from analysis of bacterial amplicon sequences due to poor sequence quality. Bacterial sequences from modern soil samples were well separated from paleosols by PCoA, indicating that community composition in the modern soils was different from those in the ancient soils (Figure 2a). Beta dispersion analysis and PERMANOVA showed significant (p < 0.001) differences between samples, suggesting heightened variability in microbial community structure. Additionally, the bacterial communities from the older paleosols (Paleosol 7 and 6) were separated from younger paleosols (Paleosols 5 through 2) in the PCoA, indicating differences between the older and younger paleosols (Figure 2a). Beta dispersion analysis revealed no significant differences and PERMANOVA analysis indicated that differences in bacterial communities between young and old paleosols were significant (p = 0.002).

Principle Coordinates Analysis of (a) bacterial and (b) fungal sequences from modern soils (AK-1 and AK-2) and ancient soils (Paleosols 7–2). Each point represents one soil sample.
For fungal amplicon sequencing, one replicate from Paleosol 7 was removed from PCoA due to poor sequence quality. Fungal communities within the modern soils and ancient soils were separated by PcoA indicating different community compositions (Figure 2b). PcoA of fungal community sequences from modern soils were more similar to those of younger paleosols (Paleosols 5 through 2) than older paleosols (i.e. Paleosol 7 and 6; Figure 2b). There were no significant differences in beta dispersion analysis between these two sample types and the PERMANOVA results showed significantly (p < 0.001) differences between the sample types. Interestingly, modern soils and younger paleosols had more similar fungal communities than what occurred for the bacterial communities (Figure 2). There still were two distinct categories of similar fungal communities in the PCoA. These included the fungi in paleosols younger than 7000 calBP and those in paleosols older than 7000 calBP (Figure 2b). The effect of beta dispersion between younger and older paleosols was significant (p < 0.001).
Dominant bacterial taxa
Proteobacteria and Actinobacteria were the dominant phyla in all soil samples regardless of age (Figure 3a). Differences in relative abundance were observed between the modern soils, younger, and older paleosols (Figure 3a). For example, the paleosols had a higher abundance of Chloroflexi and Gemmatimonades, whereas the modern soils had a higher abundance of sequences related to Verrucomicrobia (Figure 3a). Though Cyanobacteria were detected in the modern soils, they may include potential plant chloroplasts, which accounted for an average of 0.9% of the sequences. Sequences relating to Proteobacteria and Acidobacteria phyla were more abundant in modern soils than in paleosols with the exception of Proteobacteria in Paleosol 7. Additionally, bacterial sequences related to OD1 were more abundant in the older paleosols (older than 7000 calBP) than the rest of the samples. Indicator species analysis identified 51 bacterial genera belonging to Acidobacteria, Actinobacteria, Proteobacteria, Firmicutes, and Bacteroidetes were indicators of one or more soil groups (Supplemental Table S2, available online). Indicator species analysis distinguished 25 species with the highest indicator values for Brevibacterium, Micrococcus, Rothia, Bifidobacterium, Granulicatella, Phascolarctobacterium, Paracoccus, Rubellimicrobium, Geobacter, Erwinia, and Enhydrobacter that were unique to Paleosol 7 (indicator value (IV) = 1, p = 0.0439; Supplemental Table S2, available online). However, indicator analysis did not reveal any indicators for Paleosols 6 through 4. Paleosol 3 had one indicator genera, Pigmentiphaga (IV = 0.944, p = 0.004), and Paleosol 2 had four indicator genera, the most prominent being Roseateles (IV = 0.931, p = 0.0128; Supplemental Table S2, available online). There were 21 indicator genera identified for the modern soils with the six most indicative being Terriglobus, Micromonospora, Sphaerisporangium, Sediminibacterium, Acidocella, and Luteibacter (IV = 1, p = 0.0001; Supplemental Table S2, available online).

Relative abundance of dominant (a) bacterial phyla and (b) fungal classes from modern soils (AK-1 and AK-2) and ancient soils (AK-3: Paleosols 7–2). Each bar represents an average of the relative abundance of sequences from triplicate soil samples. For fungal classification the colored dot corresponds to the phylum for that taxa, green = Ascomycota, blue = Basidiomycota, red = Glomeromycota, and purple = Zygomycota.
Dominant fungal taxa
Fungal classes were more variable with soil age than what was found for the bacterial phyla. Particularly, the dominant taxonomic groups in the older paleosol samples were strikingly different from those in both the younger paleosol and the modern soils. The four dominant phyla detected were Ascomycota, Basidiomycota, Glomeromycota, and Zygomycota (Figure 3b). Of these, Ascomycota was prevalent in Paleosols 3 and 2 while Basidiomycota was the most abundant phylum in the Palesols 5 and 4, and the modern soils (Figure 3b). Detection of Glomeromycota was sporadic and did not exhibit any clear pattern relating to age of the soil (Figure 3b). Members from the phylum Zygomycota were detected in Paleosol 2 and the modern soils (Figure 3b). An unidentified Ascomycota class was more abundant in Paleosols 3 and 2 and members from the class Sordariomycetes were most abundant in Paleosol 2 (Figure 3b). At the class level, Pezizomycetes were more abundant in the older Paleosols 7 and 6 (Figure 3b). In fact, Pezizomycetes was the only class detected in Paleosol 7 (Figure 3b). Twenty-seven significant fungal genera were identified through indicator species analysis. Though none were significant for Paleosols 7 through 4, Ramariopsis (IV = 0.993, p = 0.0029) and Cudoniella (IV = 0.814, p = 0.0224) were identified as indicators from Paleosol 3 (Supplemental Table S2, available online). Additionally, Paleosol 2 had three indicator genera Monosporascus (IV = 0.998, p = 0.0045), Trichothecium (IV = 0.977, p = 0.0161), and Puccinia (IV = 0.812, p = 0.0224; Supplemental Table S2, available online). The modern soil had 22 indicators, with the best indicator being Troposporella (IV = 1, p = 0.0004), a member of the phylum Ascomycota (Supplemental Table S2, available online).
Archaeal presence in amplicon sequencing data
Because archaea play an important role in biogeochemical cycling, such as ammonia oxidation and methane cycling, we attempted to analyze the archaea present in the modern soils and paleosols through the 16S rRNA gene dataset. The detection of archaea was limited, represented by less than 1% of the entire dataset. Additionally, class representation was the highest taxonomic resolution that could be achieved, possibly due to the scarcity of representation in the database or likely bias in primer set used or sequencing (Wear et al., 2018). Therefore, we discuss their role in more detail using the metagenomics dataset.
Taxonomic analysis of the paleosol chronosequence using metagenomics
We performed shotgun sequencing on pooled DNA extracted from each of the paleosol samples to further define taxonomic diversity and identify gene function. After quality control measures, including removal of exact duplicates and joining of paired end sequences, the number of reads for Paleosols 7, 6, 5, 4, 3, and 2 were 89.6, 82.2, 44.0, 49.8, 30.6, and 39.2 M respectively (detailed in Supplemental Table S1 and Figure S2, available online). The sequences related to taxonomy were dominated by bacteria, with the top five most abundant phyla being Proteobacteria, Actinobacteria, Acidobacteria, Chloroflexi, and Gemmatimonadetes (Figure 4). This differs slightly from our amplicon sequencing results which identified the top five phyla as Proteobacteria, Actinobacteria, Planctomycetes, Chloroflexi, and Bacteriodetes (Figure 3a). Overall, there was a decrease in taxonomic diversity from the oldest to the youngest paleosol (Figure 4 and Supplemental Figure S2, available online). In particular, Paleosols 7 and 6 exhibited a compelling increase of identified taxa, more than triple the number when compared to the younger paleosols. These paleosols also had significantly higher diversity based on Shannon and Simpson diversity indices (Figure 4 and Supplemental Figure S2, available online). Members from the groups Ignavibacteriae, Holophagae, Chlamydiae, Lentisphaerae, and Deinococcus-Thermus were present in Paleosols 7 and 6, but were not detected in Paleosols 5 through 2, suggesting the reduction in abundance or loss of those organisms during progression to modern times. Members from the groups Nocardia, Friedmanniella, Prauserella, Sciscionella, and Lactobacillus were detected in at least one of the younger paleosols tested.

Phylogenetic tree showing the log scale taxonomic abundance of bacterial sequences retrieved from the paleosols using shotgun sequencing. The innermost ring is the phylogenetic tree. The log abundances of the sequences are shown from youngest paleosol (red) to oldest paleosol (purple). Results are derived from one replicate of pooled DNA from each of the paleosols. The outermost ring indicates the taxonomic grouping.
Across all paleosols tested, 1613 individual species were identified, the majority of which were only detected in Paleosol 7 or 6. Of these, 154 species were found in all paleosols, 1200 species were found in Paleosol 7 or 6, 39 of them were found only in one of the young paleosols (Paleosols 5 through 2), and two were detected in all of the young paleosols (Promicromonospora sukumoe and Candidatus Nitrocosmicus oleophilus). Though the diversity in Paleosols 7 and 6 was higher than Paleosols 5 through 2, the vast majority of species contributed less than 1% to the total sample abundance (Supplemental Figure S3, available online). Therefore, we examined the species that contributed more than 1% of the total abundance and compared them between paleosols. Acidobacteria bacterium RIFCSPLOWO2_02_FULL_67_36 (Aci380), Chloroflexi bacterium CSP1-4 (Chl421), and Gemmatirosa kalamazoonesis (Gem201) were detected in all of the paleosols (Supplemental Figure S4 and Table S3, available online). Three groupings emerged that harbored common species: Paleosols 7 and 6, Paleosols 5 and 4, and Paleosols 3 and 2 (Supplemental Figure S3, available online). In fact, Paleosols 5 and 4 by far had the most similarity sharing 95% of their top 1% species, as compared to Paleosols 7 and 6 which shared 82% of their most abundant species. Deltaproteobacteria bacterium RIFCSPLOWO2_12_FULL_60_19 (Pro126) and Betaproteobacteria bacterium RBG_16_66_20 (Pro396) were unique to Paleosols 7 and 6 (Supplemental Figure S3 and Table S3, available online). Additionally, Gaiella sp. SCGC AG-212-M14 (Act365) was unique to Paleosols 5 and 4, and Candidatus Solibacter usitatus (Aci232) was unique to Paleosols 3 and 2 (Supplemental Figure S3 and Table S3, available online).
Archaeal reads represented up to 2.3% of the total reads that matched the non-redundant database (Supplemental Table S1, available online). The highest proportion of reads was found in Paleosol 7 which also had the greatest diversity of Archaea. More archaeal groups, particularly from the Euryarchaeota phylum, class Thermoplasmata, were present in Paleosols 7 and 6, which were the oldest tested (Supplemental Figure S4, available online). Candidate archaeal group Nitrocosmicus oleophilus was detected in Paleosols 5 through 2, Nitrososphaera gargensis was detected in Paleosols 5, 4, and 2, and Nitrososphaera everladensis was detected in Paleosol 2 (Supplemental Figure S4, available online).
Functional analysis of the paleosol chronosequence using metagenomics
To provide a more biologically relevant grouping of the functional families output from the SEED database, shotgun reads that aligned to the NCBI non-redundant protein database also had their SEED functional classifications aggregated into four superfamilies based on common functional properties. Cell metabolism is a collection of functional genes related to the direct operation and metabolism of individual cellular mechanisms. These functional families are responsible for the maintenance of nucleic, amino, and fatty acids within the cells as well as more specialized activities important for cellular function. The environmental metabolism superfamily includes functional genes that interact with or are the framework for the metabolism of essential nutrients and molecules, derived nutrients and substrates, and deposited substrates found in various environmental matrices. Principal among these are carbon (carbohydrates), nitrogen, phosphorus, and sulfur metabolism. This family also includes more complex molecules such as aromatic compounds and nucleotide sugars. The housekeeping superfamily encapsulates the functions necessary for the general maintenance of cellular function. Finally, we grouped functions relating to survival and virulence into the virulence and stress response superfamily. This category also includes functions which respond to environmental and anthropomorphic stresses. The most obvious difference was that for most of the functional traits, on average, 2.3% of the raw reads from Paleosols 7 and 6 matched the modern database and were classified into SEED families, while 14.5% of the raw reads from Paleosols 5 through 2 were classified into SEED families (Supplemental Table S4, available online), indicating that the younger paleosols harbored significantly more genes than the older paleosols that aligned to known functions in the SEED classification database. The decrease in the number of different genes suggests a decrease in functional diversity in the older paleosols (Paleosols 7 and 6).
The functional data contrasted with the taxonomy data and revealed an increase in diversity in the older paleosols (Figure 4). Therefore, while taxonomic diversity decreased near modern times, functional potential (defined here as genes identified as similar to those with known functions in the modern database) increased (Figures 5–8). Of the dormancy and sporulation genes surveyed, all samples showed a predominance of genes associated with sporulation, upkeep, and germination. These were most prevalent in Paleosol 3. Paleosol 3 also had higher relative abundance of genes involved in stress response such as SigmaB stress response regulation, selenite and selenite uptake, choline and betaine production, and uptake as a form of osmoprotection. Though the direct counts of genes were generally lower for the older paleosols, the relative abundance of genes for oxidative stress protection, reactive oxygen species mitigation, and DnaK heat shock mitigation were higher in the older paleosols. A replicated assessment of these genes would need to be conducted to test the significance of the trends we observed here.

Metabolic tree showing selected functional genes. In the innermost ring is a tree of the functional genes tested. The next ring is the functional categories relating to cellular metabolism present in the database. The log abundances of the sequences are shown from youngest paleosol (red) to oldest paleosol (purple). The next outermost rings are a heat map of the z scores from the youngest paleosol (inner) to oldest paleosol (outer). Orange indicates a high abundance of that particular gene, while blue indicates a low abundance. The outermost ring is a code for the particular function listed in Supplemental Table S5, available online.

Metabolic tree showing genes related to Environmental Metabolism. In the innermost ring is a tree of the functional genes tested. The next ring is the functional categories relating to environmental metabolism present in the database. The log abundances of the sequences are shown from youngest paleosol (red) to oldest paleosol (purple). The next outermost rings are a heat map of the z scores from the youngest paleosol (inner) to oldest paleosol (outer). Orange indicates a high abundance of that particular gene, while blue indicates a low abundance. The outermost ring is a code for the particular function listed in Supplemental Table S6, available online.

Metabolic tree showing genes related to Housekeeping. In the innermost ring is a tree of the functional genes tested. The next ring is the functional categories relating to housekeeping genes present in the database. The log abundances of the sequences are shown from youngest paleosol (red) to oldest paleosol (purple). The next outermost rings are a heat map of the z scores from the youngest paleosol (inner) to oldest paleosol (outer). Orange indicates a high abundance of that particular gene, while blue indicates a low abundance. The outermost ring is a code for the particular function listed in Supplemental Table S7, available online.

Metabolic tree showing genes related to Virulence and Stress Response. In the innermost ring is a tree of the functional genes tested. The next ring is the functional categories relating to virulence and stress response genes present in the database. The log abundances of the sequences are shown from youngest paleosol (red) to oldest paleosol (purple). The next outermost rings are a heat map of the z scores from the youngest paleosol (inner) to oldest paleosol (outer). Orange indicates a high abundance of that particular gene, while blue indicates a low abundance. The outermost ring is a code for the particular function listed in Supplemental Table S8, available online.
Discussion
Major microbial groups found throughout the Holocene
There were discrete shifts in the bacterial and fungi between modern soils and the older paleosols, suggesting differential selective pressures driving community composition in these soils. Though the study is limited spatially to one chronosequence and the paleosol microbiome may not harbor the same microbes at the time of deposition, the findings offer fundamental knowledge of early microbiome communities and potential functional attributes as early as 8000 years ago because the cold, dry conditions likely promoted the preservation of substrates, which drive microbial community taxonomy and function (Glassman et al., 2017; Nunan et al., 2017). Soil properties such as soil moisture and soil pH were significantly different between the modern and ancient soils. Though organic matter content was similar, the thickness of the organic horizons were different in the paleosols. The quality of organic matter, and whether it was labile or stable, was likely very different between the samples (Petsch et al., 2001). Furthermore, all three soil properties tested in this study are known drivers of microbial taxonomy in soils (Howard and Howard, 1979; Riveros-Iregui et al., 2007; Rousk et al., 2009), making it difficult to assess whether the soil conditions or the age of deposition influenced community dynamics between the modern soils and the paleosols. A more detailed chemical analysis of the paleosols would inform if soil conditions are the primary driver of microbial taxonomy in these soils.
Despite differences in soil properties, bacterial phyla Proteobacteria and Actinobacteria, and fungal classes Agaricomycetes and Leotiomycetes were dominant in most of the soils tested, indicating the persistence of these groups through time. Proteobacteria are a large phylum of bacteria that have broad morphological and metabolic diversity, particularly for nitrogen and sulfur cycling (Kersters et al., 2006). In support of this, the most abundant families were Hyphomicrobiaceae, Bradyrhizobiaceae, Rhodospirillaceae, and Oxalobacteraceae, which are known to conduct nitrogen cycling reactions. For instance, generalists Hyphomicrobiaceae and Bradyrhizobiaceae use and accept many forms of nitrogen for fixation and denitrification (Anderson et al., 2011) and specialists such as nitrogen fixing Rhodospirillaceae (Herbert, 1999) and ammonium consuming Oxalobacteraceae (Gaspar et al., 2016) introduce organic nitrogen to the soil system. Though less represented, there were members from the Rhizobiales order which are involved in the oxidation of sulfur (Li et al., 2014). Much like Proteobacteria, Actinobacteria also contain a wide array of members that degrade complex carbon substrates and make bioactive materials (Ventura et al., 2007). Additionally, Norcardioidaceae, a newly characterized family, was most abundant in the older paleosols (Paleosols 7 and 6). This family has many members capable of degrading labile substrates (i.e. fructose, mannose, and arabinose) and more recalcitrant ones (i.e. cellulose, xylan, and pectin; Tóth and Borsodi, 2014). Deinococcus-Thermus was detected in the older paleosols tested and has also been found in Holocene-aged samples collected in the central Yukon (Saidi-Mehrabad et al., 2020). Additionally, Pseudonocardiaceae, capable of degrading lignocellulose and chitin (Yeager et al., 2017), was detected in all of the paleosols tested, possibly indicating the presence of complex carbon in the ancient soils. Finally, members from the fungal classes Agaricomycetes and Leotiomycetes were present in most of the paleosols. Agaricomycetes are within the phylum Basidiomycota and decay complex vegetative matter such as wood lignin (Morgenstern et al., 2008), suggesting an inherent and long-lasting requirement to degrade complex substrates in these ancient soils regardless of climate inputs. Additionally, Leotiomycetes, a class of ascomycete fungi, has been found in boreal forest soils with low nitrogen contents (Sterkenburg et al., 2015).
While common bacterial and fungal groups were present in all of the soils tested, the abundance of these organisms between modern and ancient soils, and between the younger and older paleosols differed. While the paleosol microbial communities are unlikely to be the exact, original microbes present at the time of deposition, we believe that the cold environmental conditions for preservation and presence of legacy substrates maintained a closely related community. This enables inferences to be made about what kind of legacy microbial processes may have been present and associated with ecosystem function and broad shifts in vegetation during the period the soils were deposited. Chloroflexi, which have been also detected in alpine tundra wet meadow soils and Antarctic dry valleys (Costello and Schmidt, 2006; Pointing et al., 2009), and Gemmatimonadetes, oligotrophic bacteria native to dry environments (Cederlund et al., 2014; DeBruyn et al., 2011) were found to be more abundant in the paleosols compared to the modern soils. Ammonia oxidizing archaea from the genera Nitrocosmicus, also found in arable Canadian soils (Lay et al., 2018), and Nitrosphaera also found in a garden soil in Vienna, Austria (Tourna et al., 2011) were detected in paleosols younger than 5000 calBP. Two classes of fungi, Eurotiomycetes and Archaeorhizomycetes, were also found in paleosols younger than 5000 calBP. These fungi from the phylum Ascomycota are fairly cosmopolitan, found in both tropical ecosystems (Tedersoo et al., 2014) as well as Arctic and Antarctic soils (Cox et al., 2016). Of particular interest was the presence of members from the fungal class Pezizomycetes in paleosols older than 7000 calBP because they were also detected in modern Arctic tundra tussock and shrub soils (Wallenstein et al., 2007). The low lying vegetation aligns with the herbaceous tundra that was dominant in the mid-Holocene (Ager, 1983; Bigelow and Edwards, 2001; Edwards et al., 2001). Additionally, a sub-member of the Pezizomycetes class, Sphaerosporella, has been observed in soils that experienced frequent fires (Mikita-Barbato et al., 2015; Visser, 1995). A high percentage of P. mariana in interior Alaska vegetation assemblages was also linked to increased wildfire activity over the last 4000 years. Fire frequency increased from intervals of 300 years before 4000 years ago to intervals of 80 years in the last millennia (Hu et al., 2006; Kelly et al., 2013).
Changes observed through time and implications for nutrient cycling
To reveal differences in microbial taxonomy and function between the younger and older paleosols, we conducted a targeted investigation of the paleosol chronosequence alone. The aim of our targeted investigation was to reduce the possible effects of different soil properties on patterns in soil biodiversity. The paleosols representing the time period from 8000 to 7000 calBP (i.e. Paleosols 7 and 6) were more taxonomically diverse than the younger paleosols (i.e. Paleosols 5 through 2), indicating that diversity decreased through time. Possible explanations for this outcome include the absence of ancient functional categories in modern databases, degradation of DNA, or the presence of novel microbial functions in ancient soils. The absence of ancient functional categories is more probable, as these databases may be biased toward the degradation of labile as opposed to refractory ancient organic matter. Paleosols 5 through 2 also were more acidic, aligning with vegetation records of sphagnum growth resulting from dense tree vegetation (Kelly et al., 2013). This vegetation shift could influence the composition of the microbiome as is illustrated in this study.
Paleosols 7 and 6 harbored a unique species with high homology to Betaproteobacteria bacterium RIFCSPLOWO2_02_FULL_67_26, an unclassified Proteobacteria capable of cycling sulfur and found in an aquifer system (Anantharaman et al., 2016). Additionally, Paleosol 7 contained sequences related to Deltaproteobacteria bacterium RIFCSPLOWO2_12_FULL_60_19, which was also found in an aquifer (Anantharaman et al., 2016). Additionally, sequences related to Gemmatirosa kalamazoonesis (DeBruyn et al., 2013) was found abundantly and consistently in all of the paleosols. Though the ecological functions and survival strategies of many Gemmatimonadetes members are unknown, their ubiquitous presence in environmental samples (DeBruyn et al., 2013) suggests a complex and variable repository of mechanisms for adaptation. Interestingly, a species closely aligned to Thermoleophilum album, a thermophile that also grows in non-thermal environments and uses n-alkanes as a sole substrate (Zarilla and Perry, 1984) was found exclusively and abundantly in Paleosols 5 through 2. Its presence in Paleosol 5 may provide evidence of a natural or anthropomorphic source of hydrocarbons in the area beginning approximately 5000 calBP that warrants further investigation. Although the species identified may only be the closest modern analog to the actual species, these inferences are a jumping off point to inform further studies into what the community may tell us about the past environment.
Overall, the younger paleosols exhibited greater functional diversity than their older counterparts even though the younger ones harbored lower taxonomic diversity. Therefore, fewer organisms had a more versatile genetic capacity to break down select substrates and respond to stressors. Of all of the functional attributes tested, we were most interested in those that indicated microbial survival and environmental metabolism. Specifically, genes regulating oxidative stress protection, reactive oxygen species mitigation, and DnaK heat shock mitigation were more abundant in the older paleosols. DnaK is a protein involved in counteracting protein carbonylation, or the modification of side chains of amino acids caused by oxidative stress (Fredriksson et al., 2005). Presence of the DnaK protein suggests selective pressures to combat free radicals in the system. Furthermore, the potential for functional redundancy combined with low biodiversity in the older paleosols may suggest that a similar function was performed by multiple groups of organisms. Because microorganisms regulate nutrient cycling as the primary producers of the system, they in turn greatly affect nutrient availability for higher trophic organisms such as plants (Setälä et al., 2005). The loss of species, as may have occurred in this system, could be linked to elevated sensitivities to environmental changes such as moisture loss during drought (Yin et al., 2000). Future work could include investigating plant indicators in these paleosols to make a direct comparison to microbial patterns observed in this study. If the process is dictated by few species, then when species abundance declines, critical functions may be lost. A replicated assessment of these samples is needed to corroborate the initial trends we observed. Additionally, future directions include assessing the quality of organic matter in the ancient soils and the active microflora using the relic carbon.
Conclusion
As expected, the microbiome from modern soils differed from those in the paleosols, demonstrating that microorganisms in ancient soils are unique. Following a deeper investigation of the microbiome in the paleosols, our results showed a significant decrease in biological diversity, yet a significant increase in biological function from 7149 to 4468 calBP, aligning very closely with the shift to modern forest composition. These results suggest functional redundancy in ancient soil ecosystems, indicating that fewer organisms were able to perform varied functions in the soil ecosystem. The microbiomes in the older paleosols may have exhibited a higher degree of functional redundancy despite harboring a more diverse community. In future climate and ecosystem scenarios, the absence of a subset of microorganisms could in turn affect nutrient cycling for larger organisms. These findings are paramount to better understand fundamental shifts in soil microbiota occurring under transitional climates.
Supplemental Material
sj-docx-1-hol-10.1177_09596836221101249 – Supplemental material for Alaskan palaeosols in modern times: Deciphering unique microbial diversity within the late-Holocene
Supplemental material, sj-docx-1-hol-10.1177_09596836221101249 for Alaskan palaeosols in modern times: Deciphering unique microbial diversity within the late-Holocene by Robyn A Barbato, Robert M Jones, Thomas A Douglas, Julie Esdale, Karen Foley, Edward J Perkins, Shelby Rosten and Natàlia Garcia-Reyero in The Holocene
Footnotes
Acknowledgements
The authors would like to thank Dr. Benjamin Gaglioti for kindly providing a review of our manuscript, and the editors and anonymous reviewers whose comments greatly improved the manuscript.
Author contributions
RAB, RMJ, TAD, JE, and NG collected the samples; RMJ and KF processed the samples; RAB, RMJ, and SR analyzed the samples; RAB, TAD, EJP, and NG were involved in the conception, design, and interpretation of the data; RAB wrote the manuscript with inputs from RMJ, TAD, JE, EJP, SR, and NG.
Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request. The sequence datasets generated during the current study have been submitted to the National Center for Biotechnology Information repository (NCBI;
) Sequence Read Archive (SRA). The Bacterial 16S, Fungal ITS, and Shotgun Metagenome sequences can all be found under the bioproject number PRJNA438924 once accepted for publication.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Funding support was provided from the United States Army Engineer Research and Development Center, Basic Research Program Office, Environmental Quality and Installations.
Supplemental material
Supplemental material for this article is available online.
References
Supplementary Material
Please find the following supplemental material available below.
For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.
For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.
