Abstract
Successful conservation planning for the Canadian boreal forest requires biodiversity data that are both accessible and reliable. Spatially exhaustive data is required to inform on conditions, trends and context, with context enabling consideration of conservation opportunities and related trade-offs. However, conventional methods for measuring biodiversity, while useful, are spatially constrained, making it difficult to apply over wide geographic regions. Increasingly, remotely sensed imagery and methods are seen as a viable approach for acquiring explicit, repeatable and multi-scale biodiversity data over large areas. To identify relevant remotely derived environmental indicators specific to biodiversity within the Canadian boreal forest, we assessed indicators of the physical environment such as seasonal snow cover, topography and vegetation production. Specifically, we determined if the indicators provided distinct information and whether they were useful predictors of species richness (tree, mammal, bird and butterfly species). Using cluster analysis, we also assessed the applicability of these indicators for broad ecosystem classification of the Canadian boreal forest and the subsequent attribution of these stratified regions (i.e. clusters). Our results reveal that the indicators used in the cluster creation provided unique information and explained much of the variance in tree (92.6%), bird (84.07%), butterfly (61.4%) and mammal (22.6%) species richness. Spring snow cover explained the most variance in species richness. Results further show that the 15 clusters produced using cluster analysis were principally stratified along a latitudinal gradient and, while varied in size, captured a range of different environmental conditions across the Canadian boreal forest. The most important indicators for discriminating between the different cluster groups were seasonal greenness, a multipart measure of climate, topography and land use, and wetland cover, a measure of the percentage of wetland within a 1 km2 cell.
I Introduction
The Canadian boreal forest (or just ‘boreal’) has marked ecological, cultural and economic importance. Extending over 5.5 million km2 (Brandt, 2009), Canada’s boreal provides many essential ecosystem services such as regulating regional and global climate, sequestering large amounts of carbon, and purifying water. It is one of the world’s largest contiguous forests, making up 25% of the world’s remaining large frontier forests (Lee et al., 2003). The Canadian boreal’s expansive forests, wetlands and lakes also offer essential habitat for a diverse range of plants and animals, including over 90 endangered species such as the whooping crane (Grus americana), woodland caribou (Rangifer tarandus) and grizzly bear (Ursus arctos horribilis). The boreal forest also supports continental populations of many bird species, with 450 bird species directly depending on it and up to five billion birds migrating to the region annually (NRCan, 2011). Forest cover is dominated by a number of cold-tolerant tree species from genera such as fir (Abies spp.), spruce (Picea spp.), tamarack (Larix laricina), poplar (Populus spp.) and pine (Pinus spp.).
The boreal forest contains an abundance of natural resources both renewable, such as productive soils and timber stocks, and non-renewable, such as mineral deposits and energy reserves. For natural resources such as timber, only a subset of the boreal is subject to harvesting, most commonly in the south. Mining and oil/gas exploration/extraction are of more regional importance, but are of notable economic benefit, with an estimated potential net market value at $14.5 billion CAD (est. 2002) per annum (Anielski and Wilson, 2009). Human activity, often in the form of land conversion (e.g. agriculture, mining, road construction), continues to expand into Canada’s boreal forest. Depending on opportunity and market forces, this ongoing expansion, coupled with the impacts of climate variability, has the potential to expedite the loss of biodiversity and habitat within the region. Much of the northern reaches of the boreal forest is subject to limited management activities, with forest productivity generally too low and transportation costs too high to support industrial harvesting (Wulder et al., 2007). Further, fire suppression is not practised in the northern boreal, with few ignitions leading to large areas burned annually, on average approximately 2 million ha (Stocks et al., 2003). In southern areas of the boreal, where human access is easier, a wide range of anthropogenic disturbances are present, including land-use changes associated with expansion of urban areas and agricultural activities. In these areas, active forest management occurs where forests following harvest are replanted and monitored to ensure sufficient natural regeneration. Thus, harvested forests can follow a successional trajectory and return to a pre-harvest state. In areas where fire rates are low, whether due to climate or effective fire suppression, forest harvesting and other industrial activities can be the primary agents of landscape level disturbance (Schneider et al., 2003).
In order to conserve and protect the boreal ecosystem and safeguard against biodiversity loss, the creation of additional protected areas is an important option. Approximately 9.8% (977,621 km2) of Canada, comprising 8.1% of the boreal (448,178 km2), is currently under some form of protection. While the appropriate level of protection is a source of much discussion, this figure falls short of the national protection target of 12% often considered as a minimum standard (Brundtland Commission, 1987). Recognition of the uniqueness of the boreal, in terms of both its ecological value and its remaining high conservation potential, has inspired debate around the appropriate level and type of protection. For instance, the Canadian Boreal Initiative promotes the protection of 50% of Canada’s boreal forest through a series of large interconnected protected areas (CBI, 2011). The governments of Québec and Ontario are developing plans to protect up to 50% of the more northern areas of boreal forest within their jurisdictions (FNP, 2011; Plan Nord, 2011) mostly beyond the northern limit of commercial forestry operations. Due to existing land use and the complexity of the trade-offs required, the potential for substantial new areas of protected wilderness in more southern areas is limited. Andrew et al. (2012) indicate that much of the boreal, due to the lack of access and low productivity, is essentially functioning as though under protection; that is, the areas are de facto protected areas. While this point can be debated, the need for protection of more southerly locations at greater risk to conversion and as harbours of greater diversity, may merit more emphasis. Area-based protection targets may lead to the protection of more remote and low productivity environments at the expense of more spatially limited, yet at greater risk, ecosystems. Consideration of value-based, as well as area-based, protection targets and strategies is warranted (Andrew et al., 2011). In 2010, the Canadian Boreal Forest Agreement (CBFA) was signed between the Forest Products Association of Canada (FPAC) and prominent environmental organizations, whereby a key commitment was made to collaboratively scope and encourage the completion of new protected areas within the boreal (CBFA, 2011). CBFA recommendations for the expansion of protected areas are given due consideration by the relevant governments (provincial, federal and territorial), who have ultimate stewardship responsibilities and oversee park legislation and policy. These initiatives highlight the growing recognition of the boreal’s importance and urgency for protection.
While the actual extent of the Canadian boreal forest to be conserved remains uncertain, the identification and monitoring of biodiversity components represents a logical first step to defining and maintaining an effective future protected area network. Presently, protected areas in Canada are slightly biased to the protection of low productive environments (Andrew et al., 2011) typically found in the northern boreal or alpine regions. However, assessing biodiversity presents a challenge for Canada’s boreal forest given its size, remoteness and difficult terrain. Conventional field-based techniques to acquire biodiversity data are logistically difficult and expensive. Remote sensing methods and imagery from spaceborne/airborne platforms offer a practicable and cost-effective means of directly and indirectly characterizing certain aspects of biodiversity over large areas that is spatially explicit, repeatable and multi-scale (Kerr and Ostrovsky, 2003; Turner et al., 2003). Land-cover maps are examples of direct diversity indicators; these are used to model and predict features of biodiversity such as species composition or abundance (Turner et al., 2003). Indirect diversity indicators measure physical environmental variables that affect species distributions and communities and typically include climate, topography, vegetation productivity and disturbance metrics (Duro et al., 2007; Turner et al., 2003).
Over the last decade, there has been an increasing recognition of the potential and applicability of remote sensing derived indicators for assessing biodiversity, particularly for conservation planning applications (Buchanan et al., 2008; Kerr and Ostrovsky, 2003; Pettorelli et al., 2005; Turner et al., 2003). In the Canadian boreal forest, numerous studies have successfully used indicators to map or model species diversity or richness within taxa (Table 1). For instance, Coops et al. (2009b) used remote sensing derived estimates of productivity, land cover and topography to predict bird species richness across the province of Ontario, Canada. They found that the environmental descriptors were useful for predicting bird species richness, where land cover was indicated as being the driving variable of species richness while vegetation productivity and energy indicators played a pivotal role in defining the amount of species within different habitat types. In a Canada-wide study, Kerr et al. (2001) investigated butterfly species richness and its relationship to energy and climate data versus remotely derived heterogeneity data. The authors found that >90% of the variability in species richness was explained by land-cover heterogeneity combined with secondary effects of climate.
A sample of approaches used to map or model biodiversity (species richness) using remotely sensed data in Canada for six general taxonomic classes.
From these studies, we conclude that remote sensing technology can be used to assess biodiversity over large areas such as the Canadian boreal forest. However, determining which biodiversity indicator(s) to use is not trivial, particularly if the goal is to monitor the spatially and temporally complex mosaic of flora, fauna and ecological processes that make up the boreal region.
Ensuring that new reserves contribute optimally to the representation of regional biodiversity is a fundamental design goal of systematic conservation planning (Margules and Pressey, 2000). To meet this goal, remotely derived biodiversity indicators are often employed to classify regions into areas with similar biodiversity characteristics, especially if there is a shortage of species distribution data (Trakhtenbrot and Kadmon, 2005). These groupings, typically labelled as regionalizations, ecoregions, environmental domains or clusters, are associated with unique combinations of environmental conditions, which in theory should be representative of species diversity (Belbin, 1993, 1995; Mackey et al., 1988; Trakhtenbrot and Kadmon, 2005). These groups will be referred to as clusters from this point forward. In targets-based approaches to conservation planning (Carwardine et al., 2009), the objective is to delineate a protected areas network that encompasses a minimum proportional area of each cluster within a protected areas network. Supplementary indicators that describe the state of these distinct clusters, such as forest fragmentation, road density or other measures of disturbance, can also be used to verify the uniqueness and assess the suitability of areas for inclusion in conservation networks (Wulder and Franklin, 2007).
The goals of this research are to review a variety of spatially explicit and remotely derived biodiversity indicators, to assess their suitability for characterizing biodiversity within the Canadian boreal forest, and to explore their potential uses in conservation planning. To achieve these goals we (1) compare a variety of biodiversity indictors based on freely available data sets, primarily from MODerate resolution Imaging Spectrometer (MODIS), (2) evaluate the utility of biodiversity indicators for a classification of the boreal into environmentally distinct clusters, and (3) determine if supplementary indicators are useful for further describing and understanding the developed clusters.
II Background
In a review of remote sensing for national biodiversity monitoring in Canada, Duro et al. (2007) recommended four key indicators: vegetation productivity, disturbance, land cover, and the physical environment (e.g. topography). Using these broad categories as the basis, it is beneficial to judiciously consider additional indicators that may be important regionally. For instance, in Canada’s boreal, seasonal snow cover influences climate and hydrological processes (Pulliainen, 2006) and can be used to gauge the amount of vegetation a landscape can support throughout the year for food resources and animal habitat (Coops et al., 2009a). As well, the boreal peatlands warrant special conservation attention as they host a major proportion of Canada’s plant biodiversity and provide habitat to many animals, some of which are restricted to those environments (Warner and Asada, 2006). As wetlands are not always well identified by standard remote-sensed products, there is a need for specialized indicators of their presence.
Adapting from Duro et al. (2007), we identified and developed a set of direct and indirect environmental variables designed to map biodiversity in the Canadian boreal forest. Due to the large spatial extent of the boreal, we focused on freely available, ecologically meaningful products derived from on broad-scale sensors (≥30 m resolution), as appropriate for national or continental assessments. The coarse and medium spatial resolution remotely sensed data explored in this paper measure (1) land cover, (2) topography, (3) vegetation productivity, (4) disturbance and fragmentation and (5) snow cover.
1 Land cover
Land-cover maps derived from remote sensing differentiate broad plant communities. This mapping provides an opportunity for directly assessing plant biodiversity, or indirectly through species distribution models having land cover as a covariate. Recently, numerous broad-scale studies have successfully employed remote sensing to provide land-cover type information on Canada and the boreal in general (Table 2). For instance, a Canada-wide land-cover map of forested areas with 23 land-cover classes was produced using Landsat-7 ETM data (25 m) meeting a target overall accuracy of 85% in tested areas (Wulder et al., 2007, 2008a).
Examples of studies which have used remotely sensed or other geospatial data to map or model environmental indicators related to biodiversity in the Canadian boreal forest.
Direct mapping of vegetation species assemblages has also been used to derive information related to regional biodiversity across the Canadian boreal. For example, forest species composition of the western subarctic treeline was mapped using 1 km AVHRR data with a canopy reflectance model, an approach that uses reflectance and transmittance values of different canopy components as well as known structural relationships to estimate species composition and predict change over time (Olthof and Pouliot, 2010). The study’s results revealed a good spatial correlation (0.85 to 0.98) between the five treeline zones and recently validated MODIS vegetation data. Similarly, a 250 m categorical land-cover product (Latifovic et al., 2008) was an important covariate in models of boreal-wide distributions of 103 songbird species (Cumming et al., forthcoming). In general, land-cover maps play an important role in conservation planning and reserve design by providing both direct and indirect indices of biodiversity. As indicators of habitat type they can be directly used in setting conservation targets for specific habitat types.
2 Topography
Topography generates environmental gradients that influence regional biodiversity by shaping species distribution patterns and productivity and influencing natural disturbances (Dorner et al., 2002; Swanson et al., 1998). Numerous topographic indicators and variables derived from elevation data (Table 2) are inputs to ecological mapping (MacMillan et al., 2004). For example, Temini et al. (2010) successfully used a DEM in conjunction with microwave and vegetation data to model soil wetness for the Peace Athabasca Delta in central Canada. Their findings showed a 70% correlation between the soil wetness map and observed precipitation data. A separate study by Anderson and Ferree (2010) found that topographic data was a useful predictor of species diversity across the Maritime Provinces and southeastern Quebec. Specifically, range in elevation, paired with geological variables and latitudinal position, was able to predict species diversity with high certainty (adjusted R2 of 0.94).
3 Vegetation productivity
Vegetation productivity, measured as rate of gross or net carbon fixation, is directly linked to biodiversity. More productive areas provide more available energy and energy pathways than less productive areas, hence supporting larger populations of species and higher species diversity (Walker et al., 1992). Vegetation productivity derived from remote sensing can be considered a predictor of species richness and can identify regional biodiversity ‘hotspots’ (Rocchini et al., 2007). A suite of remotely sensed indicators related to vegetation productivity have been developed and applied to the boreal forest (Table 2). The Normalized Difference Vegetation Index (NDVI) derived from AVHRR data was used by Goward et al. (1985) to analyse seasonal vegetation patterns for North America. More recently, Coops et al. (2008) used productivity variables from MODIS data, including measures of the fraction of available incoming energy used by vegetation (fPAR) as the basis of a Dynamic Habitat Index (DHI). DHI, combined with indicators of topography and land cover, effectively models overall habitat diversity and regional biodiversity across Canada.
4 Disturbance and fragmentation
The main natural disturbances in the Canadian boreal forest are wildfire and insect outbreaks. Flooding, storms, pathogens and animal activity like that of moose (Alces alces) or beaver (Castor canadensis) (Engelmark, 1999; Esseen et al., 1997; Kuuluvainen, 2002) are also widespread, but of secondary and local importance in influencing the structure and composition of boreal ecosystems. Each process has its characteristic duration, frequency, intensity and spatial extent (Coops et al., 2007). Disturbances affect biodiversity through altering landscape structure and ecosystem function (Turetsky and Louis, 2006), which can also change habitat characteristics and impact species’ population dynamics (Kuuluvainen, 2002). Given the long duration and diffuse spatial extent of some disturbances such as gradual infestations of insect defoliators, accurate remotely sensed indicators can be difficult to develop (Coops et al., 2007). Stand-replacing disturbances, such as those caused by fire or harvesting, are more readily identified and can be successfully monitored using remote sensing technology. A Canada-wide study by Li et al. (2000), for instance, developed an algorithm for use with AVHRR satellite data (1 km spatial resolution) to monitor fire within the boreal. In addition to providing a consistent nationwide fire database, the study also demonstrated that it was possible to use a remote sensing approach to monitor fires in near real time. Mildrexler et al. (2009) evaluated disturbance in woody ecosystems across North America with an improved disturbance detection algorithm using MODIS satellite data for the years 2002–2006. Their method could detect the location and extent of wildfire, identify areas with downed trees resulting from large hurricanes and identify large logging disturbances.
Fragmentation, often defined as the breaking apart of contiguous areas of habitat (Fleishman and MacNally, 2007), represents another important aspect of landscape spatial pattern that can affect biodiversity. Habitats can become fragmented through disturbances brought about by both natural processes, such as fire and insect outbreaks, and anthropogenic activity, such as logging or road construction (Linke et al., 2007). Fragmentation can be viewed as a state indicating the juxtaposition of land-cover conditions over a land base, or can inform on process when considered over time. As with disturbance, remote sensing technologies have been employed to monitor fragmentation across Canada (Soverel et al., 2010; Wulder et al., 2008b, 2009).
5 Snow cover
Biodiversity is highly influenced by climate (Gaston, 2000). Climatic factors act as regional and global determinants of biodiversity by limiting species establishment and occurrence, and by regulating vegetation seasonality (Sarr et al., 2005). Climatic factors are highly predictive of spatial patterns in species richness at higher latitudes (Currie and Paquin, 1987). Compared to other type of variables, climate factors are, in aggregate, the most important determinants of bird species richness (Hawkins et al., 2003) and bird species’ distributions (Cumming et al., forthcoming) in the boreal forest. Of particular relevance to seasonally snow-covered areas like the Canadian boreal are climatic variables pertaining to snow, ice cover or albedo. In the case of mammals, snow can sometimes play an essential role in seasonal habitat requirements. For example, Copeland et al. (2010) used MODIS snow-cover composites in conjunction with interpolated temperature data to develop a more accurate map of the wolverine’s (Gulo gulo) current circumboreal range. Thus, by including snow cover, we can potentially describe the habitat of this species that, though widely distributed, occurs at very low densities and is of conservation concern.
III Study area
The study area is defined by the North American boreal zone (∼5.37 million km2) as described by Brandt (2009), but will not include water features and the southern hemiboreal subzone (Figure 1). Non-forested areas consist of (1) wetlands and barren ground (∼24%), (2) rivers and lakes (∼9%) and (3) Alpine (∼5%) (Brandt, 2009). The northern extent or boundary is defined by the northern tree limit.

Study area. (See colour version of this figure online).
IV Methods
Here we briefly describe available data sets, classification (cluster analysis) and statistical analysis. We used eight remotely derived indicator data sets in the classification (encompassing land cover, topography, vegetation production and snow cover) to create the clusters and seven indicators to characterize the classified clusters post hoc with respect to habitat configuration (including aspects of fragmentation) and anthropogenic disturbance. In addition, we employed four species richness data sets (tree, mammal, bird and butterfly) to assess the utility of remotely derived indicators for characterizing richness patterns and to provide additional description of the classified regions. We assembled all data sets into a common 1 km spatial resolution grid that contained 4,604,910 pixels.
1 Data
a Land cover
The MODIS land-cover product includes five categorical maps derived from observations collected over a period of a year at a 1 km spatial resolution (USGS, 2011). We selected the University of Maryland (UMD) classification which was based on data from 2004. In total there were 14 classes, which included five forest classes (evergreen needleleaf forest, evergreen broadleaf forest, etc.) and two shrubland classes.
We also used the Ducks Unlimited Canada hybrid wetland layer (HWL) to identify the locations of water and wetlands. The HWL is a multi-source product that classifies Canada’s landscape, minus the Arctic, into the three general classes of water, wetland and upland (DUC, 2010). We used two freely available and national data sets to derive the HWL: land cover (GC/AAFC, 2009; Wulder et al., 2008a) and CanVec (GC/NRCan/ESS/MIB/CTI-S, 2011). The land-cover component was generated from Landsat imagery and covers Canada’s forested and agricultural areas and the Northern Territories. Forest cover data was produced by the Earth Observation for Sustainable Development of forests (EOSD) project, which classified the forested areas of Canada (23 land-cover classes) using over 480 Landsat-7 Enhanced Thematic Mapper Plus (ETM+) images and covering over 80% of Canada (Wulder et al., 2008a). The National Land and Water Information Service (NLWIS) of Agriculture and Agri-Food Canada (AAFC) created the agricultural coverage. The Canadian Center for Remote Sensing (CCRS) provided land-cover information for the Northern Territories. CanVec is a topographic product in vector format produced and maintained by Natural Resources Canada. To create the HWL, we processed the land-cover and rasterized CanVec data sets to a common raster format (HWL classification scheme and projection) and combined them into a single hybrid layer at a 25 m spatial resolution (DUC, 2010). We then classified the combined raster into the final HWL class values using a hierarchical classification scheme (DUC, 2010). We exported the HWL wetland class to its own raster layer and resampled it from a 25 m to a 1 km spatial resolution, where each cell represents the percentage of water and wetland.
b Topography (ruggedness)
We used two sources of remote-sensed topographic data: (1) the 90 m NASA Shuttle Radar Topographic Mission (SRTM) and (2) the USGS Global 30-Arc Second Elevation Data Set (GTOPO30). We utilized the GTOPO30 data set (∼1 km spatial resolution) for that portion of the study area above 60° North latitude. We used the coefficient of variation (CV) of elevation (i.e. ruggedness) to better differentiate topography between different environments across the boreal. The CV was computed at a 1 km neighbourhood distance for each elevation data set (i.e. a 10 and 1 pixel neighbourhood distance for SRTM and GTOPO30, respectively). We then resampled the SRTM CV values to a 1 km spatial resolution and combined with the GTOPO30 data set. High CV values indicate areas with extreme changes in elevation, whereas low CV values represent areas with minimal elevation differences.
c Vegetation productivity
We computed vegetation productivity by the Dynamic Habitat Index (DHI) of Coops et al. (2008). The DHI is comprised of three indicators of vegetation dynamics derived from six years (2000–2005) of monthly MODIS data: annual primary productivity, annual minimum cover and seasonal greenness. We calculated annual primary productivity by summing monthly fPAR observations over each of the six years of data (2000–2005). These six components were then averaged to create a long-term annual productivity indicator. Similarly, we summed the annual minima of monthly fPAR observations to calculate the annual minimum cover for each year. Areas that maintain some degree of vegetation over the year will have positive minimum cover values, while those that do not (e.g. areas predominantly snow-covered) will have values near or equal to zero (Coops et al., 2008, 2009b). We then averaged these six annual minimum cover components to produce a long-term annual minimum cover indicator. We computed seasonal greenness by first dividing the annual standard deviation of monthly fPAR observations by the annual mean value to obtain an annual CV, which we then averaged over the six years of data. Areas with variable climate or limited productivity will have high seasonal greenness values, while those that are consistently productive (e.g. evergreen forests) or experience less extreme climate conditions will have low values (Coops et al., 2008, 2009b). Annual primary productivity can be interpreted as the summed greenness over a year, annual minimum cover as the landscape’s capacity to sustain populations over a year (Schwartz et al., 2006), and seasonal greenness as the combined assessment of climate, topography and land use (Coops et al., 2009b).
d Snow cover
MODIS snow-cover products have been available since 2000, at different spatial and temporal resolutions (Hall and Riggs, 2007; Riggs et al., 2006). We used the global, 0.05° (∼5 km) spatial resolution monthly Terra snow-cover product MOD10CM (Hall et al., 2006). This product estimates a mean monthly snow-cover extent by averaging the daily fractional snow-cover (FSC) extents of the MODIS product MOD10C1 (Riggs et al., 2006). To assess seasonal snow cover across the study area, we computed two average monthly values for each cell for the years 2000 to 2010 during (1) the autumn (September to November) and (2) spring (March to May) months. We then resampled these two data sets to a 1 km spatial resolution.
e Disturbance and fragmentation
To characterize landscape condition and the imprint of disturbances, we used six metrics of landscape pattern and an anthropogenic disturbance index. The 1 km spatial resolution pattern metrics were calculated by Wulder et al. (2008b) using the 25 m spatial resolution EOSD cover product from the year 2000. We selected metrics of the total abundance and spatial configuration of forest patches within 1 km cells (Table 3).
Landscape pattern metrics used to characterize cluster groupings.
Source: Wulder et al. (2008b); Soverel et al. (2010).
The number of forest patches and the proportion of all patches that are forested, combined with the proportional area of forest in the landscape (the relative area metric) are, in combination, indices of relative forest fragmentation (Wulder et al., 2008b). For instance, if a landscape contains many distinct patches of which a high proportion is forested, then the forested area is divided into many patches separated by non-forest, indicative of forest fragmentation (Wulder et al., 2008b). Alternatively, a relatively low proportion of forest patches would imply that the landscape is fragmented with respect to non-forested habitats, but that the forested area it does contain is not necessarily fragmented forest. The six landscape metrics represent different types and spatial arrangement of forested and non-forested habitat patches in total and relative to each other. The number of forest patches, and the mean and standard deviation of forest patch size collectively describe the total area and the size distribution of forest patches within the landscape (Soverel et al., 2010). While not spatially explicit (McGarigal and Marks, 1995), edge density can also measure landscape fragmentation (Li et al., 2005).
To measure the degree of anthropogenic disturbance, we used the Global Forest Watch Canada’s Landsat (30 m spatial resolution) derived combined anthropogenic change mapping data sets (1990–2001) for areas within the provinces of Nova Scotia (Cheng and Lee, 2009), Saskatchewan and Manitoba (Stanojevic et al., 2006a), Ontario (Cheng and Lee, 2008), Québec (Stanojevic et al., 2006b) and British Columbia (Lee and Gysbers, 2008). These data sets do not cover the entire Canadian boreal, but are, to the best of our knowledge, the most complete data set of its kind for the study area. They identify and buffer human-caused forest disturbances such as road construction, agriculture, reservoirs and clearcuts. We converted them from their native polygon form to a 1 km spatial resolution raster. The anthropogenic index for each cell measures the percentage area impacted by human activity.
f Species data
To quantify species richness we used NatureServe’s digital distribution maps of mammals (Patterson et al., 2007) and birds (Ridgely et al., 2007) of the western hemisphere (version 3.0, available at http://www.natureserve.org/getData/animalData.jsp). These maps are primarily based on the US Defense Mapping Agency’s (DMA) Digital Chart of the world basemap (1:1,000,000 scale) and were derived from data sources dating from the years 1980–2000 (birds) and 1981–1999 (mammals). Mammal and bird species richness ranged from 0 to 16 and 0 to 91, respectively, per 1 km2 cell. US Geological Survey range maps of tree species in North America (http://esp.cr.usgs.gov/data/atlas/little; Little, 1999), derived from source maps at 1:10,000,000 scale, were used to produce a tree species richness layer that ranged in value from 0 to 45. Lastly, we created a butterfly species richness layer (5 km spatial resolution) using the butterfly specimen and observation data set provided by the Canadian Biodiversity Information Facility (http://www.cbif.gc.ca). We summed the species maps to produce a species richness layer for each taxon, which was then intersected with the study area grid.
2 Statistical analysis and classification (cluster analysis)
a Cluster analysis
Before classifying the boreal into clusters, we first examined the data distributions of the indicators (UMD land cover, wetland, ruggedness, vegetation production (DHI), spring snow cover and autumn snow cover) and assessed their correlation structure. We converted the indicator values within the 4,604,910 1 km cells into a table format and classified using a clustering procedure in PASW (Predictive Analytics SoftWare) Statistics 18 software (SPSS, Inc., 2010, Chicago, IL, version 18.0.2). A cluster is a set of cells or locations, not necessarily spatially contiguous, which share a range of distinct environmental conditions as described by the indicator variables. The clustering procedure imposes a tradeoff between precision and generality, which is determined largely by the number of clusters generated. Forming too few clusters means that considerably different kinds of environments are not distinguished. Forming too many clusters makes it difficult to identify trends or to describe environmental uniqueness in a useful way. We adopted the two-step approach of Zhang et al. (1996), which is able to handle large data sets with both continuous and categorical variables.
The first step involves a pre-clustering of the cells into a manageable number of subclusters, in this case 100. Pre-clustering is achieved through PASW’s two-step cluster classification, a sequential clustering approach that evaluates cells in sequence and uses a distance or nearness criterion to determine if the cell should be joined with the previous cluster or initiate a new cluster of its own (SPSS, 2001). Once we had the initial pre-clusters, we applied a hierarchical cluster classification in PASW to recursively merge the pre-clusters. We selected 15 clusters as they represent a level of organizational detail useful for aiding large area conservation planning within the boreal and commensurate with the 15 expert derived terrestrial ecozones regularly used in Canada (see Coops et al., 2009c, for further justification).
Both the cluster analysis steps use a distance measure as a criteria for merging clusters; we used the log-likelihood distance option provided in the PASW Statistics 18 software, which is compatible with continuous and categorical (i.e. land-cover) variables. The contribution of each indicator to the prediction of the cluster group membership was examined using a stepwise discriminant analysis in the Statistica software (StatSoft, Inc., Tulsa, OK, version 7.1). The final clusters were brought into a geographic information system (GIS) format for visualization and analysis via linking the classified table to the 4,604,910 1 km cells.
b Stepwise regression
Using a sample consisting of the mean pre-clusters values (n = 100), we also verified that the indicators were potentially useful predictors of species richness. Specifically, we constructed four multiple regression models in R software (R Development Core Team, 2010, Vienna, Austria, version 2.12.1) using mean pre-cluster species richness values as response variables and the mean pre-cluster environmental indicators values as independent variables. We first assessed environmental indicators (land cover, topography, vegetation production (DHI) and snow cover) with a Pearson’s correlation analysis before including any in the models, with precedence given to uncorrelated indicators with, as defined in the discriminant analysis, the largest contribution to the clusters’ creation. Models were constricted by forward stepwise selection, which sequentially adds predictive variables until the Akaike information criterion (AIC), a measure of relative model fit, is minimized (Akaike, 1974). To accomplish this, the ‘lm’ linear model and ‘stepAIC’ functions in the R software were used. We inferred relative importance of the selected covariate from their order of selection and their contribution to the models’ coefficient of determination (R2).
V Results
1 Correlations between indicators
Pearson’s correlation analysis of the environmental indicators (1 km spatial resolution) confirms the presence of some collinearity, indicating a certain amount of redundant information (Table 4). The highest correlations were 0.82 between annual primary productivity and seasonal greenness and 0.73 between the spring and autumn seasonal snow cover. These fell below the correlation threshold of 0.90 proposed by Kaufman and Rousseeuw (2005) and by Mooi and Sarstedt (2011), who indicated that variables correlated above this point are problematic and may be over-represented in the classification. Below this level, correlated pairs of indicators contain useful information for the cluster analysis; thus, none of the indicators were removed. The majority of pairwise correlations were significant (p<0.05), but of low magnitude, whether positive or negative. There were few cases of significant (p<0.05) moderately strong positive or negative correlations between seasonal snow-cover and productivity indicators (e.g. 0.45 to 0.60).
Pearson’s correlation analysis of environmental indicators.
2 Cluster analysis
To determine the relative importance of each indicator for discriminating between the final 15 clusters discriminant analysis was used with results shown in Table 5. Seasonal greenness proved to be the most important indicator, followed by the wetland and ruggedness indicators. Annual primary productivity was of least importance.
Discriminant analysis of cluster input variables.
The clusters are mapped in Figure 2 and primarily span from east to west, with a pronounced latitudinal gradient consistent with the relative levels of certain climate-related indices (Table 6). That is, most clusters capture the longitudinal similarity in landscape conditions as expected from energy, climate and vegetation drivers. There are two large clusters (7, 11) that occupy the middle of the boreal forest. Cluster 11 corresponds to the extensive area of wetlands known as the Hudson Bay Lowlands and contains the largest wetland component (75.6%) and is dominated by the ‘open shrub’ MODIS class. Cluster 7 is the largest cluster, spanning the entire length of the boreal and covering 20.9% of the study area. It is moderately productive, moderately seasonal and dominated by the ‘evergreen needleleaf forest’ class.

Spatial distribution of 15 environmental clusters within the study area, encompassing the Canadian boreal forest as defined by Brandt (2009). (See colour version of this figure online).
Description of the 15 clusters and the relative indicator ranking. Rankings were derived from mean indicator values per cluster and defined by the natural breaks (jenks) classification scheme.
The southernmost clusters (1–6, 10 and 12) are characterized by high productivity and lower seasonality. Their land cover is dominated by a combination of the ‘evergreen needleleaf’ and ‘mixed’ forest classes. The most distinct of these (cluster 3) is found in the Atlantic maritime and is defined by highly productive evergreen and mixed forest on drier, highly variable terrain, as one would expect of a largely managed forest land base (Wulder et al., 2008a). Both clusters 1 and 12 contain a large wetland component of 70.9% and 67.4%, respectively.
The northernmost clusters (8, 9, 13, 14 and 15) have low productivity and high seasonality. Their vegetation is dominated by the ‘open shrubland’ class. Cluster 14 represents the largest northern cluster (16.3% of the total study area). It spans the northern limit of the boreal forest, except where interrupted by Hudson’s Bay and the adjoining lowlands. Clusters 13 and 15 are distinguished by a relatively large annual minimum cover and wetland components respectively. Cluster 8 is distinct in that it mostly corresponds with alpine areas above the tree limit within the MacKenzie mountain range, and some apparent areas of tundra.
3 Cluster attribution with supplementary indicators
Table 7 gives a summary of the fragmentation and anthropogenic change present within each of the 15 clusters. The Canadian boreal is predominantly forested (Brandt, 2009; Wulder et al., 2008a); thus, clusters typically had a large forest cover component. In this case, there was a mean and maximum forest cover of 63% and 91%, respectively. However, there were five clusters (8, 9, 13, 14 and 15) with less than 49% forest cover. These occur in the northernmost part of the boreal and are dominated by the ‘open-shrub’ class. All five of these clusters had a high level of forest fragmentation typified by small and few forest patches with a high edge density. This result is indicative of the presence of isolated patches of forest within a matrix of shrubs and tundra vegetation as is found in these regions. With the exception of cluster 8, which primarily occupies the alpine, the fragmentation present in these areas is related to a lessened climatic suitability for forests with increasing latitude and the concurrent presence of wetlands and lakes (Wulder et al., 2008b, 2011). Thus, this fragmentation of forest is a natural state of these northern areas.
Description of the 15 clusters with anthropogenic change and forest fragmentation indicator mean values.
Only six clusters experienced an average anthropogenic change value greater than 10%. The largest of these are clusters 2 and 3, with mean anthropogenic footprint of 57.5% and 53.9%, respectively. Interestingly, areas with large anthropogenic change do not appear to coincide with areas that have high forest fragmentation. The southern cluster areas all have a large forest component that ranges from 54 to 91.1% forest cover.
The highest species richness was found in the southernmost clusters (Table 8). Here clusters 2, 6, 7 and 13 had the highest species richness averages for butterflies (0.18 species), trees (22.9 species), mammals (12.4 species) and birds (71.8 species), respectively. Conversely, northern clusters possess less species richness. For example, the northernmost cluster (13), which is situated near the northern tree limit or northern boreal extent, had the lowest tree species richness average (7.5 species).
Mean species richness per cluster.
4 Indicators as predictors of species richness
Based on the variables selected as important by the discriminant analysis and with low intercorrelation, three indicators (spring snow cover, wetland and annual minimum cover) were selected for assessing the relationship with biodiversity (tree, mammal, bird and butterfly). These models were highly explanatory for tree and bird species richness, and were moderately explanatory for butterfly species richness, but explained relatively little of the variance in mammal species richness (Table 9). The relative importance of the variables differed among taxa; however, spring snow cover was important in all models of all four taxa. In the butterfly model, one case fell outside of the ±3 times sigma (i.e. standard deviation of the residual) limit and was deemed an outlier and removed. Normal probability plots of all four models indicated that the relationships were approximately linear, and suggest a normal distribution of the residuals (Figure 3). However, there were more errors observed in the models explaining butterfly and mammal species richness.

Normal probability plots of residuals for each multiple regression model: (A) bird, (B) butterfly, (C) mammal and (D) forest.
Summary of stepwise regression–cluster input indicators and species richness (N=100).
Tree species richness was mostly explained by spring snow cover (Table 9). Spring snow cover accounts for 89.3% of the variance in tree species richness, and the inclusion of wetland and annual minimum cover resulted in an additional 1.8% and 1.0% variance, explained respectively.
Butterfly species richness was explained by spring snow cover and wetland. Spring snow cover accounts for 41.2% of the variance, with wetland explaining an additional 20.2% of the variance. Mammal and bird species richness were only explained by spring snow cover (Table 9).
In terms of the total variance explained for each taxon, the remotely derived indicators explained the most of the variance in the tree (92.6%), bird (84.0%) and butterfly (61.4%) taxa. A lesser proportion (22.6%) of mammal species richness was also explained. Overall, the stepwise multiple regression analysis indicated that the spring snow cover explained the most variance within each species richness type. Prediction of species richness was not markedly improved by including the annual minimum cover indicator.
VI Discussion
In this research we utilized a variety of freely available broad-scaled (1 km) remotely derived pan-boreal indicators for the characterization and monitoring of biodiversity within the Canadian boreal. Though choosing specific indicators for assessing biodiversity can be considered subjective, our selection of indicators was guided by many past studies (e.g. Duro et al., 2007). Our results indicate that metrics of seasonality, such as the spring snow cover, explained much of the variance in the species richness of three taxa of boreal flora (tree species) and fauna (birds and butterflies). This result is supported by others such as Hurlbert and Haskell (2003) who examined seasonal bird species richness across North America and remote sensing derived production (normalized difference vegetation index – NDVI) at different spatial (≤20,000 to 80,000 km2 grid cells) and temporal (seasonal) scales. Their findings show that seasonal NDVI in conjunction with habitat heterogeneity information (i.e. elevation relief) could explain the majority of bird species richness (69%), approximately 15% less than this study’s seasonal snow cover. They were also able to capture the seasonal dynamics of migrant species and demonstrate its importance for determining their numbers and proportions within breeding communities. This relation is particularly true in northern latitudes and in the boreal regions, where the migration of breeding species is dictated by the seasonal variation in available energy. For example, Ivits et al. (2011) demonstrated that remotely sensed total biomass, a measure of seasonal vegetation change over an area, is strongly correlated to species breeding in northern Europe and boreal regions located in Finland, Russia, Sweden and Norway. As a result, we conclude that indicators of seasonal snow cover may be indicative of the annual dynamics of a landscape and provide insight into, for example, the production of food availability, which will not be the same between winter and summer.
Hawkins and Porter (2003) identified potential evapotranspiration, a measure of current climate or energy input, as the strongest predictor of mammal and bird species richness in Canada, explaining 76% and 82% of the variance, respectively. Like the other taxa, spring seasonal snow cover explained the most variance for mammal species richness, but in the case of mammals it was relatively lower at 22.67%. Considering the variety of boreal mammalian species types – e.g. Wolverine (Gulo gulo), Northern River Otter, Caribou (Rangifer tarandus), Gray Wolf (Canis lupus) – used to quantify the species richness variable and their diverse life histories and associated behaviours, it is likely that the importance of spring snow cover as a habitat feature varies greatly. As a result, while spring snow cover may be a useful predictor of certain species – e.g. Gray Wolf (Canis lupus) and Wolverine (Gulo gulo) – it may be less effective for mammalian species richness overall. In the case of butterflies, Kerr et al. (2001) were able to explain more variation for species richness in Canada than this study (>90% versus 61.49%). This difference could be explained by the sampling density and number of butterfly records within the boreal forest compared to Canada as a whole. Highest species richness for butterflies occurs in the very south of Canada, away from the boreal study area and, as a result, the lower sampling densities within the boreal may reduce the predictive capacity of the regression. In addition, this study did not incorporate landscape heterogeneity, the variable identified as the strongest predictor of butterfly species richness for Canada (Kerr et al., 2001).
We applied a quantitative regionalization approach (i.e. cluster analysis) in conjunction with key remotely sensed pan-boreal indicators to delineate the boreal forest into 15 clusters. An important advantage of this quantitative approach is that it minimizes the internal environmental variability within the cluster groupings. Consequently, this approach can result in a robust and consistent stratification for monitoring and has been shown to produce more representative ecosystem descriptions than qualitative delineation approaches (Leathwick et al., 2003). Since a quantitative cluster analysis forms nested or hierarchical clusters, it can also be flexibly used to produce classes at different levels of detail, depending on the application requirements, while maintaining functional continuity among levels (Leathwick et al., 2003; Lugo et al., 1999). To date, many national/continental quantitative regionalization studies have demonstrated the potential of such approaches for addressing a host of conservation planning problems over a range of diverse environments, for example in Australia (Mackey et al., 2008), New Zealand (Leathwick et al., 2003), Europe (Metzger et al., 2005) and Canada (Coops et al., 2009c). Within the boreal forest, the clusters defined within the paper could form a basis for an appropriate stratification, highlighting areas of unique conservation value, and ultimately be used to help identify any deficiencies in current park networks.
When we specify the cluster analysis to produce 15 clusters, it resulted in clusters of varied size, averaging around 6.6% of the Canadian boreal, with each cluster relating to a distinct set of environmental conditions. Seasonal greenness and the wetland indicators were the most important for discriminating between cluster groups. The spatial delineation of the clusters reflects a latitudinal gradient. Accordingly, seasonal greenness is highly related to climate conditions, meaning cluster boundaries are also defined largely in part by north to south vegetation productivity gradients. Likewise, there were areas within Canada’s boreal that contain a large wetland component, which, with its distinctive attributes, explains why these areas were differentiated. The similarities between the clusters generated in this study and the 14 Canada-wide clusters generated by Coops et al. (2009c) are visually apparent. Both exhibit a latitudinal gradient and span from east to west and the northern clusters have a similar size and distribution. Apart from the different extents, there are a greater number of clusters from this study located in the southern boreal and the Hudson’s Bay region. The additional clusters generally nest within the broader clusters of Coops et al. (2009c).
Clusters located in the northern reaches of the Canadian boreal are sparsely forested, leading to higher levels of forest fragmentation than found nationally. Similar findings were also noted by Wulder et al. (2008b), who, when employing the same landscape pattern (fragmentation) metrics, found that sparsely forested ecozones (Taiga Shield, Taiga Cordillera) were characterized by numerous small forest patches juxtaposed with low vegetation, wetlands and lakes. For those areas where data was available, the majority of anthropogenic activity, such as timber harvesting and road construction, occurs in the southern boreal regions. However, these areas are also characterized as having a large forest cover component, which acts to offset the relatively small (spatially) anthropogenic disturbances. This finding suggests that anthropogenic disturbances may not be effectively generalized within large spatial clusters, but rather should be reported in smaller spatial units or evaluated in regional analyses (Wulder et al., 2008b).
VII Conclusion
Conventional field-based approaches used for assessing biodiversity, while important, are spatially confined and cannot be reasonably conducted over large areas (e.g. national level). Where biological data is absent or incomplete, remote sensing techniques have the capacity to provide valuable spatially exhaustive information on certain aspects of biodiversity. We assessed a suite of remotely derived environmental indicators for characterizing biodiversity within Canada’s boreal and demonstrated the applicability of a boreal-wide quantitative environmental classification or clustering approach. Of the three indicators tested, seasonality, as defined by spring snow cover, was shown to explain the most variance of species richness (bird, tree and butterfly) and, therefore, indicates that it could be a key indicator of biodiversity. We found that the 15 clusters generated from the cluster analysis were representative of a range of environmentally distinct conditions, and that seasonal greenness, along with wetland land cover, were the most important indicators for differentiating between the clusters. The addition of forest fragmentation indicators provided useful information for describing the forest extent (composition) and spatial characteristics (configuration) of the clusters; however, anthropogenic disturbance was not associated with higher levels of fragmentation. We conclude that remotely derived indicators in conjunction with the quantitative cluster analysis and attribution, with both improved interpretability and added information content, have the capacity to provide useful insights for conservation planning in the Canadian boreal forest.
Footnotes
Acknowledgements
This research was undertaken as part of the ‘BioSpace: Biodiversity monitoring with Earth Observation data’ project jointly funded by the Ivey Foundation, the Nature Conservancy of Canada, Canadian Space Agency (CSA) Government Related Initiatives Program (GRIP), Canadian Forest Service (CFS) Pacific Forestry Centre (PFC) and the University of British Columbia (UBC).
