Abstract
Forest Explorer is a web tool that can be used to easily browse the contents of the Cross-Forest dataset, a Linked Open Data resource containing the forestry inventory and land cover map from Spain. The tool is purposed for domain experts and lay users to facilitate the exploration of forestry data. Since these two groups are not knowledgable on Semantic Web, the user interface is designed to hide the complexity of RDF, OWL or SPARQL. An interactive map is provided for this purpose, allowing users to navigate to the area of interest and presenting forestry data with different levels of detail according to the zoom level. Forest Explorer offers different filter controls and is localized to English and Spanish. All the data is retrieved from the Cross-Forest and DBpedia endpoints through the Data manager. This component feeds the different Feature managers with the data needed to be displayed in the map. The Data manager uses a reduced set of SPARQL templates to accommodate any data request of the Feature managers. Caching and smart geographic querying are employed to limit data exchanges with the endpoint. A live version of the tool is freely available for everybody that wants to try it – any device with a modern browser should be sufficient to test it. Since December 2019, more than 3,200 users have employed Forest Explorer and it has appeared 12 times in the Spanish media. Results from a user study with 28 participants (mainly domain experts) show that Forest Explorer can be used to easily navigate the contents of the Cross-Forest dataset. No important limitations were found, only feature requests such as the integration of new datasets from other countries that are part of our future work.
Introduction
Forest science and forestry rely on the use of large-scale datasets that cover lengthy periods of time [21]. This is because trees are long-lived organisms that require continuous monitoring to obtain sound and accurate information. Similarly, large-scale monitoring systems are used to capture the complexity and structure of forests in their full extend. Permanent and extensive data recording systems, particularly land cover maps and forest inventories [31], are needed to implement sound sustainable forest management [3,24,25] to ensure a constant flow of ecosystem services such as habitat conservation and raw materials (timber, resin, cork…).
Due to long-term scale of forestry actions, the private sector has no incentives to conduct this type of data collection and curation. As a result, the public sector is the main responsible for monitoring forests and providing this information to society. Such information is consumed for different purposes by diverse end users, including forest stakeholders (governments at different levels, environmental NGOs and other lobby groups), operational foresters, data and environmental journalists, interested citizens, and start-up promoters.
However, exploiting forest inventories and land cover maps is a non-trivial task that requires both domain expertise and technical skills. Forestry datasets are typically isolated, described with disparate data schemas, and using unfamiliar (sometimes even proprietary) formats. Given these limitations, Linked Open Data and Semantic Web technologies can help to facilitate the integration and accessibility of forestry data. Towards this goal, the EU project Cross-Forest1
The Cross-Forest dataset is available as Linked Data through the SPARQL endpoint at
In this paper we present Forest Explorer, a web tool for easily accessing the contents of the Cross-Forest dataset. Forest Explorer is designed for non-Semantic Web experts, so the user interface is based on an interactive map that completely hides the manipulation of RDF data and exchange of SPARQL queries behind the scenes. The rest of the paper is organized as follows: Section 2 reviews exploration tools of geospatial Linked Data, as well as existing approaches for visualizing forestry data. Section 3 presents the functional requirements of Forest Explorer. Section 4 describes the design and implementation of the tool. In Section 5 we give evidence of the impact of Forest Explorer, including the results of a user study. The paper ends with a discussion and future work lines in Section 6.
Since the earlier years of the Semantic Web, the challenge of exploring RDF data was evident [13]. Preliminary projects in this area were targeted to either Semantic Web experts or technology enthusiasts willing to put in the time to learn. More recent works [9] have stressed the importance of supporting both lay users and domain experts, e.g forest managers. Such users may not have any knowledge of SPARQL, OWL, or RDF and require appropriate tools to work with Linked Open Data.
There are some Linked Data browsers, visual query builders, and exploration tools that do not require expert knowledge of Semantic Web technologies [8,9,14]. A recent example is RDF Surveyor [33], a lightweight exploration tool targeted to lay users that is part of our previous work. Unfortunately, most of the exploration tools available have limited support for geospatial Linked Data if any, e.g. the visualization of an entity in RDF Surveyor includes a geo widget with a marker if a point location annotated with the Basic Geo Vocabulary2
Some systems have been designed to support the visualization of geospatial Linked Data. Again, many proposals are targeted to Semantic Web experts: GeoYASGUI [1] is a SPARQL editor that natively supports GeoSPARQL and provides a result set visualizer; Sgvizler [28] is a JavaScript library that can produce different charts – including maps – with the results of SPARQL queries; and Sextant [18] is an advanced visualization tool that can combine spatial data from several endpoints and represent the temporal dimension, although it still requires knowledge of SPARQL in order to use it. There are seldom visualizers of geospatial Linked Data for non-Semantic Web experts: LinkedGeoData browser [29] is a dedicated visualization tool for OpenStreetMap; and Map4RDF [10] is a browsing tool of geospatial RDF datasets that uses a faceted interface to control the information to display.
In the forestry domain, we can find several initiatives at national or regional level focused on the delivery of raw data, but not using Linked Data and Semantic Web technologies. A remarkable example at pan-European level is the EFISCEN database portal [27] that offers forestry datasets from 32 countries, but using disparate schemata and data formats. National data portals are common, although the typical case is downloadable raw data in proprietary format (e.g. Spanish forest inventory3
Visualization of forestry data is commonly achieved through a Geographical Information System (GIS). For example, Global Forest Watch5
Other forestry tools focus on the analysis of data. This is the case of BASIFOR7
The Spanish forest inventory [2] is a continuous dataset that is updated every 10 years. A plot is an homogeneous and small area of the territory that constitutes the sampling unit. Forest technicians survey plots to gather tree data (location, species, diameter, and height). The current version of this inventory accrues 1.4M trees, 91.9K plots, and 4.3M positions. The Spanish land cover map8
Feature types and feature data of the Cross-Forest dataset
The Cross-Forest dataset integrates the latest versions of the Spanish forest inventory and land cover map in Linked Data format, thus fulfilling one of the main goals of the Cross-Forest project. Provinces, patches, plots and trees are modeled as spatial features with a geometry and a set of measures extracted from the corresponding source. The details of the main features of each dataset are shown in Table 1. Their positions are represented using a simple ontology that indicates the Coordinate Reference System and the coordinates of the position. This ontology makes safe reuse [7] of relevant geographical ontologies, including GeoSPARQL [20], the W3C Basic Geo Vocabulary [5], and the ISA Programme Location Core Vocabulary [19].
The original datasets includes features with absolute positions using UTM coordinates, but some features, namely trees, are only annotated with relative positions, using plot centers as reference. When making the conversion into RDF we have enriched the original data with the inclusion of WGS84 positions for every feature, including trees. UTM coordinates and relative positions are still available in the Linked Open Data version of the datasets, so there is no impact for users of the original data sources. We used Proj4js9
The transformation of the land cover map into RDF includes the creation of a new layer of patches in a lower resolution in order to make efficient the drawing of polygons on top of a Web map (see Best Practice 4 in [30]). The taxonomical structure of trees and bushes (including class, genus, family, and species) is linked to relevant datasets in forestry and biology fields: the NCBI taxonomy [12] and the CrossNature dataset.11
Statistics of the Cross-Forest dataset
We identified a target group interested in the resulting integrated dataset: domain experts such as forest managers and operational foresters that have to invest a lot of time with manual integrations or using forest analysis tools like BASIFOR [4] due to strong coupling to schemata, legacy formats, or limited tool scope. For instance, [23] reports a significant effort integrating the data of three provinces of the Spanish forest inventory (this dataset is sliced in 50 files, one per province) – note that this is a simple integration case as the database schema is the same for the three provinces and no other data sources were required. Moreover, analyzing forestry data from different countries is a time-consuming and difficult task, requiring schema harmonization and data conversions. Another target group corresponds to lay users such as data journalists or citizens interested in forestry. These two groups are not knowledgeable about Semantic Web technologies, so they need a tool that shows the data while hiding the complexities of its representation. As a result of a collaborative design effort among two forestry experts and two Semantic Web practitioners (the authors of this paper), we identified the following requirements:
In this section we present the tool devised to easily access the contents of the Cross-Forest dataset. The proposed tool is Forest Explorer and satisfies the requirements described in Section 3. The following subsections dive deep into the design and implementation of Forest Explorer.
Logical architecture
The logical architecture of Forest Explorer is depicted in Fig. 1. The Map generator is in charge of displaying the view for the end user. This component employs a base map obtained from the Map server and listens to the requests made from the different Feature managers for showing markers, polygons, popups, or tooltips on top of the map. More specifically, the Province manager gathers province geometries and forest inventory data aggregated by province and prepares a suitable display request to the Map generator. Patch, Plot, and Tree managers work in a similar way, obtaining first the information about the corresponding features (see Table 1) located in the geographical bounds of the current map view, and then sending display requests to the Map generator.

Logical architecture of Forest Explorer.
In order to comply with requirement R4, the different Feature managers are activated depending on the zoom level. In case of large areas, the user can choose between province or patch views (R5). In the former case, the Province manager takes control and requests the presentation of inventory data aggregated by provinces. In the latter case, the Patch manager is activated and requests the lowest resolution layer available of the land cover map. With intermediate zoom levels, the Patch and Plot managers are both enabled to present the plots on top of a land cover map of the visible area. With high zoom levels, the Patch and Tree managers are activated – the highest resolution layer available of the land cover map is employed in this case, while the Tree manager requests the presentation of tree markers to the Map generator based on the inventory data for the target area. Note that this design is compliant with Best Practice 4 in [30], recommending to provide multiple resolution versions of features at different zoom levels.
The Data manager handles all the data requests from the Feature managers. This component is able to communicate via SPARQL and is configured to use the Cross-Forest and DBpedia endpoints. As described in Section 3, the Cross-Forest dataset contains a Linked Open Data version of the Spanish forest inventory and land cover map. DBpedia is employed as a source of tree species information, providing images and multilingual descriptions (R6). Upon receipt of a request, the Data manager first checks if the result is already available in the Data cache. In case of a miss, the Data manager sends one or more SPARQL queries to the endpoints. Section 4.2 gives further details of the functioning of the Data manager.
As depicted in Section 4.1, the Data manager is in charge of all data gathering operations in Forest Explorer. The design of this component is challenging due to a number of reasons: (1) requests look quite varied, referring to different types of features (provinces, patches, plots, and trees) and all their associated data; (2) the size of the Cross-Forest dataset is not small – 4.3GB corresponding to 73.7M triples; and (3) requests can be very numerous since any change in the interactive map will trigger data requests by one or more Feature managers.
Fortunately, the different request types can be abstracted to the identification of features localized in a specific area and then obtaining their corresponding feature data. The Data manager uses SPARQL template queries for retrieving the features localized in the map view. Since plots and trees have points as geometries, a suitable query basically has to check which points are included in a bounding box. Listing 1 shows the template used for plots in which

SPARQL query template for retrieving plots in the map view
As patches and provinces have polygons as geometries, queries have to be adapted to find the polygons intersecting with the map view. The template used for patches is included in Listing 2. Note that we use the bounding box of the polygon (included for all polygons in the dataset) to simplify the detection of patches in the map view. In order to comply with requirement R4, the template is parametrized to select a specific
The assumption here is that patches of less than 10 pixels are barely visible so it is better to not waste time and computation resources with them. The area of a pixel changes with the zoom level, so a patch discarded for being too small can be later displayed in case of zooming in.

SPARQL query template for retrieving patches in the map view
Once the Data manager obtains the set of features in the area of interest, it retrieves the corresponding data in a next step. Listing 3 shows the query template used for this purpose; it is extremely generic and easily adaptable to each feature type. For example, obtaining the height of a collection of trees just requires setting their IRIs and the height property IRI defined in the Cross-Forest ontology. As a result, gathering feature data is just a matter of selecting the set of properties to extract for each feature type (see Table 1). Class membership is required in some cases – for example to obtain tree species – so we have included another generic template for retrieving the classes of a set of individuals.

SPARQL query template for obtaining the values of a property for a set of individuals

The Data manager limits query exchanges with the endpoint by keeping track of the map regions with localized features. The map is initially unknown to the Data manager (a), so a feature location request for region
The Data manager stores all retrieved data in a Cache so as to reduce exchanges with the endpoint. This is quite effective with feature data, since the template query in Listing 3 can be easily parametrized to avoid requests of feature data already available in the Cache. With respect to feature locations, a similar approach consists on caching the results for every map view bounds request, i.e. queries built with Listing 1 for plots, Listing 2 for patches, and so on. However, this solution is too naive as the cache will only get a hit in case of a request with exactly the same map boundaries as a previous one. Instead, the Data manager keeps track of the map regions with localized features and exploits this information to restrict queries to unknown areas. Figure 2 illustrates the idea:
At the beginning the Data manager has no information of any part of the map.
Upon receiving a feature location request for region
A request about region
A request about region
In order to comply with requirement R3, the user interface of Forest Explorer exposes an interactive map as its main component. Figure 3 shows some snapshots that illustrate the design of the user interface. Since the tool has to work with a diversity of devices (R1), the map is set in fullscreen mode to easily adapt to different screen sizes. Similar to other map applications, panning and zooming are naturally supported for both point-and-click and touch interfaces. In this regard, zoom buttons are included in the lower-right corner; the extra button with a person icon is used to navigate to the user location – this latter functionality is especially useful when employing Forest Explorer with a mobile device in a field trip.

Snapshots of the user interface of Forest Explorer. (a) View of the Spanish provinces in a large area (see the map scale in the lower-right corner) with the form in the upper-left part expanded and showing two species filters, Pinus sylvestris in indigo color and Pinus pinaster in brown; inventory tree data for the province of Soria is displayed in a tooltip. (b) View of the patches in a large area of Spain with the form in the upper-left part collapsed; patches are plotted in different colors depending on their use (farms in orange, water in blue, artificial in grey, and forests in green); forest patches containing a filtered species use the color filter (indigo and brown in this running example); a pop-up shows the data of a forest patch in the province of Soria. (c) View of a small forest area (see the map scale in the lower-right corner) in the province of Soria; plots are displayed as circles on top of the patches; plots and patches employ the same color code as before; a tooltip shows inventory data for a plot. (d) View of a tiny small forest area corresponding to the center of a plot in Soria; tree markers are shown in their actual positions using taxa-dependent icons and corresponding filter colors; a tooltip shows the species, height, and diameter of a specific tree.
As the user navigates with the map, a Feature manager takes control by obtaining feature information from the Data manager and then sending display requests to the Map generator (see Section 4.1). For example, the Province manager controls the view of Fig. 3(a); the Patch manager is activated in Fig. 3(b); the Patch and the Plot managers collaborate to build the view in Fig. 3(c); and the Tree manager is active in Fig. 3(d).
Beyond map navigation, the user interface needs to include controls to allow the user to adjust the information to display (R5). For this purpose, Forest Explorer includes a form in the upper-left part – see Fig. 3(a). This form can take a significant part of the screen in mobile devices, so it can be collapsed by pushing the ‘-’ button – the collapsed form is shown in Fig. 3(b)(c)(d).
A key characteristic of the dataset is the tree species of land cover map and forestry inventory. Thus, the form includes a ‘Filter species’ button that can be pushed to easily browse the taxonomy of tree species (obtained from the Cross-Forest ontology) and then select one or more filters. The form in Fig. 3(a) includes two species filters, Pinus sylvestris in indigo color and Pinus pinaster in brown. Each filter includes buttons for removal (‘x’ icon), color change (‘tint’ icon), and more info (‘info’ icon) – the latter one displays a popup with additional information about a tree species such as an image and a localized description obtained from DBpedia. Species filters have a significant impact in the map view: the color code of species filters is applied to the different features, while species data is also displayed in the different tooltips and popovers – see the snapshots in Fig. 3.
In addition, the form includes a textbox for searching places (obtained from the GeoNames dataset14
Forest Explorer is coded in JavaScript to facilitate its deployment as a web application. As a result, the tool can be used in any device with a modern web browser.15
We have tested Forest Explorer with the latest versions of Mozilla Firefox and Google Chrome in a variety of mobile phones, tablets, and desktop computers.
The code is organized in several files that reflect the logical architecture in Fig. 1. The implementation effort has been considerably reduced by the integration of a number of JavaScript libraries. We use Leaflet16
The form of the user interface is built with Bootstrap20
As for the Data manager component, all the queries are included in a separate file using templating parameters as necessary, e.g. Listings 1, 2, and 3. We also use a mapping file that assigns keys to ontology IRIs; the Data manager only references the keys and this level of indirection decouples the implementation from the Cross-Forest ontology as a result.
Since the tool needs to be localized to English and Spanish (requirement R6), the Cross-Forest dataset is localized to these languages and all the labels employed in the user interface are included in a multilingual strings file. When the tool starts, it gets the browser language preferences in order to select the session language that is then applied to every text shown.
The source code of Forest Explorer is available on GitHub.25
We submitted a preliminary prototype of Forest Explorer to Desafío Aporta 2019,27
http://www.diariodevalladolid.es/noticias/innovadores/arboles-golpe-clic_170743.html and http://l.e.eleconomista.es/rts/go2.aspx?h=260137&tp=i-H43-Dc-6T8-FyPmJ-1c-1gVE-1c-FyOR8-1NwpSj
In January 2020 we used our social networks to announce the release of Forest Explorer. Specifically, we sent 8 tweets in Twitter that received over 6,500 impressions, more than 100 likes, and about 30 retweets. We also prepared three short posts in Facebook and one more in Twitter, receiving over 1,000 and 375 impressions, respectively.
Beyond media coverage and social networks, we used Google Analytics to assess the uptake of Forest Explorer. Tracked data was obtained from the live site in October 2020. About usage, more than 3,200 users have accessed the tool and have carried out over 5,500 sessions with an average session time of 4 minutes and 11 seconds. 85% of the traffic comes from Spain and the preferred language was Spanish in 82% of the sessions – this is not surprising, as the dataset only covers Spanish forests. Google Analytics gives also information about the employed devices: 43% were Windows computers, 38% Android mobiles or tablets, 12% iOS devices, 6% Mac computers, and 1% Linux computers.
We have carried out a user study of Forest Explorer that was promoted by the Cross-Forest consortium – we participated in the preparation of the questionnaire, but we did not reach potential respondents. The invitation to take part in this user study included a link to Forest Explorer to test it, as well as the link to the questionnaire. This was divided in four parts: (1) User profile, (2) Dataset assessment, (3) Usability through the standardized System Usability Score (SUS) [6], and (4) User feedback.
Profiles of the participants in the user study (first part of the questionnaire)
Profiles of the participants in the user study (first part of the questionnaire)
It was possible to select multiple options to this question; 86% of the participants chose one of the three options given and 14% marked two.
28 participants have answered the questionnaire so far – information about their user profiles is summarized in Table 3. Note that most of them can be classified as forestry domain experts (93%), while the remaining 7% correspond to the user group of lay users. Moving to the data assessment section, participants rated the usefulness of the data exposed with an average figure of 3.9 and a standard deviation of 0.8 in a scale from 1 (not useful) to 5 (very useful). They were also asked about the understandability of the data exposed (from 1, not understandable, to 5, very understandable): the average was 3.9 and the standard deviation 0.9. Participants were also asked to optionally propose new datasets to be integrated: four suggested the inclusion of orthophotos as base map; three proposed the inclusion of additional forestry data (combustibility, biomass, carbon sequestration); two proposed the inclusion of altimetry data; and another proposed the inclusion of previous editions of forest inventories.
Regarding usability, the computed SUS score was 75 in average with a standard deviation of 16. This figure is good, given that SUS scores range from 0 to 100. According to the grading scale interpretation of SUS scores in [26, ch. 8], Forest Explorer was graded with a B. Interestingly, the group of Spanish respondents gave higher scores (81 in average and an A in the aforementioned grading scale) than the participants from other countries, mainly Portugal, with a SUS score of 69 in average.
The fourth part of the questionnaire began with a question about the likeliness of recommending Forest Explorer to a friend or colleague (from 1, not likely at all, to 5, very likely); the average was 4.0 with a standard deviation of 0.9. This block also contained two open-ended questions about what liked most and liked least of Forest Explorer. We analyzed the results by identifying several themes and categorizing the answers. The main findings are included in Table 4 with their level of support and a sample comment for each finding – note that we only include a point if supported by at least two participants, so as to avoid irrelevant or spurious judgments.
Strengths and weaknesses of Forest Explorer (fourth part of the questionnaire)
With respect to the tool strengths, 43% of the respondents stressed its ease of use and 21% its usability – connected to requirement R2 and one of the main goals of Forest Explorer. 21% considered the tool fast – a good sign given the effort on efficient data gathering (see Section 3.2). Four participants included positive comments about the integration of forest databases through Forest Explorer. Three liked the search capabilities of the tool – connected to requirement R5. On the negative side, we obtained very useful feedback for improving Forest Explorer. 18% of the respondents missed Portuguese data – this limitation may partially explain the lower SUS scores of Portuguese participants, identified above. New feature requests include a downloading data functionality, providing background maps (see also the analysis of the second part of the questionnaire above), and additional information like a user manual or provenance data. Besides, some participants reported speed problems and minor aesthetics modifications.
All in all, these results are generally positive, supporting the design goals of Forest Explorer. Participants mainly correspond to the group of forest domain experts. They were able to use the tool for exploring a complex semantic dataset integrating forest inventories and land cover maps, and considered Forest Explorer easy to use. Usability was ranked good – this is consistent with SUS scores and answers to open-ended questions. No important limitations were found, they were mostly feature requests that are quite useful to guide the development of new versions of Forest Explorer.
Forest Explorer is designed to work with the Cross-Forest dataset, thus benefitting from the use of Semantic Web technologies for data integration. Indeed, this advantage was identified in the user study (see Table 4), although participants still demanded additional sources to be included – it looks data integration is much needed in the forestry domain. In this regard, we are currently working on the conversion of the Portuguese forest inventory and land cover map into Linked Open Data. Once included in the Cross-Forest dataset, they will be automatically browsable through Forest Explorer. Despite the schemata of the original Portuguese and Spanish sources are different, the overarching ontology homogenizes the terminology, so no further changes will be required in Forest Explorer.
Moving on to the technical design of Forest Explorer, the Data manager is the component in charge of gathering data from the Cross-Forest and DBpedia endpoints (see Fig. 1). The solution devised relies on the use of SPARQL query templates to facilitate the extensibility of Forest Explorer. As an example, the template query in Listing 3 is employed for gathering every type of feature data and also for obtaining images and tree species descriptions from DBpedia. Similarly, the Data manager uses a query template for obtaining the taxonomy of subclasses of a target class; this is employed with species and soil uses. Furthermore, Section 4.2 already describes the query templates used for retrieving plots and patches, as well as the Cache proposed for reducing query exchanges with the Cross-Forest endpoint. Such query templates can be abstracted away to retrieve arbitrary features in order to extend the applicability of Forest Explorer to other contexts.
Geospatial queries in Forest Explorer are handled with the SPARQL templates in Listings 1–2. They are simple interval queries that run quite fast and provide the necessary expressiveness for the retrieval of features in a bounding box. Alternatively, GeoSPARQL functions such as
The different Feature managers in Forest Explorer are bound to the corresponding feature types, i.e. province, patch, plot, and tree. They can be easily modified so as to prepare new labels for tooltips, to request different markers to be rendered by the Map generator, etc. Moreover, new Feature managers can be added to the system; the recommended way is (1) to use an existing Feature manager as a base and (2) to update the Data manager for gathering feature data – existing query templates should be sufficient in most cases.
To wrap up, Forest Explorer is a novel exploration tool of forestry Linked Open Data designed for non-Semantic Web experts. Users can interact with a map to navigate to the area of interest and to control the information to display. To limit data exchanges with the SPARQL endpoint, Forest explorer uses a cache and smart geographic querying. So far, we have attracted the attention of the forestry community, not very familiar with Semantic Web technologies. Our future work includes the integration of new data sources from other countries in the Cross-Forest dataset, beginning with Portugal. Other geolocated data can be integrated such as cadastral parcel information, digital terrain models, land use maps, remote sensing layers and hazard occurrence or vulnerability (forest fires, pests and diseases, or blown down damages). We also plan to introduce data analysis capabilities to support forest management and planning scenarios. In this way, Forest Explorer could provide similar functionalities as BASIFOR, but with strong visualizations and taking advantage of Linked Open Data for forestry data integration.
Footnotes
Acknowledgements
This work has been partially funded by the European Commission through Cross-Forest (2017-EU-IA-0140) and Spanish Ministry of Science and Innovation through REFORM (PCIN-2017-027, ERANET SUMFOREST) projects. The authors would like to thank the reviewers for their constructive and insightful comments that helped to improve this paper.
