Discovering duplicate and related resources using an interlinking approach: The case of educational datasets

Abstract

Linking a learning dataset to useful information on the Web of Data enriches its learning resources and thereby enhances learners’ knowledge. This enrichment is usually achieved by creating links between datasets with interlinking tools, which connect data of any kind in a semi-automatic manner. This paper evaluates the results of interlinking an e-learning repository with several educational datasets on the Web of Data, which leads to enrichment of its contents. During this experiment, many related resources were discovered and matched to the GLOBE learning objects. Furthermore, this research presents a data model for finding similarities between two datasets and a workflow for identifying duplicate resources through a semi-automatic evaluation process. A case study was also assessed by human experts.

Keywords

Duplicate; e-learning resource; GLOBE; interlinking; LIMES; linked data

1. Introduction

The Semantic Web, a collaborative movement led by the World Wide Web Consortium (W3C), promotes common data formats for publishing data on the World Wide Web. The aim of the Semantic Web is to convert the current Web, dominated by unstructured and semi-structured documents, into a ‘Web of Linked Data’. It also facilitates the sharing and availability of different kinds of information on the Web. In particular, the Linked Data approach [1] has emerged as the de facto standard for integrating data on the Web. It offers significant potential to tackle interoperability issues in different contexts. In an e-learning context, for example, Linked Data enhances the discovery of the contents of Open Educational Repositories established by educational institutions [2] and connects learning objects to useful knowledge on the Web [3, 4]. Data connectivity in Linked Data is achieved by providing RDF links between two entities, a process known as interlinking.

The Linked Data applications have also facilitated data enrichment by applying several techniques for automatic and intelligent linking. In particular, an interlinking tool establishes links between different datasets on the Web by discovering similarities among their entities. It also helps data publishers to connect their contents to useful datasets. In this paper, we evaluate the outcomes of an interlinking approach on a large learning repository, Global Learning Objects Brokered Exchange (GLOBE) [5], by applying a promising interlinking tool. It connects the GLOBE resources to 20 educational datasets in the LOD cloud. We also assess the matched links to answer the following research questions:

How are the GLOBE resources distributed in each target dataset and which datasets include more similarities with GLOBE?

What are the benefits of this interlinking when a large learning dataset is linked to several educational datasets on the Web?

How can the related and duplicate resources be identified by applying the interlinking approach?

The rest of the paper is structured as follows. Section 2 describes the importance of interlinking on the Web of Data and outlines the current studies in this context. In Section 3, we will present the proposed approach for interlinking and finding the duplicates. Section 4 will discuss the interlinking results and evaluate the approach using a case study. Finally, conclusions are presented in Section 5.

2. Background and related work

Interlinking tools perform the creation of links semi-automatically and connect two datasets using different kinds of links (e.g. owl:sameAs) by similarity discovery among the entities. Most of these applications follow a similar routine to carry out the interlinking process. For example, in Silk [6], a user should set the following information to run the tool:

source and target datasets;

source and target entities (e.g. resource title in the source dataset to book title in a library);

criteria under which two entities are matched.

Given the criteria above, the software discovers similarities between pairs of entities and generates a set of results. Several studies have been undertaken in recent years to investigate interlinking issues in the Linked Data context. Simperl et al. [7] compared various linking tools by addressing important aspects such as required input, resulting output, considered domain and matching techniques used. This comparison was made from two specific perspectives: degree of automation (to what extent the tool needs human input) and human contribution (the way in which users are required to do the interlinking). Scharffe and Euzenat [4] also proposed a framework for data interlinking applied in different systems, in which several linking tools were discussed. In a technology-enhanced learning context, Dietze et al. [3] documented an approach for interlinking educational resources based on the Linked Data principles [1] and exploiting the abundance of existing data on the Web. Several Linked Data projects such as LinkedUp [8] and Linked Education [9] have also aimed at advancing the exploitation of the vast amounts of public, open data available in an educational context. In another empirical study, Rajabi et al. [10] applied two matching techniques to interlink a semi-structured dataset to the Web of Data and discussed the generated results in detail. In the context of Open Educational Repositories, Piedra et al. [2] applied the Linked Data principles to interoperate and mash up data from distributed and heterogeneous repositories of open educational materials. In another study [11], Piedra et al. leveraged the principles of Linked Data to enhance the discovery of Open Course Ware (OCW) contents created and shared by universities. The authors also developed a query method to access the OCW data using linked data techniques and linked the contents to the LOD cloud.

The aforementioned studies have demonstrated that several fundamental works have been carried out in this direction and data publishers can trust the interlinking tools to interconnect their contents to other datasets [12]. However, none of the mentioned studies investigates an interlinking approach between an educational repository and several e-learning datasets on the Web of Data. Furthermore, they do not mention duplicate identification amongst the interlinking results. Following our previous studies [10, 12, 13], we extended our approach on 20 educational datasets on the Web and scrutinized the results to discover the duplicate resources among the educational datasets.

3. Experimental setting

The GLOBE repository [5], which includes around 1 million learning object metadata records [14], was selected as our source dataset. To examine the GLOBE resources and to expose them as RDF [15], we harvested around 830,000 metadata records from this repository and imported them into a relational database in order to analyse the metadata effectively and select the best possible elements for interlinking. All harvested files were in XML format based on the IEEE LOM schema [16]. As we will discuss later, the candidate elements of the GLOBE metadata were selected and exposed as RDF. In parallel, we collected a set of educational datasets on the Web of Data and prepared several queries to retrieve their available elements for interlinking. Finally, we carried out the interlinking between GLOBE and the selected educational datasets using an appropriate tool.

In a previous study [12], we evaluated several interlinking tools on the Web of Data and demonstrated that LIMES [17] is a promising tool in this context. In LIMES, a user specifies the endpoints of datasets, comparable entities and thresholds of acceptance of output. When a threshold is set to 0.98, for example, it means that two concepts are considered to be matched if their syntax similarity is more than 98%. The tool runs a number of matching techniques and reports the results to the user based upon the configuration and similarities between the two datasets. Figure 1 depicts the workflow we followed to perform the interlinking process, as we explained above. In brief, we used LIMES to interlink GLOBE to 20 datasets in the LOD cloud and analysed the results by writing a program. In the following subsections we will describe the analysis of the GLOBE metadata elements and target datasets.

Figure 1.

Workflow of proposed approach.
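The acceptance threshold described above can be illustrated with a minimal sketch. It is not the LIMES implementation: the similarity function here is Python's difflib.SequenceMatcher, used as a stand-in for whatever string metric is configured in the tool, and the function names and the inclusive boundary are assumptions made for illustration only.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a normalised string similarity in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def is_match(source_label: str, target_label: str, threshold: float = 0.98) -> bool:
    """Accept a candidate link only if the similarity reaches the threshold.

    Stand-in metric and inclusive comparison are assumptions, not LIMES internals.
    """
    return similarity(source_label, target_label) >= threshold


# Labels that differ only in case are accepted; different titles are rejected.
print(is_match("Nuclear Energy", "nuclear energy"))
print(is_match("Nuclear Energy", "Nuclear Physics"))
```

With a threshold this close to 1, only near-identical labels are linked, which is consistent with the small number of matches reported later for multi-word titles.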

3.1. Source dataset

We categorized the data types applied by the GLOBE metadata in Table 1 along with some examples. From the interlinking point of view, most of the elements cannot be used in the interlinking as they include ‘Dates’, ‘Boolean’ or controlled vocabularies. Focusing on the elements usage by the GLOBE resources, we realized that more than 90% of resources applied local values (e.g. identifiers), controlled vocabularies (Lifecycle.Status) and language codes, while the title of learning objects (General.Title) was highly used (97%) and more than half of the GLOBE resources included Keyword (61%) and Classification elements (59%) in their metadata. It should be noted that the Coverage element was only used by 7% of resources. As we discussed in the previous study [10, 13], four metadata elements including title (‘General.Title’), coverage (‘General.Coverage’), keywords (‘General.Keyword’) and classification taxonomy (‘Classification.Taxon.Entry’) were identified as candidate elements for interlinking.

Table 1.

LOM elements data type and sample values

Data type	LOM element	Example values
Boolean	Cost, Copyright	‘Yes’, ‘No’
Numeric	Technical.Size, Requirement.MinimumVersion, Requirement.MaximumVersion	‘15200’, ‘1.0’
Local value	Identifier.Catalog, Identifier.Entry, Lifecycle.Version, Contribute.Entity, Technical.Location, Technical.InstallationRemarks, Technical.OtherPlatformRequirements, TypicalAgeRange, Description	‘http://localvalue.com/568545’, ‘3.0’, ‘No installation’, ‘This is a learning object about agriculture’
DateTimes	Contribute.Date, Duration, TypicalLearningTime	‘10P45M’
Codes	Language codes	‘en’, ‘en-US’, ‘de’
Controlled vocabularies	Structure, AggregationLevel, LifeCycle.Status, Contribute.Role, MetadataSchema, Technical.Format, Technical.Requirement.Type, Technical.Requirement.Name, InteractivityType, InteractivityLevel, Difficulty, SemanticDensity, LearningResourceType, Educational.Context, IntendedEndUserRole, Relation.Kind, ClassificationPurpose, TaxonPath.Source	‘atomic’, ‘JPG’, ‘low’, ‘author’, ‘PDF’

Given that the most prominent language of resources in GLOBE is English [14], and to allow manual evaluation of the results by human experts, we selected those resources that provided the candidate elements in English. Around 53% of GLOBE resources included English titles, while more than 1.6 million English keywords were used by 38% of GLOBE resources (see Figure 2). Regarding the other candidate elements, only 2% of GLOBE resources provided the Coverage element in English, and 22% (around 176,000 resources) provided taxonomies of learning objects in English. We then exposed the selected elements as RDF using a mapping service (D2RQ [18]) and carried out the interlinking afterwards.

Figure 2.

GLOBE elements in English language.

3.2. Target datasets

To find appropriate targets for interlinking, we investigated several educational datasets in the LOD cloud. From a technical perspective, both source and target datasets should provide either a SPARQL endpoint or an RDF dump. We found that most of the educational datasets lack any specific endpoint or RDF dump, and examining the datasets’ endpoints showed that most targets were not accessible at the time of this research. In the end, we collected 20 educational datasets that responded to the queries or offered an RDF dump to download. Afterwards, we calculated the size of each dataset using SPARQL queries. Appendix 1 lists the size of the datasets (in triples) along with their full names. Amongst the candidate datasets, ‘Charles University of Prague’, with more than 93 million triples of publications, was the largest. ‘Key Information Sets’ (UNISTAT-KIS), which includes a set of information about full- or part-time undergraduate courses, was the second largest, with more than 8 million triples.
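As an illustration of how a dataset size can be obtained from an endpoint, the sketch below builds a SPARQL 1.1 Protocol GET request for a triple count. The exact queries used in the study are not given, so the query text is an assumption, and the endpoint URL is a placeholder. The code only constructs the request URL and does not contact any server.

```python
from urllib.parse import urlencode

# Assumed query: counts every triple visible in the endpoint's default graph.
COUNT_TRIPLES = "SELECT (COUNT(*) AS ?triples) WHERE { ?s ?p ?o }"


def build_count_request(endpoint: str) -> str:
    """Return a GET URL that submits the count query via the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": COUNT_TRIPLES})


# Placeholder endpoint; a real one would be taken from the dataset's description.
url = build_count_request("http://example.org/sparql")
print(url)
```

An endpoint that is unreachable or times out on such a request would correspond to the inaccessible targets mentioned above.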

3.3. Interlinking process

When running the LIMES tool, the output is a number of links in RDF (N-TRIPLE format) that connects source and target entities using the sameAs relationship. Appendix 2 illustrates a sample output generated by the tool that indicates that seven GLOBE resources were linked to 10 resources in the OpenUK dataset. As can be seen, four GLOBE resources were linked to more than one target resource, and three resources in OpenUK matched to more than one resource in GLOBE.
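A minimal sketch of how such an output can be summarised is given below. It assumes one owl:sameAs triple per line with URIs in angle brackets; the sample URIs are placeholders and the function names are illustrative, not part of the program used in the study.

```python
import re
from collections import defaultdict

SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
# One N-Triples statement whose subject, predicate and object are all URIs.
TRIPLE = re.compile(r"^<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s*\.\s*$")


def read_links(lines):
    """Yield (source URI, target URI) pairs from owl:sameAs statements."""
    for line in lines:
        match = TRIPLE.match(line.strip())
        if match and match.group(2) == SAME_AS:
            yield match.group(1), match.group(3)


def summarise(links):
    """Count distinct resources on each side and those linked more than once."""
    targets_by_source = defaultdict(set)
    sources_by_target = defaultdict(set)
    for source, target in links:
        targets_by_source[source].add(target)
        sources_by_target[target].add(source)
    return {
        "sources": len(targets_by_source),
        "targets": len(sources_by_target),
        "sources_with_many_targets": sum(len(t) > 1 for t in targets_by_source.values()),
        "targets_with_many_sources": sum(len(s) > 1 for s in sources_by_target.values()),
    }


sample = [
    "<http://example.org/globe/1> <http://www.w3.org/2002/07/owl#sameAs> <http://example.org/target/a> .",
    "<http://example.org/globe/1> <http://www.w3.org/2002/07/owl#sameAs> <http://example.org/target/b> .",
    "<http://example.org/globe/2> <http://www.w3.org/2002/07/owl#sameAs> <http://example.org/target/a> .",
]
stats = summarise(read_links(sample))
print(stats)
```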

Figure 3 depicts the data model we followed to find similarities between two datasets. In this model, we showed each dataset along with its properties as entities and the similarities between datasets as relationships. Each dataset has a title, endpoint URI, size and other specifications as attributes. It may include many entities and have many similarities to other datasets. The similarity relationship may have different attributes itself for finding the related resources in two datasets. We will apply another workflow after the interlinking later in this paper.

Figure 3.

Similarity data model.
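One possible way to express this data model in code is sketched below. Only the dataset attributes named in the text (title, endpoint URI, size) come from the model; the remaining field names, and the choice to store links as URI pairs, are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    title: str
    endpoint_uri: str
    size_in_triples: int
    entities: list = field(default_factory=list)  # e.g. names of comparable elements


@dataclass
class Similarity:
    """Relationship between two datasets, found through one metadata element."""
    source: Dataset
    target: Dataset
    element: str                              # hypothetical attribute of the relationship
    links: set = field(default_factory=set)   # (source URI, target URI) pairs


# Placeholder values; sizes and endpoints are not taken from the study.
globe = Dataset("GLOBE", "http://example.org/globe/sparql", 0, ["title", "keyword"])
target = Dataset("Example target", "http://example.org/target/sparql", 0)
sim = Similarity(globe, target, "title")
sim.links.add(("http://example.org/globe/1", "http://example.org/target/a"))
print(len(sim.links))
```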

It should be highlighted that throughout this study we used JAVA programming language, because of the many advantages that this programming environment provides, including various libraries (e.g., JSON, SET, Jena) in both analysis and Linked Data contexts. Figure 4 depicts the procedure we followed to perform the following tasks in our study:

find the GLOBE resources linked to each target dataset;

discover the total number of resources in GLOBE linked to all targets.

Figure 4.

Workflow of finding linked resources in targets.

To address the first goal, we used a set, that is, a collection of distinct elements, to remove the duplicates in both source and target datasets. Some of the output files had to be split into several smaller ones, as their size exceeded 1 GB (around a million records) and the program could not process them with the available hardware resources. To achieve the second goal, we extended the same approach to all datasets. In particular, the program retrieved the GLOBE resources in each output and added them to a final set to calculate the total number of resources overall. Appendix 3 presents the final LIMES results after running the tool against GLOBE and the 20 educational datasets.
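The set-based aggregation described above can be sketched as follows. The input structure (a mapping from dataset name to the set of linked GLOBE URIs), the function name and the toy data are assumptions; the study's own program was written in Java.

```python
from collections import Counter


def aggregate(globe_by_dataset):
    """Return (distinct GLOBE resources overall, resources linked to two or more datasets).

    globe_by_dataset maps a target dataset name to the GLOBE URIs linked to it.
    """
    datasets_per_resource = Counter()
    for uris in globe_by_dataset.values():
        # set() guards against a URI being listed twice for the same dataset.
        datasets_per_resource.update(set(uris))
    total = len(datasets_per_resource)
    in_several = sum(1 for n in datasets_per_resource.values() if n >= 2)
    return total, in_several


# Toy data with placeholder identifiers.
outputs = {
    "A": {"g1", "g2"},
    "B": {"g2", "g3"},
    "C": {"g2"},
}
total, in_several = aggregate(outputs)
print(total, in_several)
```

Because only the distinct URIs are kept in memory, the per-dataset sets can be filled one output file (or one file fragment) at a time.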

4. Discussion and results

As mentioned earlier, we used LIMES as the interlinking tool, to connect the candidate metadata elements (Title, Keyword, Taxon and Coverage) to the LOD datasets. Here, we categorize the results in four sections and present an analysis for each one.

4.1. Interlinking GLOBE elements to educational datasets

Figure 5 illustrates the interlinking results between GLOBE resources and the selected datasets. Figure 6 also depicts the GLOBE resource distribution among the target datasets. Below we report each element analysis in detail.

Figure 5.

Interlinking results between GLOBE and target datasets based on four elements in GLOBE.

Figure 6.

GLOBE resources distributions among target datasets over each element.

4.1.1. Title element

Figure 5 (a) depicts the interlinking results for the top five datasets with high similarities to the ‘title’ element. The x-axis in the figure refers to the number of GLOBE resources matched to the target dataset, while the y-axis shows the number of resources in the target dataset. In particular, Yovisto, with around 9117 resources, followed by OCW (the Open Course Ware consortium) and University of Bristol, had the greatest similarity to GLOBE. There were also 4127 resources in GLOBE connected to 2560 learning objects in The Open University of the UK, with around 13,600 matched links overall (see Appendix 3).

In total, five datasets (Data.gov.uk, Forge project, Semantic ISVU, MoreLab and Vergata) did not have any similarity with GLOBE. Table 2 also shows two GLOBE resources connected to several resources in the target datasets in which, for example, one resource about ‘Nuclear Energy’ matched three different datasets (ASN, Bristol and Huddersfield) specified with their URIs.

Table 2.

A sample of GLOBE titles matched to several resources

Title in both datasets	GLOBE resource	Target URI	Dataset name
Nuclear Energy	http://www.globe-info.org/ont/lom2owl#108450	http://schools.nyc.gov/NR/rdonlyres/6C64098F-0C24-4B27-A22F-F542A2F97DA0/130926/TTS_G11_LiteracySSandScience_NuclearEnergy.pdf	ASN
		http://resrev.ilrt.bris.ac.uk/research-revealed-hub/publications/118933#pub	Bristol
		http://data.linkedu.eu/hud/book/118555	Huddersfield
Bibliography	http://www.globe-info.org/ont/lom2owl#178214	http://resrev.ilrt.bris.ac.uk/research-revealed-hub/publications/15140#pub	OpenUK
		http://data.uni-muenster.de/context/istg/allegro/6/210/T00244773	Muenster

Analysing the interlinking outputs, we found that only a small number of GLOBE resources (around 24,000 overall) matched the target datasets through the title element. This result indicates that finding similarities for large texts is difficult for interlinking tools, as the resource titles in GLOBE mostly (around 83% of all English titles) include at least two words (e.g. ‘Alternating Current Circuits’). Also, we realized that there were around 16,000 resources in GLOBE linked to at least two target datasets and 8260 resources linked to all of them. Figure 6(a) depicts the distribution of GLOBE resources among the target datasets (with more than 1% GLOBE distribution).

4.1.2. Keyword element

In the case of the Keyword element, the range of acceptance reported by LIMES was far larger than for the title element (Section 4.1.1), as only one dataset (Semantic ISVU) did not have any similarity to any keywords. As mentioned earlier, there were more than 1.6 million English keywords in GLOBE, ranging from science in education to environment literature. The large number of generated results for the Keyword element may be due to the fact that more than 50% of these keywords include exactly one term and 33% contain two words, which helps an interlinking tool to discover more similarities. As can be seen in Figure 5(b), only one dataset (UNISTAT-KIS) had more than 6.7 million links to GLOBE (around 118,000 GLOBE metadata records to 7166 resources in the target dataset).

Analysing the results, we found that around 228,000 resources in GLOBE (74%) were matched to the target datasets and that a large number of resources (almost 760,000) were in common among all the results. Figure 6(b) shows the distribution of GLOBE resources among the target datasets, in which Yovisto (an academic video search), the Open University of the UK and the University of Huddersfield were the most referred datasets.

4.1.3. Taxon element

Most of the taxonomies of learning objects in GLOBE included terminologies in one or two words and referred to the classification of resources. In particular, around 60% of taxonomies in GLOBE contained only one word and almost 25% of them included two words ranging from science to historical concepts. As Figure 5(c) shows, around 99,000 resources in GLOBE were identified by LIMES and matched to more than 4000 resources in the UNISTAT-KIS dataset, followed by the University of Huddersfield (with 90,512 GLOBE resources) and the University of Bristol (with 77,420 GLOBE resources). Overall, only two datasets did not link to GLOBE and around 135,000 resources (76%) were connected to one or more datasets. Figure 6(c) also illustrates 13 datasets with more than 1% resource distribution in GLOBE, of which the UNISTAT-KIS dataset and the University of Huddersfield included the highest similarities to the GLOBE resources through the Taxon element.

4.1.4. Coverage element

As Figure 5(d) illustrates, eight datasets could link to GLOBE, but mostly with small numbers of results. Yovisto was an exception: 676 of its resources were connected to 13,000 GLOBE resources through around 8 million links. There were also 12,941 (78%) resources in GLOBE linked to all the target datasets (mostly to Yovisto and OCW). It should be highlighted that most of the matched terms referred to geographical places and countries. Figure 6(d) also shows that the references to the Yovisto and OCW datasets were distributed throughout most of the GLOBE resources via the Coverage element.

4.2. Human evaluation of the interlinking results

As we discussed earlier, the interlinking tool reported a set of records as an output with similar values in the source and target datasets. For example, the term ‘Photosynthesis’ was the title of a learning object in the ASN dataset (http://www.pbslearningmedia.org/resource/tdc02.sci.life.stru.photosynth/photosynthesis/) and the GLOBE repository (http://ariadne.cs.kuleuven.be/finder/globe/?query=Photosynthesis). However, the question under discussion is to what extent these learning resources are semantically matched or related. To this aim, we reviewed the generated results to discover an appropriate target and evaluate the outputs manually. In the manual evaluation, we focused on duplicate and related resources, as we will discuss in the following sections.

4.2.1. Duplicate resources

As Figure 7 depicts and according to the data model proposed in Section 3.3, a workflow is presented for finding the duplicate resources. Having the interlinking results, we retrieved the other metadata elements including URI and description of learning objects from datasets after identifying their metadata schema (e.g. dcterms in Open UK). This task was carried out using the SPARQL queries. In the next step, we analysed the value of metadata elements. In particular, if the actual URIs of resources in both datasets point to the same internet address, they are proposed as duplicate resources. In the case of unavailability of URIs, the duplicate finding is focused on the descriptions of learning resources and analysing their values using the text-matching functions. If both resources have high similarities in their descriptions, they can also be presented as duplicates.

Figure 7.

Duplicate finding workflow.
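The decision steps of this workflow can be sketched as a small function. The record layout (dictionaries with optional 'url' and 'description' keys), the use of difflib as the text-matching function and the 0.9 cut-off are all assumptions; the source does not state which text-matching functions or thresholds were applied.

```python
from difflib import SequenceMatcher


def normalise_url(url: str) -> str:
    """Trim whitespace and a trailing slash so trivially different URLs compare equal."""
    return url.strip().rstrip("/")


def classify(source: dict, target: dict, threshold: float = 0.9) -> str:
    """Propose 'duplicate' for a linked pair, or defer it to an expert."""
    s_url, t_url = source.get("url"), target.get("url")
    # Step 1: both URLs point to the same internet address.
    if s_url and t_url and normalise_url(s_url) == normalise_url(t_url):
        return "duplicate"
    # Step 2: otherwise compare the descriptions (assumed metric and cut-off).
    s_desc, t_desc = source.get("description"), target.get("description")
    if s_desc and t_desc:
        score = SequenceMatcher(None, s_desc.lower(), t_desc.lower()).ratio()
        if score >= threshold:
            return "duplicate"
    return "needs expert review"


a = {"url": "http://example.org/lo/1/", "description": "An introduction to photosynthesis."}
b = {"url": "http://example.org/lo/1", "description": "Different text."}
c = {"url": "http://example.net/x", "description": "An introduction to photosynthesis."}
print(classify(a, b), classify(a, c), classify(b, c))
```

Pairs that fall through both checks are not labelled as unrelated; they are left for the manual inspection described in the next subsection.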

4.2.2. Related resources

From a technical perspective, if no similarities exist among the other metadata elements, the evaluation continues by exploring the actual address of the resource (URI) where the learning object exists. This helps a domain expert to identify the relatedness of two resources semantically by exploring the content. Moreover, the other metadata elements of linked resources (such as description or subject) might differ in syntax, but an expert may still identify the resources as conceptually related owing to the similarity of their content. As an example in our case study, a course about ‘Latitude and Longitude’ in Text/HTML format in GLOBE was linked to a resource in the target dataset in another format (sound recording); a human expert can nevertheless identify the two as related.

4.2.3. Case study

Given that manual evaluation of links requires significant effort, we selected the results between GLOBE and the Open University of the UK (OpenUK) on the title element as our case study, taking the following points into account:

Interlinking results usually include two identifiers, which typically point to internet addresses. Because resource identifiers were not implemented properly in some cases, or were broken, we selected those that followed the guidelines for good URIs [19].

Even when URIs are available, the metadata schema should be rich enough for an expert to compare the information in both targets. For example, the target metadata in some cases included only three elements (title, format and subject), which is insufficient for the evaluation.

The learning object metadata in OpenUK had an acceptable quality, as most of the resources included an accessible URI and a well-formed schema with a clear description. Following the approach presented in Section 4.2.1, we found that none of the linked pairs pointed to the same address on the Web, but the matching analysis showed that some referred to the same learning object published by different data providers. According to our analysis, 374 resources (out of 4127) in GLOBE were identified as duplicates of resources in OpenUK. Turning to the related resources, we selected 300 records from the non-duplicate results and found that 246 (82%) of them were semantically related to each other. Also, 48 resources (16%) were not accessible as the URLs were broken or unreachable, and the rest (2%) were not semantically the same (false positives). Notably, we found two resources about ‘functions’ but in two different contexts, and thus we could not categorize them as related.

To validate the proposed approach, we asked two experts to follow the workflow (Figure 7) and evaluate a set of resources that we had marked as duplicates. To this end, we randomly selected 20 resources (out of 374) from the results and asked the experts to assess the resource URLs along with the other metadata elements by following the proposed instructions. On receiving the experts’ responses, we applied the kappa measure of agreement to analyse the agreement between the observers and gauge the reliability of the responses. The maximum value of kappa is 1, which represents perfect agreement, and kappa takes the value 0 if there is only chance agreement. We imported the experts’ input into SPSS and analysed the measure of agreement. The output of the software was valid and equal to 0.828, which demonstrates that the experts strongly agreed on the results. A closer look at the responses given by the raters indicates that most of the resources were marked as duplicates (17 by rater 1 and 16 by rater 2). The experts could not judge the remaining resources owing to insufficient information in their metadata.
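For readers unfamiliar with the measure, Cohen's kappa is (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The sketch below computes it for two raters. The item-level ratings are not reported in the study; the arrangement used here (the raters disagree on a single resource) is an assumption chosen because it is consistent with the reported totals of 17 and 16 duplicates out of 20 and with the reported kappa.

```python
def cohen_kappa(rater1, rater2):
    """Cohen's kappa for two equally long lists of categorical ratings."""
    n = len(rater1)
    categories = set(rater1) | set(rater2)
    # Observed agreement: share of items on which both raters gave the same label.
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Chance agreement: product of the raters' marginal proportions, per category.
    p_e = sum((rater1.count(c) / n) * (rater2.count(c) / n) for c in categories)
    return (p_o - p_e) / (1 - p_e)


# Assumed item-level ratings: 16 joint 'duplicate', 1 disagreement, 3 joint 'unsure'.
rater1 = ["duplicate"] * 17 + ["unsure"] * 3
rater2 = ["duplicate"] * 16 + ["unsure"] * 4
kappa = cohen_kappa(rater1, rater2)
print(round(kappa, 3))  # 0.828
```

Under this assumption p_o = 19/20 = 0.95 and p_e = 0.85 × 0.80 + 0.15 × 0.20 = 0.71, giving kappa = 0.24 / 0.29 ≈ 0.828.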

5. Conclusions

The purpose of this research was to evaluate the results of interlinking a large learning dataset (GLOBE) with educational datasets on the Web of Data. After analysing the GLOBE metadata and selecting appropriate elements for interlinking, we applied a tool to interlink GLOBE to 20 educational datasets on the Web and evaluated the generated results. In conclusion, we outline the implications of this study as follows:

Interlinking a learning repository to several educational datasets in the LOD cloud leads to the enrichment of content, as this approach links one e-learning resource to several other resources in different datasets on the Web.

Evaluating the results of interlinking among the candidate elements of GLOBE demonstrates that the semantic accuracy of matched links for the Title element was higher than the Keyword and Taxon elements, although the distribution of GLOBE resources for this element was minor. Furthermore, the high percentage of GLOBE contribution in a few datasets for the Coverage element indicates that connection of this element to geographical datasets like Geonames or Factbook is appropriate. However, around 93% of GLOBE resources did not provide this element in the metadata.

Apart from resource enrichment, one of the other benefits of an interlinking process is duplicate identification. Our examination on a set of resources illustrates that several resources are published by different data providers and point to different internet addresses on the Web, although they refer to the same learning resource. We carried out this identification by proposing a data model along with a workflow in which we compared the other metadata elements retrieved from both targets after performing the interlinking process.

Appendix

Appendix 3.

Interlinking results between GLOBE and the selected educational datasets.

Dataset	Title	Keyword	Taxon	Coverage
UNISTAT-KIS	188	6,788,988	5,692,741	0
Yovisto – academic video search	68,506	1,813,416	1,263,662	7,995,334
University of Bristol	17,858	1,872,875	657,686	733
University of Huddersfield	137	828,725	361,791	78
Open University in UK	13,644	720,023	290,837	4
Open Courseware Consortium metadata	24,657	169,737	77,493	24,933
Data.gov.uk	30	100,950	45,604	0
University of Muenster (LODUM)	333	41,522	30,148	316
Open Data @ Tor Vergata	0	61,993	28,444	0
Charles University in Prague	151	72,162	28,157	0
Achievement Standards Network (ASN)	80	131,396	16,481	0
TheSoz Thesaurus for the Social Sciences	138	36,121	13,981	65
Aalto University	14	17,110	3,843	463
Vytautas Magnus University, Kaunas	44	5,201	938	0
OxPoints (University of Oxford)	133	30,512	881	0
PROD	3	7,887	533	0
MoreLab	0	740	42	0
University of Southampton	9	55	0	0
Forge project	0	34	0	0
SUM	125,925	12,699,447	8,513,262	8,021,926
Unique resources in GLOBE	8,260	228,352	134,791	12,941
Common resources in GLOBE	16,354	760,830	413,520	13,591

Funding

The work presented in this paper has been part-funded by the European Commission under the ICT Policy Support Programme CIP-ICT-PSP.2011.2.4-e-learning with project no. 297229 ‘Open Discovery Space (ODS)’ and INFRA-2011-1.2.2-Data infrastructures for e-Science with project no. 283770 ‘AGINFRA’.

References

1. Bizer, Heath, Berners-Lee. Linked data – The story so far. International Journal of Semantic Web Information System 2009; 5(3): 1–22.

2. Piedra, Chicaiza, López, Tovar. An architecture based on linked data technologies for the integration and reuse of OER in MOOCs Context. Open Praxis 2014; 6(2): 171–187.

3. Dietze, Sanchez-Alonso, Ebner, Giordano, Marenzi, Nunes P. Interlinking educational resources and the web of data: A survey of challenges and approaches. Program: Electronic Library and Information Systems 2013; 47(1): 60–91.

4. Scharffe, Euzenat. MeLinDa: An interlinking framework for the web of data. CoRR abs/1107.4502, 2011.

5. GLOBE (Connecting the World and Unlocking the Deep Web), http://globe-info.org/ (accessed 12 December 2014).

6. Volz, Bizer, Gaedke, Kobilarov. Silk – A link discovery framework for the web of Data. LDOW, 2009.

7. Simperl, Wölger, Thaler, Norton, Bürger. Combining human and computation intelligence: The case of data interlinking tools. International Journal of Metadata Semantics and Ontology 2012; 7(2): 77–92.

8. LinkedUp Project, Linking Web data for education, http://linkedup-project.eu/ (accessed 12 December 2014).

9. Dietze, Giordano, Kaldoudi, Dovrolis, Taibi. Linked education: Interlinking educational resources and the Web of data. In: Proceedings of the 27th annual ACM symposium on applied computing, New York, 2012, pp. 366–371.

10. Rajabi, Sicilia M-A, Sanchez-Alonso. Interlinking educational data: An experiment with GLOBE resources. In: First international conference on technological ecosystem for enhancing multiculturality (TEEM), 2013, pp. 365–374.

11. Piedra, Tovar, Colomo-Palacios, Lopez-Vargas, Chicaiza. Consuming and producing linked open data: The case of OpenCourseWare. Program: Electronic Library and Information Systems 2014; 48(1): 16–40.

12. Rajabi, Sicilia, Sanchez-Alonso. An empirical study on the evaluation of interlinking tools on the Web of Data. Journal of Information Science 2014; 40: 637–648.

13. Rajabi, Sicilia, Sanchez-Alonso. Interlinking educational resources to web of data through IEEE LOM. Computer Science and Information Systems 2015, in press.

14. Ochoa, Klerkx, Vandeputte, Duval. On the use of learning object metadata: The GLOBE experience. In: Proceedings of the 6th European conference on technology enhanced learning: Towards ubiquitous learning, Berlin, 2011, pp. 271–284.

15. Klyne, Carroll. Resource Description Framework (RDF): Concepts and abstract syntax, W3C Recommendation, 2004.

16. IEEE LTSC. IEEE standard for learning object metadata 1484.12.1-2002, final draft version, http://grouper.ieee.org/groups/ltsc/wg12/files/LOM_1484_12_1_v1_Final_Draft.pdf (accessed 12 December 2014).

17. Ngonga, Sören. LIMES – A time-efficient approach for large-scale link discovery on the web of data. Presented at the International Joint Conference on Artificial Intelligence, 2011.

18. Bizer. D2RQ – Treating non-RDF databases as virtual RDF graphs. In: Proceedings of the 3rd international Semantic Web conference (ISWC), 2004.

19. Cool URIs for the Semantic Web, http://www.w3.org/TR/cooluris/ (accessed 12 December 2014).