STITCH

STITCH @ CATCH

Semantic Interoperability To access Cultural Heritage

NWO MPI VU KB

You can find a short depiction of the goals of STITCH in this poster!

Project description

Cultural-heritage collections are typically indexed with metadata derived from a range of different vocabularies, such as AAT, Iconclass and in-house standards. This presents a problem when one wants to use multiple collections in an interoperable way. In general, it is unrealistic to assume unification of vocabularies. Vocabularies have been developed in many sub-domains, each with their own emphasis and scope. Still, there is significant overlap between the vocabularies used for indexing. The prime research objective of this subproject is to develop theory, methods and tools for allowing metadata interoperability through semantic links between the vocabularies. This research challenge is similar to what is called the “ontology mapping” problem in Semantic Web research.

The overall objective can be divided into three research questions:

  1. What kind of semantic links can be identified?
  2. Which methods and tools can support manual and semi-automatic identification of semantic links between vocabularies?
  3. How can such semantic links be employed to enable interoperable access to multiple collections indexed with heterogeneous vocabularies?

Scientific approach and methodology

The project will be application oriented. The goal will be to develop methods and tools that can be shown to work for relevant use cases.

We acknowledge that progress has been made regarding syntactic interoperability, and that some initiatives have produced tools to access several collections and vocabularies at the same time. However, these solutions often underestimate the semantic interoperability problems, and lack proper mechanisms for linking to the different reference systems of the original collections (thesauri, metadata schemes).

To solve these problems, the project will first investigate ways to represent metadata and vocabularies in RDF/OWL formats, which allow the building of resources linked at a semantic level. Although important methodological contributions have been proposed, for example in the context of the SKOS (Simple Knowledge Organization System) framework, this conversion step has to be further tested in realistic interoperability cases.
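As an illustration of this conversion step, the following minimal sketch turns one thesaurus record into RDF-style triples using SKOS properties. The SKOS and RDF namespace URIs are the standard W3C ones; the base URI `EX`, the function `concept_to_triples` and the sample concept are hypothetical and are not taken from any of the project's vocabularies.

```python
# Standard W3C namespaces.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SKOS = "http://www.w3.org/2004/02/skos/core#"
# Hypothetical base URI for an in-house vocabulary (assumption).
EX = "http://example.org/vocab/"


def concept_to_triples(concept_id, pref_label, broader=None, alt_labels=()):
    """Turn one thesaurus record into (subject, predicate, object) triples."""
    subject = EX + concept_id
    triples = [
        (subject, RDF_TYPE, SKOS + "Concept"),
        (subject, SKOS + "prefLabel", pref_label),
    ]
    # Non-preferred terms become alternative labels.
    for label in alt_labels:
        triples.append((subject, SKOS + "altLabel", label))
    # The hierarchical "broader term" relation of a thesaurus.
    if broader is not None:
        triples.append((subject, SKOS + "broader", EX + broader))
    return triples


# Invented sample record, for illustration only.
triples = concept_to_triples(
    "c2", "oil paintings", broader="c1", alt_labels=["paintings in oil"]
)
for triple in triples:
    print(triple)
```

A real conversion would also have to deal with notes, compound terms and vocabulary-specific relations, which is where the realistic test cases mentioned above come in.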

To create the semantic links between the different resources, the project will draw on existing research in ontology mapping. Several authors have proposed mapping relations for use in semantic linking. These include equality, equivalence, subclass, instance and domain-specific relations. The project will use such proposals as a starting point, then evaluate and extend or revise this set of mapping relations. Research on identification of links will first focus on baseline methods for manual specification of links, such as those developed within the MACS project. This will be supplemented with techniques from ontology learning targeted at finding such links automatically. The state-of-the-art techniques are not foolproof (cf. alignment evaluation campaigns), so some form of human validation of the links will need to take place. This is not a big hurdle, as semantic links between vocabularies only need to be established once. Another technique to consider is the generalization of existing annotations to semantic vocabulary links. For example, if according to a particular annotation the artist of a particular painting belongs to a certain art school, we may hypothesize that this link also exists for other works of the same artist.
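The generalization of annotations described in the art-school example can be sketched as follows. This is only an illustrative reading of that idea, not the project's method: the record layout (`work`, `artist`, `school`), the function name `generalize_school_links` and the sample data are all assumptions. Each proposed link is flagged as unvalidated, reflecting the need for human validation.

```python
from collections import defaultdict


def generalize_school_links(annotations):
    """Propose (work, school) links for works that lack a school annotation.

    annotations: list of dicts with keys 'work', 'artist' and 'school'
    (the school is None when the work has no such annotation).
    Returns a list of hypotheses, each marked as not yet validated.
    """
    # Step 1: collect the schools already recorded for each artist.
    schools_by_artist = defaultdict(set)
    for a in annotations:
        if a["school"] is not None:
            schools_by_artist[a["artist"]].add(a["school"])

    # Step 2: hypothesize the same school for the artist's other works.
    hypotheses = []
    for a in annotations:
        if a["school"] is None:
            for school in sorted(schools_by_artist.get(a["artist"], ())):
                hypotheses.append(
                    {"work": a["work"], "school": school, "validated": False}
                )
    return hypotheses


# Invented sample annotations, for illustration only.
annotations = [
    {"work": "painting-1", "artist": "artist-A", "school": "school-X"},
    {"work": "painting-2", "artist": "artist-A", "school": None},
    {"work": "painting-3", "artist": "artist-B", "school": None},
]
hypotheses = generalize_school_links(annotations)
print(hypotheses)
```

Here only `painting-2` receives a hypothesis, because no school is known for `artist-B`. An artist annotated with several schools would yield several competing hypotheses for a reviewer to accept or reject.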

With respect to the use of semantic links we will identify a number of typical use cases that should be handled by the tools being developed. Some prototypical use cases are:

These use cases typically require the combination of information from different collection databases.
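A minimal sketch of how semantic links could support such a combined search is given below. It assumes, purely for illustration, that each collection is indexed with a single vocabulary and that links are simple equivalence pairs; the function `search_across_collections`, the data layout and the sample collections are hypothetical.

```python
def search_across_collections(term, vocab, collections, links):
    """Return (collection, record_id) hits for a term and its linked equivalents.

    term: concept identifier; vocab: name of the vocabulary it comes from.
    collections: {name: {"vocab": vocab_name, "records": {record_id: set_of_concepts}}}
    links: set of ((vocab_a, concept_a), (vocab_b, concept_b)) pairs,
           treated here as symmetric equivalence links.
    """
    # The query concept plus every concept directly linked to it.
    query = (vocab, term)
    targets = {query}
    for left, right in links:
        if left == query:
            targets.add(right)
        elif right == query:
            targets.add(left)

    # Scan every collection for records indexed with any target concept.
    hits = []
    for name, coll in sorted(collections.items()):
        for record_id, concepts in sorted(coll["records"].items()):
            if any((coll["vocab"], c) in targets for c in concepts):
                hits.append((name, record_id))
    return hits


# Invented sample data, for illustration only.
collections = {
    "library": {"vocab": "vocab-B",
                "records": {"b1": {"wind-mills"}, "b2": {"bridges"}}},
    "museum": {"vocab": "vocab-A",
               "records": {"m1": {"windmill"}, "m2": {"canal"}}},
}
links = {(("vocab-A", "windmill"), ("vocab-B", "wind-mills"))}

hits = search_across_collections("windmill", "vocab-A", collections, links)
print(hits)
```

Without the link, the query would only find the museum record; with it, the library record indexed under the other vocabulary is returned as well. Richer relations such as subclass links would require a more careful expansion strategy than this direct lookup.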

Example collection databases, vocabularies and thesauri

The following collection databases will be considered for application within the project:

Vocabularies and thesauri that are of potential interest here include:

Scientific relevance

Ontology mapping is becoming an increasingly important research topic. It may provide the background knowledge required for accessing distributed information repositories, both within (large) companies and on the World Wide Web. Until now, much of the research effort has been spent on making syntactic interoperability feasible, i.e. on representing data models and data in a common (exchange) format. With the advent of XML and RDF/OWL, these syntactic problems are now (at least in theory) solvable, but this potential is still largely unexplored. Because semantic interoperability has not yet been studied extensively, this project takes a use-case-driven approach. We expect to show that this technology can be employed to answer a new class of searches over different collections.

The STITCH project is part of the NWO CATCH program. See this document for details regarding CATCH.