Digital Libraries and Educational Resources: the AquaRing Semantic Approach

Large amounts of scientific digital content are nowadays held by scientific institutions, which collect, produce and store information valuable for dissemination, work, study and research. In this context, the development of the web and of learning technologies has brought new opportunities for teachers and learners to retrieve and share pedagogical objects. This paper introduces a semantic approach developed within the EC-funded AquaRing project with the aim of improving access to the vast amount of digital content concerning the aquatic environment and its resources, as well as supporting enhanced education and informal learning in this specific domain. To achieve these goals, a semantic framework and an educational ontology were developed and implemented. Both were used to support the indexing of learning resources and to provide several educational services to end users (especially children, students, parents and teachers).


I. INTRODUCTION
The evolution of the World Wide Web makes large amounts of resources available for public sharing and reuse. It seems able to realize the dream of the Library of Alexandria: to collect the entire world's knowledge. This is a particularly interesting opportunity for scientific organizations, which institutionally collect, store and produce content valuable for work, study and research in many different fields. In this context, the development of the web and of learning technologies has brought new possibilities for teachers and learners to retrieve and share pedagogical objects. The amount of educational content available online is growing rapidly, but this proliferation of materials makes them increasingly difficult to manage and access. Keyword-based search engines are the main tools for content retrieval today, but there are some difficulties associated with their use [1]: low precision (many irrelevant documents are retrieved); low or no recall (relevant pages are not retrieved); results highly sensitive to vocabulary (relevant documents use different terminology from the original query); and results returned as single web pages (if the information needed is spread over various documents, partial data must be manually extracted from each page).
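The two retrieval measures named above can be made concrete with a minimal sketch. The document identifiers and relevance judgements below are hypothetical and serve only to illustrate the standard definitions: precision is the share of retrieved documents that are relevant, and recall is the share of relevant documents that are retrieved.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: set of document ids returned by the search engine
    relevant:  set of document ids judged relevant by a human
    """
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical keyword query "jellyfish": the engine returns two off-topic
# pages and misses a relevant document that uses the term "medusa".
retrieved = {"doc1", "doc2", "doc3", "doc4"}
relevant = {"doc1", "doc2", "doc5"}
precision, recall = precision_recall(retrieved, relevant)
```

In this made-up case half of the returned pages are irrelevant and one relevant document out of three is never shown, which is the vocabulary sensitivity problem described in the text.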
As a consequence, the application of Information and Communication Technology (ICT) in learning contexts requires new models to support the process of content management, based on environments and tools enabling users to build, represent and share their knowledge. In addition, another key point in the development of innovative learning environments is in providing flexibility and personalization of contents and services [2].
Semantic technologies [3] can support both developers and users in achieving such goals. By formalizing how raw data relate to real concepts, the semantic web tries to overcome barriers to integrating heterogeneous yet related resources and to offer value-added services to audiences living in different countries, speaking different languages, using different vocabularies and having different interests. Expressing information in a machine-interpretable form is expected to revolutionize scientific publishing and data sharing on the Internet.
There are several knowledge representation models, technologies and XML-based languages that allow resources to be described in a standardized way, enhancing information reusability and interoperability. Ontologies and semantic mark-up, which represent the core of the network of knowledge on the semantic web, were adopted in the context of the EC-funded AquaRing project (eContentplus Programme), whose aims are to improve access to the vast amount of digital content concerning the aquatic environment and its resources, as well as to support enhanced education and informal learning in this specific domain. In order to achieve these goals, a semantic web based framework was designed and implemented, and an educational ontology was developed.
The paper is structured into the following sections: an overview of the AquaRing project with a description of its educational target and objectives; an introduction of the ontology-based approach adopted and services supported; and, finally, a section on the lessons learnt and future works.
II. THE AQUARING PROJECT
AquaRing (an acronym for "Accessible and Qualified Use of Available digital Resources about the aquatic world In National Gatherings") [4] is a project launched in 2006 and funded by the European Commission's eContentplus programme, a multiannual Community programme whose overall aim is to make digital content in Europe more accessible, usable and exploitable. It defines target areas which have a public interest and which would not develop or would develop at a slower pace if left to the market, including educational content and digital libraries.
The AquaRing project, which initially gained the participation of six countries (Belgium, France, Italy, Lithuania, Netherlands and Spain), addresses the large cultural heritage of knowledge available from European aquaria, science centres and natural history museums on the do-
• to bring together existing online collections on marine and aquatic sciences (showing the rich contribution that Europe brought in this area);
• to improve what science centres can offer, with a greater collection of digital objects which allows museum visitors to learn online as well as in person;
• to provide the most comprehensive online digital library for research and education in marine and aquatic sciences;
• to support improved education and informal learning experiences for individual learners and groups;
• to raise awareness about aquatic environments and how they can be conserved.
In order to achieve these goals, an international Consortium was established by five scientific partners (the Genoa Aquarium, coordinator of the project; the Museum of Natural Sciences of Bruxelles; the Nausicaa National Sea Centre of Boulogne-sur-Mer; the Rotterdam Zoo; the Lithuanian Museum of the Sea) whose digital documents provided a starting point for the content of the AquaRing portal; two technological partners (Softeco Sismat and Tecnalia Robotiker) designated to design the semantic web technology together with the infrastructure to support it; an academic partner (DISA University of Genoa) in charge of the project evaluation; and the European network of science centres and museums (ECSITE) to take care of the dissemination. At all stages of the project development process there was a very close collaboration among scientific and technological partners.
In order to offer personalized services (visitors can explore the AquaRing portal according to their own interests and needs through different access points for the various target audiences and a multilingual interface), a preliminary analysis of user needs was performed. Consistent with the results of the analysis, the Consortium identified the following main targets:
• general individual visitors: private individuals (aged 13 and over) with general or more specific interests in the aquatic world;
• teachers: this group, which was indicated as a priority target by many of the project partners, includes teachers of primary schools and of first and second grade secondary schools who teach sciences, chemistry, geography, biology and microbiology to young students;
• sea museums, historical science museums, science centres, aquaria, and zoos: this group includes professionals who would like to use the portal to obtain specific and technical information in order to improve their structure's functionality, to offer their visitors more dedicated services, and to organize events, exhibits, European projects, etc.;
• children: young users under 13 years old;
• media: journalists and professionals who work in television, radio and press agencies.
After the target definition, the needs of each user category were identified in terms of document types, relevant topics, aims, services and graphic interface. With reference to the learning purposes of AquaRing, we will focus our attention on results related to the teachers and children categories.
According to the survey, the main objectives and needs of the teachers are the following: to find instructional resources for their lessons and pre-constructed learning paths in order to prepare their courses about aquatic environments; to learn about activities and projects developed by European cultural institutions; to give their opinions and discuss different topics related to the AquaRing domain with both colleagues and experts; and to organize sightseeing trips for their students. The objectives and related needs of the students are the following: to find digital resources usable for writing dissertations and research work related to environmental issues and the sea world; to obtain information about their future job (professional training, maritime transport, exploitation of biological resources, etc.); and to retrieve advanced scientific information in order to prepare for exams [5].
Regarding the type of digital resources, teachers declared themselves particularly interested in documents (like papers, essays, project reports and conference proceedings) and multimedia files (such as images and videos) which allow multi-sensorial exploitation and, at the same time, easy learning even for young primary school students. With regard to children's interests, it emerged from the analysis that it is very important for them to find resources that are simple and easy to read and understand, but most of all that are interesting and possibly fun. They would be very happy to look at photos of the animals that most fascinate them and to hear the sounds that the animals make and the ones that characterize their environment. Moreover, when children visit a facility, they like to know what happens "behind the scenes" (for example, in the case of an aquarium, they would like to know how things work, how the fish are fed, what they eat, etc.) [5].
To satisfy not only teachers', but also students' and parents' needs, an educational area was designed including digital resources that are intended mainly for teaching purposes, such as bibliographies, drawing books, educational games, exercises, glossaries, lectures, lesson plans, simulations, and so on. These resources were annotated and made browseable in a semantic way by means of an educational ontology specifically designed for AquaRing's goals.
Since its launch in March 2009, the AquaRing portal has provided access to information on the aquatic world in the form of images, videos, audio files, interactive software, digital collections, articles, theses, papers and dissertations. Parents of young children in particular can find many pictures, videos and activities to satisfy their youngsters' curiosity. The resources also lead teachers and lecturers to a large body of academic material which will help them to plan their lessons and learning paths.

III. THE AQUARING ONTOLOGY-BASED APPROACH
In order to support the development of end-user services, a semantic framework was designed and implemented to ease the aggregation, management, accessibility, sharing and reuse of distributed digital collections by means of semantic content annotation. Since eContentplus is a non-research programme, state-of-the-art technical solutions and standards were adopted wherever possible, so that the work could focus on methodology and on the development of value-added services rather than on technological innovation.
http://www.i-jet.org

The semantic framework is based on a simple layered architecture composed of the following layers [6]:
• the data access layer: the infrastructure where contents and data are stored;
• the semantic layer (including metadata model and ontology-based knowledge model): it contains formal representations of the different contents available in the individual collections and provides semantic access to and processing of the distributed data collection;
• the service layer: it provides the advanced knowledge searching and management mechanisms needed to respond to user queries;
• the interaction layer: the AquaRing portal of end-user services.
The AquaRing semantic layer is based on some main controlled vocabularies and ontologies dealing with aquatic environments and resources. Vocabularies and ontologies are adopted as commonly recognized linked taxonomies for proper semantic annotation of contents, in order to support a tag-based classification of all available resources. As for the semantic layer, the Resource Description Framework (RDF)-based Qualified Dublin Core Metadata Element Set (QDCMES) [7] was adopted to formalize the content annotation and metadata scheme, whereas the Web Ontology Language (OWL) [8] was selected to express the domain ontologies. The Dublin Core Metadata Element Set (DCMES) is a de facto reference standard vocabulary of fifteen elements for generic resource description (the AquaRing metadata model adopted its refined Qualified version, which includes some additional descriptive fields). To annotate a resource in the AquaRing system, it is necessary to fill in a metadata record that describes the resource in its original format and language. Tagging is then performed by selecting concepts from the different domain ontologies or by specifying missing ontology concepts (free tags). The framework thus applies a mixed classification scheme, enhanced with formal/informal annotation: disjoint specialized ontologies are used for semantic annotation, whereas hierarchical topic-driven free tagging is allowed to fill in coverage gaps. Furthermore, the controlled free tag mechanism supports an ontology learning process conceived for merging and integrating ontologies by creating relationships between original and added terms. The result of the ontology learning process can be checked and corrected by domain experts using an ad hoc ontology editor which supports the editing of translations, the pruning of relationships and the editing of hierarchical free tags [9].
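The annotation step described above can be pictured with a minimal sketch of a metadata record serialized as RDF/XML. The RDF and Dublin Core namespace URIs are the standard ones. The resource and concept URIs (under example.org), the function name `build_record`, and the use of plain `dc:subject` to carry both ontology concepts and free tags are assumptions made for illustration. The actual AquaRing QDCMES profile is not reproduced in this paper.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)


def build_record(resource_uri, title, language, concept_uris, free_tags):
    """Build an RDF/XML description of one resource (hypothetical helper).

    concept_uris: ontology concepts selected by the annotator (formal tags)
    free_tags:    plain-text tags for concepts missing from the ontologies
    """
    root = ET.Element(f"{{{RDF}}}RDF")
    desc = ET.SubElement(root, f"{{{RDF}}}Description",
                         {f"{{{RDF}}}about": resource_uri})
    ET.SubElement(desc, f"{{{DC}}}title").text = title
    ET.SubElement(desc, f"{{{DC}}}language").text = language
    for uri in concept_uris:
        # Formal tag: the subject points at an ontology concept.
        ET.SubElement(desc, f"{{{DC}}}subject", {f"{{{RDF}}}resource": uri})
    for tag in free_tags:
        # Free tag: the subject carries a literal value.
        ET.SubElement(desc, f"{{{DC}}}subject").text = tag
    return root


# Made-up resource, concept URI and free tag.
record = build_record(
    "http://example.org/resources/42",
    "Jellyfish life cycle",
    "en",
    ["http://example.org/onto/aquatic#Jellyfish"],
    ["medusa"],
)
xml_text = ET.tostring(record, encoding="unicode")
```

Keeping formal tags as URI references and free tags as literals makes the two kinds of annotation easy to tell apart later, for example when free tags are reviewed for inclusion in an ontology.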
As no appropriate ontology was available to cover the educational field of the AquaRing knowledge domain, an ontology has been expressly designed in order to index pedagogical resources and to organize and retrieve learning materials in a more efficient and meaningful way.
In the philosophical sense, the term ontology is used to represent the study of being or existence [10], but the term was adopted by the computer science community to offer a shared and common understanding of some domain which can be communicated across people and application systems [11]. OWL is the language developed by W3C for representing ontologies on the web in an XML-based syntax. OWL ontologies consist of individuals (objects in the domain that we are interested in), properties (binary relations on individuals) and classes (sets that contain individuals) [12].
The development process of the AquaRing educational ontology was based on the following steps: the context analysis; the vocabulary definition; the terms classification; and, finally, the definition of classes and properties [2]. Several reference works on indexing of learning objects were carefully analyzed in order to define a first draft vocabulary: the IEEE Standard for Learning Object Metadata [13]; the Dublin Core Education Application Profile [14]; the Dublin Core Metadata Element Set [7]; the EUN Learning Resource Exchange Metadata Application Profile [15]; and the POEM Pedagogy Oriented Educational Model [16]. Then, after a comparison of the draft vocabulary with the results of previous context analysis, a preliminary investigation on educational resources owned by content providers was carried out.
Subsequently, thanks to the constant exchange of ideas with learning resource experts, the terms were organized into a primitive taxonomy, using only hierarchical relationships. The first ontological model was based on two main categories, Audience and Educational features, the former intended to describe the hypothetical learning resource users and the latter to define the key educational characteristics of learning contents. Finally, after a new brainstorming session with project partners, a new ontological model was designed, identifying key concepts, sub-concepts and their relationships. The resulting ontology was represented in OWL Description Logics (OWL DL, a sub-language of OWL), using the Stanford University Protégé Ontology Editor (free software that provides a suite of tools to construct domain models and knowledge-based applications with ontologies) [17].
It is based on five classes: context (the principal environment within which the learning and use of the learning resource is intended to take place); objective (the cognitive learning outcomes based on Bloom's taxonomy) [18]; resource (the learning resource); resource feature (the key educational characteristics of the learning resource); and user (the main user for which the learning resource is designed). The class resource feature has five subclasses: difficulty level; fruition mode; fruition time; interactivity mode; and type. The latter represents the specific kind of educational resource (more than forty learning resource types were identified, such as animated cartoon; best practice; bibliography; comics; concept map; course; drawing book; educational game; lesson plan; paper; role play; tutorial; web quest, and so on). The class user has two subclasses: learner (one who works with a learning resource in order to learn something) and mediator (one who mediates access to the resource and for whom the resource is intended or useful, such as a teacher, a tutor, an instructional designer, or a parent) [2]. Finally, we created a glossary and added it to the ontology in order to give an instant explanation of each term included in the vocabulary, thus easing its use during the resource annotation task and making the model self-explanatory. At the same time, the ontology was translated from English into Dutch, French, Italian, Lithuanian, and Spanish.
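The class structure just described can be summarized in a small sketch. It is not the OWL DL model built in Protégé: it only mirrors the five classes and their subclasses as listed in the text, and the CamelCase identifiers and helper names are illustrative assumptions.

```python
# Child class -> parent class, following the textual description.
# Identifiers are illustrative spellings, not names taken from the OWL file.
SUBCLASS_OF = {
    "DifficultyLevel": "ResourceFeature",
    "FruitionMode": "ResourceFeature",
    "FruitionTime": "ResourceFeature",
    "InteractivityMode": "ResourceFeature",
    "Type": "ResourceFeature",
    "Learner": "User",
    "Mediator": "User",
}
TOP_CLASSES = {"Context", "Objective", "Resource", "ResourceFeature", "User"}


def ancestors(cls):
    """Return the chain of superclasses of cls, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def is_a(cls, candidate):
    """True if cls equals candidate or is a (transitive) subclass of it."""
    return cls == candidate or candidate in ancestors(cls)
```

A call such as `is_a("Mediator", "User")` then answers the kind of subsumption question that an OWL reasoner would answer over the real ontology.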
The educational ontology was designed and is now used to enrich the description of the contents included in the AquaRing knowledge base and to provide the following education-oriented services to the users:
• the semantic search engine: it is the core functionality for retrieving pedagogical resources within the AquaRing portal; it provides users with domain-related suggestions to improve their search and refines the search algorithms to obtain more meaningful results;
• the semantic content browser: it allows the user to navigate through the ontologies and the concepts they include, getting immediate feedback on how concepts are related and on how many contents are available for a specific topic;
• the educational tag cloud: it provides visual feedback on the instructional contents available for specific topics, according to the principle of "the larger the amount of related contents available, the bigger the font";
• the learning path viewer: it displays groups of digital resources in the form of a conceptual exhibition (the content providers can gather some of the available contents around a specific theme and create a learning path which guides users through a virtual tour, with additional descriptions that further enrich their experience).
iJET - Volume 5, Issue 1, March 2010
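The tag cloud principle listed among the services above can be sketched as follows. The text only states that more related contents means a bigger font. The linear interpolation, the point sizes, the function name and the content counts below are assumptions made for illustration.

```python
def tag_cloud_sizes(counts, min_pt=10, max_pt=32):
    """Map each tag to a font size that grows with its content count.

    Linear scaling between min_pt and max_pt is an assumption, not the
    portal's documented behaviour.
    """
    if not counts:
        return {}
    low, high = min(counts.values()), max(counts.values())
    span = high - low
    sizes = {}
    for tag, n in counts.items():
        # When all tags have the same count, show them all at full size.
        ratio = (n - low) / span if span else 1.0
        sizes[tag] = round(min_pt + ratio * (max_pt - min_pt))
    return sizes


# Hypothetical numbers of annotated resources per educational tag.
counts = {"LessonPlan": 40, "EducationalGame": 10, "Glossary": 25}
sizes = tag_cloud_sizes(counts)
```

With these made-up counts, the tag with the most resources is drawn at the maximum size, the one with the fewest at the minimum, and the third falls halfway between them.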
As previously said, whereas the other ontologies adopted refer directly to what the content is about (the aquatic domain), the educational ontology refers to the educational purposes of resources. From this point of view, it can support content-type oriented search: a teacher searching for information on e.g. "jellyfish" can in fact restrict the search by specifying that only Resources for PrimaryEducation students with an Objective focused on plain Knowledge and with a VeryEasy difficulty level must be extracted from the knowledge base.
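The "jellyfish" example can be sketched as a filter over annotated records. This is a plain in-memory illustration, not the portal's query mechanism: the `Annotation` type, the `search` function and the three sample records are hypothetical, and only the values PrimaryEducation, Knowledge and VeryEasy come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    title: str
    topics: frozenset   # concepts from the aquatic domain ontologies
    context: str        # individual of the Context class
    objective: str      # individual of the Objective class
    difficulty: str     # individual of the difficulty level subclass


def search(records, topic, **constraints):
    """Keep records about `topic` whose educational fields match every constraint."""
    return [
        r for r in records
        if topic in r.topics
        and all(getattr(r, field) == value
                for field, value in constraints.items())
    ]


# Made-up annotated resources.
records = [
    Annotation("Meet the jellyfish", frozenset({"jellyfish"}),
               "PrimaryEducation", "Knowledge", "VeryEasy"),
    Annotation("Cnidarian toxins", frozenset({"jellyfish"}),
               "HigherEducation", "Analysis", "Difficult"),
    Annotation("Coral reef colouring book", frozenset({"coral"}),
               "PrimaryEducation", "Knowledge", "VeryEasy"),
]
results = search(records, "jellyfish", context="PrimaryEducation",
                 objective="Knowledge", difficulty="VeryEasy")
```

Without the educational constraints the topic search returns both jellyfish records, whereas adding them leaves only the resource suited to primary school pupils.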
Regarding the technological features of the framework, the first releases were built on top of Sesame (RDF Schema Querying and Storage) [19] with support for RDF Schema inferencing and querying [20]. Performance problems and Sesame's limited SPARQL (RDF Query Language) [21] support forced us to port the core business logic interacting with the metadata and ontology repository to the Jena [22] API libraries by Hewlett-Packard Labs. As for the interaction and service layers, state-of-the-art Java-based web technologies (Java Server Pages/Servlet, Java Server Faces) have been used for information presentation, including AJAX (Asynchronous JavaScript and XML) and DHTML (Dynamic HTML) to support an advanced user interface for content retrieval and navigation.
The framework manages all domain ontologies adopted for content annotation through the Jena API and stores metadata in RDF format compliant with QDCMES, supporting multilingual annotation by means of a simple coupling mechanism between metadata instances and physical contents. Contents described by metadata are remotely stored on a distributed virtual content space (data access layer) including several independent HTTP (Hypertext Transfer Protocol) servers hosting dedicated file systems easily managed through FTP (File Transfer Protocol) facilities.

IV. LESSONS LEARNT AND FUTURE WORKS
In this paper, we have introduced AquaRing, an EC-funded project concerning aquatic environments and their resources, and the semantic approach used to support the annotation and retrieval of learning contents and to provide some educational services.
At this time, we are working on the improvement of the AquaRing web portal and services, which were launched in a beta version in March 2009. During the first year of the "project life", we will perform an analysis to evaluate the educational efficacy of the approach adopted, in terms of the pedagogical impact of both the semantic navigation and the retrieval functionalities. According to the data collected so far by means of questionnaires, semantics is perceived as an innovative approach that efficiently supports the user's experience by allowing searches and navigation to be refined and by suggesting alternative domain-related learning paths. The encouraging results obtained in this application scenario suggest that the overall approach can be applied in different cases with only minor contextual adaptation, offering interesting exploitation possibilities. In fact, it is worth noting that the design of the educational ontology is almost seamlessly applicable to other scenarios, as it does not contain any specific restriction on the type of resources that can be described using the concepts it specifies. In the case of the AquaRing project, we have used the ontology to describe generic educational contents used in the context of exhibitions on aquatic topics, but it could equally have been adopted to describe multimedia objects dealing with other scientific domains. An interesting issue for further research, currently under investigation, is the use of this ontology to provide other education-oriented services to the users. One example is the semi-automatic creation of learning paths from the annotated resources, driven by their educational and scientific annotations.
With reference to lessons learnt in pursuance of the project, we would like to focus our attention on the following issues: the design of digital libraries for children; the annotation process of educational resources in scientific contexts; and, finally, the multilingual and multicultural problems related to the use of ontologies.
With regard to the first issue, we would like to recall that, according to the results of the EU Kids Online project [23], the number of children and young people who use the Internet is rapidly growing: 50% of children (under eighteen years old) have used the Internet, rising from just 9% of those under six to one in three six-seven year olds, one in two eight-nine year olds and more than four in five teenagers aged twelve-seventeen. Despite that, the vast majority of digital library services and interfaces are targeted at adults or older students.
The AquaRing project is not intended mainly for children, and this is the reason why its services and interface were designed for a heterogeneous group of users, composed primarily of adults. However, it is clear that, with a view to future projects more oriented to young users and in order to ensure the usability of digital libraries for children, the educational services and the web interface should be consistent with their cognitive strategies and motor skills. Hutchinson et al. highlighted how many searching and browsing environments suffer from problems because they do not take into account the following elements [24]:
• the information processing and motor skills of children (specifically their difficulties using a mouse);
• children's searching and browsing skills (specifically their troubles with spelling, typing, navigating, and composing keyword queries);
• children's preferences about search-and-retrieve criteria.
The solutions proposed in the literature available on this topic are based on the engagement of young users as design partners; the use of ontologies more consistent with children's vocabulary; the use of a graphic interface to organize contents and to offer searching and browsing facilities; and, finally, the development of content organization models more coherent with young users' representation schemes.
Referring to the second issue, the annotation of educational resources in the context of AquaRing was performed by a group of annotators, experts in the aquatic domain, selected by scientific providers. However, this task requires a complex body of knowledge related to semantic indexing, scientific competence, and the pedagogical domain. Because of project constraints derived from AquaRing planning, it was not possible to train a group of annotators with all these competencies. For this reason, a glossary which explains all the educational ontology terms was written and distributed to the scientific content providers and a supporting service was offered to help annotators in the complex task of learning resources description.
Finally, with reference to the third issue, we would like to point out the difficulties related to the use of controlled vocabularies to index resources in the context of multilingual and multicultural projects. The educational ontology, as previously said, was translated into the different languages of the countries involved in order to offer a multilingual interface to users. However, based on our experience, a simple translation of the terms into the different languages is not enough to respect the semantics of the ontology, and a multicultural analysis of the vocabulary is essential. As an example, the class Context includes individuals like Pre School, Primary Education, First Grade Secondary Education, Second Grade Secondary Education, Higher Education, Lifelong Education, Vocational Training, and so on. It stands to reason that a literal translation of these terms might not be suitable to represent the different European educational systems, and therefore a vocabulary contextualization appears to be necessary in order to reflect multicultural differences.