The main aim of the ROTEL project is to design, implement and evaluate an intelligent, content-based platform that assists the knowledge engineer in constructing Semantic Web applications, generates documents complying with Semantic Web requirements, and allows the Romanian language to be used in the multilingual Web environment.



       Main objectives 

  • intelligent information extraction from Web sources (structured, semi-structured, or even plain text)

  • semantic integration and querying of disparate Web sources (which have not been developed with this interoperability requirement in mind)

  • using ontologies and rules for automated reasoning about the contents of the sources.

  • using and improving existing standards and technologies.

  • building language resources (LRs) that will facilitate effective content-based processing of documents. The language-specific resources and tools will be developed according to international best practices, for easy integration into a standardized multilingual processing environment;

  • developing content-based classification of documents according to domain-specific ontologies, and generating, for each classified document, metadata that conforms to the latest Semantic Web standards;

  • developing sophisticated services for natural language (Romanian) question answering with respect to the content of the documents, etc.



     Main features of the ROTEL system


Since building a complete Semantic Web (SW) application with current technology is extremely time-consuming and costly, the project aims at constructing an intelligent environment for developing SW applications. This environment will focus on using reasoning mechanisms for processing domain descriptions and will employ machine learning techniques to simplify most aspects of SW application development. The system will contain components for the design of SW applications, as well as an intelligent query answering component able to use domain knowledge for answering complex semantic queries.

The project aims at dealing with the main SW problems in a complete setting. The ROTEL project will demonstrate the advantages of combining domain knowledge (represented as formal ontologies following the principles supported by the Semantic Web Services Language Committee) and linguistic knowledge (represented through lexical ontologies and various language models). This demonstration will be based on Semantic Information Retrieval (SIR) and Knowledge Extraction from documents (KE).

Aspects related to application design

  • creating and updating lexical and domain ontologies

  • specifying rules for describing the operational semantics of the ontology

  • developing a system to generate semantic annotations for raw texts

  • developing a term extraction and thematic classification system based on a closed (but extensible) set of domains supported by local ontologies

  • developing an alignment system to disambiguate terms in a text and map them onto the elements of the supporting ontologies

  • developing a graphical interface for assisting the knowledge engineer in checking and correcting the term alignments

  • a graphical interface for assisting the knowledge engineer in the process of describing the mapping rules between local schemas of the sources and the domain ontology

  • a graphical interface to assist the knowledge engineer in the process of wrapper development

  • using annotation tools for HTML pages and Web services

  • using machine learning techniques for inducing knowledge about the schemas of the sources as well as about their content (types and domains, cardinality restrictions, statistics related to number of “records” and access times, semantic knowledge about the sources, their links, source overlaps, source completeness)

Intelligent query component

  • a mediator architecture using query planning for answering semantic queries
  • a query expansion module which would exploit the lexical semantic relations in the lexical ontologies (synonyms, meronyms, hyponyms or hyperonyms)
  • using knowledge about the contents and so-called “capabilities” of the sources for making query planning more efficient
  • treating the problem of Web service compositions within the framework of query planning
  • a graphical interface to assist the user in constructing queries
  • a natural language interface allowing the user to get answers to specific questions, expressed in Romanian, related to the documents in a specific thematic domain
  • a document summarisation system, providing abstracts of controlled length for the relevant documents
  • presenting the query results in a “browsable” format.
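
The query expansion module listed above can be illustrated with a minimal sketch. It is not the ROTEL implementation: the `LEXICON` dictionary is a hypothetical toy stand-in for a lexical ontology, the English entries are placeholders for Romanian ones, and the function name `expand_query` and its parameters are assumptions made for this example only.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical toy lexical ontology: term -> relation name -> related terms.
# A real system would read these relations from a lexical ontology instead.
LEXICON: Dict[str, Dict[str, List[str]]] = {
    "car": {
        "synonyms": ["automobile"],
        "hyponyms": ["taxi", "convertible"],
        "hypernyms": ["vehicle"],
        "meronyms": ["engine", "wheel"],
    },
    "engine": {"synonyms": ["motor"]},
}


def expand_query(
    terms: Iterable[str],
    relations: Iterable[str] = ("synonyms", "hyponyms"),
) -> List[str]:
    """Return the query terms plus related terms, without duplicates.

    Each original term is kept first, followed by the terms reached
    through the selected lexical relations, in the order given.
    """
    relations = tuple(relations)
    expanded: List[str] = []
    seen: Set[str] = set()
    for term in terms:
        candidates = [term]
        entry = LEXICON.get(term, {})  # unknown terms are kept unexpanded
        for relation in relations:
            candidates.extend(entry.get(relation, []))
        for candidate in candidates:
            if candidate not in seen:
                seen.add(candidate)
                expanded.append(candidate)
    return expanded


print(expand_query(["car", "engine"]))
# ['car', 'automobile', 'taxi', 'convertible', 'engine', 'motor']
```

Restricting the default relations to synonyms and hyponyms is a design choice of this sketch: those relations tend to keep the expanded query within the meaning of the original term, while hypernyms and meronyms broaden it and are left as an explicit option.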


 Copyright ICI(R).
Last updated: 01/30/07.