Human Language Technology Group

The MultiWordNet project

  • The approach
  • The semi-automatic creation of MultiWordNet
  • What's in MultiWordNet
  • How MultiWordNet is used in Natural Language Processing applications

    The approach

    The MultiWordNet project aims to build a large-scale multilingual computational lexicon based on WordNet.

    WordNet is a lexical database, created at Princeton University, in which nouns, verbs, adjectives and adverbs are organized into sets of synonyms (synsets), each representing a lexical concept. Synsets are linked by various relations, both semantic and lexical. Semantic relations, e.g. hyponymy/hypernymy and meronymy, hold between synsets, while lexical relations, e.g. antonymy, connect individual words.
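
    The distinction between the two kinds of relation can be made concrete with a minimal sketch. The class name, the identifier scheme and the toy entries below are assumptions made for illustration; they are not the actual WordNet data format or content.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    """A set of synonymous words standing for one lexical concept."""
    synset_id: str                      # hypothetical identifier scheme
    pos: str                            # "n", "v", "a" or "r"
    words: list[str]
    # Semantic relations link whole synsets: relation name -> target synset ids.
    semantic: dict[str, set[str]] = field(default_factory=dict)


# Toy data for illustration only, not copied from the real database.
synsets = {
    "v0": Synset("v0", "v", ["move"]),
    "v1": Synset("v1", "v", ["rise", "ascend"]),
    "v2": Synset("v2", "v", ["fall", "descend"]),
}
synsets["v1"].semantic["hypernym"] = {"v0"}
synsets["v2"].semantic["hypernym"] = {"v0"}

# Lexical relations link individual words (word, synset id), not whole synsets.
antonyms = {
    (("rise", "v1"), ("fall", "v2")),
    (("ascend", "v1"), ("descend", "v2")),
}


def are_antonyms(word_a: str, word_b: str) -> bool:
    """True if the two words are directly linked by antonymy, in either order."""
    return any({word_a, word_b} == {a[0], b[0]} for a, b in antonyms)
```

    In this toy data, "rise" and "descend" belong to synsets whose members are opposed, yet `are_antonyms("rise", "descend")` is false, because antonymy is recorded between words rather than between synsets.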

    The model adopted within the MultiWordNet project stresses the usefulness of a strict alignment between the lexical databases (wordnets) of different languages, while retaining the ability to represent true lexical idiosyncrasies between languages. Language-specific wordnets are built so as to keep as many as possible of the semantic relations available in the Princeton WordNet (PWN). Whenever possible, the new synsets are built in correspondence with the PWN synsets, and semantic relations are imported from the corresponding English synsets. In other words, we assume that if a relation holds between two synsets in PWN, the same relation holds between the corresponding synsets in the new language.
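
    The relation-import assumption can be sketched in a few lines. This is an illustration under stated assumptions, not the project's implementation: the function name `import_relations`, the triple representation of relations and the `en:`/`it:` identifiers are all hypothetical.

```python
def import_relations(pwn_relations, alignment):
    """Project PWN semantic relations onto the aligned synsets of a new language.

    pwn_relations: iterable of (source_id, relation, target_id) triples.
    alignment:     dict mapping a PWN synset id to the corresponding synset id
                   in the new language (hypothetical id scheme).
    A relation is imported only when both of its endpoints have a counterpart.
    """
    imported = set()
    for source, relation, target in pwn_relations:
        if source in alignment and target in alignment:
            imported.add((alignment[source], relation, alignment[target]))
    return imported


# Illustrative toy data, not taken from the actual databases.
pwn_relations = {
    ("en:dog", "hypernym", "en:canine"),
    ("en:canine", "hypernym", "en:carnivore"),
}
alignment = {"en:dog": "it:cane", "en:canine": "it:canide"}

italian_relations = import_relations(pwn_relations, alignment)
```

    Here only the first relation is carried over, since `en:carnivore` has no aligned synset in the toy alignment.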

    A possible risk related to the MultiWordNet approach is that of forcing the new wordnets to depend on the lexical and conceptual structures of the English language. However, this risk can be avoided by allowing the new wordnet to diverge, when necessary, from PWN.

    Two major idiosyncrasies can occur: lexical gaps (one language expresses with a single lexical unit what the other language expresses with a free combination of words) and denotation differences (a translation equivalent exists in the target language, but it is more general or more specific). In both cases, a lexical concept of one language has no synonymous correspondent in the other language. The MultiWordNet architecture deals with these cases by creating special empty nodes wherever such a correspondent is missing.
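
    A minimal sketch of the empty-node idea follows. The `Node` class, the `build_target_nodes` helper and the abstract identifiers `c1` to `c3` are assumptions introduced for illustration; the actual MultiWordNet representation may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A synset position in the target wordnet, possibly with no lemmas."""
    synset_id: str
    lemmas: tuple[str, ...]

    @property
    def is_empty(self) -> bool:
        # An empty node marks a concept with no synonymous correspondent.
        return not self.lemmas


def build_target_nodes(pwn_ids, target_lemmas):
    """Create one target node per PWN synset, empty where no lemma exists."""
    return {sid: Node(sid, tuple(target_lemmas.get(sid, ()))) for sid in pwn_ids}


# Hypothetical hypernym chain c3 -> c2 -> c1, where c2 has no target-language word.
pwn_ids = ["c1", "c2", "c3"]
hypernym = {"c3": "c2", "c2": "c1"}
target_lemmas = {"c1": ["lemma_a"], "c3": ["lemma_c"]}

nodes = build_target_nodes(pwn_ids, target_lemmas)


def nearest_lexicalized_hypernym(synset_id):
    """Walk up the chain, skipping empty nodes, to the first node with lemmas."""
    current = hypernym.get(synset_id)
    while current is not None and nodes[current].is_empty:
        current = hypernym.get(current)
    return current
```

    Because the empty node `c2` keeps its place in the hierarchy, the chain from `c3` to `c1` stays aligned with the source wordnet, and a consumer that needs an actual word can skip over the gap.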

  • MultiWordNet ® - All rights reserved. Maintainer: Girardi C.