MultiWordNet is a multilingual lexical database that includes information about English and Italian words. It is an extension of WordNet 1.6, a lexical database for English developed at Princeton University. MultiWordNet contains information about the following aspects of the English and Italian lexicons:
- lexical relations between words
- semantic relations between lexical concepts
- correspondences between Italian and English lexical concepts
- semantic fields
The basic lexical relation in MultiWordNet is synonymy. Groups of synonyms identify lexical concepts, which are also called synsets. The synset is the central unit of MultiWordNet: semantic fields, semantic relations and the correspondences between Italian and English are all attached to synsets rather than to individual words.
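As a rough illustration of how synsets tie the pieces together, the following Python sketch models a synset as a record holding its English and Italian synonyms, its semantic field labels and its links to other synsets. It does not reflect the actual MultiWordNet data format or API: the class name, the field names, the identifiers and the sample entries are all assumptions invented for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Synset:
    """A lexical concept identified by a group of synonyms (hypothetical layout)."""
    synset_id: str                      # invented identifier, not a real MultiWordNet ID
    english: List[str]                  # English synonyms
    italian: List[str]                  # Italian synonyms for the same concept
    semantic_fields: List[str] = field(default_factory=list)
    hypernyms: List[str] = field(default_factory=list)  # IDs of more general synsets


def build_index(synsets: List[Synset]) -> Dict[Tuple[str, str], List[str]]:
    """Map (language, word) to the IDs of the synsets that contain the word."""
    index: Dict[Tuple[str, str], List[str]] = {}
    for s in synsets:
        for lang, words in (("en", s.english), ("it", s.italian)):
            for w in words:
                index.setdefault((lang, w), []).append(s.synset_id)
    return index


# Toy entries made up for illustration only.
animal = Synset("toy-002", ["animal", "beast"], ["animale", "bestia"], ["zoology"])
dog = Synset("toy-001", ["dog"], ["cane"], ["zoology"], hypernyms=["toy-002"])

by_id = {s.synset_id: s for s in (animal, dog)}
index = build_index([animal, dog])

# Going from an Italian word to its English equivalents passes through the synset.
english_for_cane = [w for sid in index[("it", "cane")] for w in by_id[sid].english]
```

Because every piece of information hangs off the synset, a word lookup in either language leads to the same record, which is what makes the Italian and English correspondences straightforward to follow.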
MultiWordNet can be used for a variety of NLP tasks including:
- Information Retrieval: synonymy relations are used for query expansion to improve recall; cross-language correspondences between Italian and English synsets are used for Cross-Language Information Retrieval.
- Semantic tagging: MultiWordNet constitutes a large-coverage sense inventory which is the basis for semantic tagging, i.e. texts are tagged with synset identifiers.
- Disambiguation: semantic relations are used to measure the semantic distance between words, which can be used to disambiguate the meaning of words in texts. Semantic fields have also proved to be very useful for the disambiguation task.
- Ontologies: MultiWordNet can be seen as an ontology to be used for a variety of knowledge-based NLP tasks.
- Terminologies: MultiWordNet constitutes a robust framework supporting the development of specific structured terminologies.
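Two of the tasks listed above, query expansion and semantic distance, can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the toy synset table, its identifiers and the function names are invented for illustration and are not part of MultiWordNet, and counting hypernym links is just one simple way to measure semantic distance, not necessarily the measure used with the database.

```python
from collections import deque

# Toy data invented for this example; not real MultiWordNet entries or IDs.
SYNSETS = {
    "s1": {"en": ["car", "automobile"], "it": ["automobile", "macchina"], "hypernym": "s3"},
    "s2": {"en": ["truck"], "it": ["camion"], "hypernym": "s3"},
    "s3": {"en": ["motor vehicle"], "it": ["autoveicolo"], "hypernym": "s4"},
    "s4": {"en": ["vehicle"], "it": ["veicolo"], "hypernym": None},
    "s5": {"en": ["bicycle", "bike"], "it": ["bicicletta"], "hypernym": "s4"},
}


def synsets_of(word, lang):
    """Return the IDs of all synsets that contain `word` in language `lang`."""
    return [sid for sid, s in SYNSETS.items() if word in s[lang]]


def expand_query(word, source_lang, target_lang):
    """Expand a query term with the synonyms of its synsets.

    With source_lang == target_lang this is monolingual query expansion;
    with different languages it follows the cross-language correspondences.
    """
    expanded = []
    for sid in synsets_of(word, source_lang):
        for w in SYNSETS[sid][target_lang]:
            if w not in expanded:
                expanded.append(w)
    return expanded


def path_distance(a, b):
    """Number of hypernym links on the shortest path between two synsets."""
    # Treat hypernym links as undirected edges, then run a breadth-first search.
    neighbours = {sid: set() for sid in SYNSETS}
    for sid, s in SYNSETS.items():
        if s["hypernym"] is not None:
            neighbours[sid].add(s["hypernym"])
            neighbours[s["hypernym"]].add(sid)
    queue = deque([(a, 0)])
    seen = {a}
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for n in neighbours[node]:
            if n not in seen:
                seen.add(n)
                queue.append((n, dist + 1))
    return None  # the two synsets are not connected


print(expand_query("car", "en", "en"))  # monolingual expansion
print(expand_query("car", "en", "it"))  # cross-language expansion
print(path_distance("s1", "s2"), path_distance("s1", "s5"))
```

In this toy table "car" is closer to "truck" than to "bicycle", so a disambiguation procedure based on path length would prefer senses that keep the words of a text close together in the hierarchy.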
Release 1.1 of MultiWordNet is currently available. It includes information about 51,000 Italian word meanings and 28,000 synsets (in correspondence with their English equivalents). It also labels most WordNet 1.6 synsets with semantic fields.
Work on MultiWordNet is ongoing. The next release will contain at least 10,000 new word meanings.
The data are held in a specialized database server, which clients access through a socket connection. The database server has been implemented in Lisp and runs under both Unix and Windows. An application programming interface and a graphical browsing interface are provided with the database. A Java implementation of the database is planned for the next release.
For more information, visit: http://multiwordnet.itc.it