Czech WordNet

Full Official Name: Czech WordNet
Submission date: Jan. 24, 2014, 4:22 p.m.

The Czech WordNet was developed by the Centre of Natural Language Processing at the Faculty of Informatics, Masaryk University, Czech Republic. It covers nouns, verbs, adjectives, and, in part, adverbs, and contains 28,201 word senses (synsets).

Every synset encodes an equivalence relation among one or more literals. Each literal has a unique meaning (given in the SENSE tag value), and all literals in a synset belong to the same part of speech (given in the POS tag value) and express the same lexical meaning. Each Czech synset is linked to the corresponding synset in the Princeton WordNet 2.0 via its identification number (ID). Each synset also has at least one language-internal relation to another synset in the database.

The Czech WordNet is a language-internal structure, minimally containing:
• a set of variants or synonyms making up the synset;
• part of speech;
• language-internal relations to other synsets;
• a unique ID linking the synset to the Princeton WordNet 2.0.

It consists of:
• Number of Synsets: 28,201 (+258 Cze)
• Number of Literals: 43,958
• Domain-specific synsets: 27,897
• Lexico-semantic relations: 34,267
• Extralinguistic relations: 28,201

a) The Czech WordNet is distributed with:
• morpho-syntactic properties: surface and deep verb frames (824);
• examples accompanying the verb frames.

b) The Czech WordNet is distributed without:
• glosses;
• usage labels;
• examples.

VisDic and DEBVisDic, software tools for browsing and editing WordNet-like databases stored in XML format, are available at http://nlp.fi.muni.cz/projects/visdic and http://deb.fi.muni.cz/debvisdic, and can be downloaded in versions for MS Windows and Linux. Princeton WordNet 2.0 can also be downloaded from these pages in XML format.
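
The minimal synset structure described above (literals with a sense number, a part of speech, language-internal relations, and an ID linking to the Princeton WordNet 2.0) can be illustrated with a short Python sketch. The element layout below is an assumption: only the SENSE, POS, and ID names come from the description, while SYNSET, SYNONYM, LITERAL, ILR, and TYPE, as well as the sample IDs and the relation, are hypothetical placeholders and may differ from the distributed files.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical sample record. The IDs and the relation are made up for
# illustration; the element layout is an assumption, not the documented schema.
SAMPLE = """
<SYNSET>
  <ID>ENG20-00000001-n</ID>
  <POS>n</POS>
  <SYNONYM>
    <LITERAL>pes<SENSE>1</SENSE></LITERAL>
  </SYNONYM>
  <ILR>ENG20-00000002-n<TYPE>hypernym</TYPE></ILR>
</SYNSET>
"""


@dataclass
class Synset:
    synset_id: str   # ID linking to the Princeton WordNet 2.0 synset
    pos: str         # part of speech shared by all literals
    literals: list   # (literal, sense number) pairs
    relations: list  # (relation type, target synset ID) pairs


def parse_synset(xml_text: str) -> Synset:
    """Parse one synset record under the assumed element layout."""
    root = ET.fromstring(xml_text)
    literals = [
        ((lit.text or "").strip(), lit.findtext("SENSE", default="").strip())
        for lit in root.iter("LITERAL")
    ]
    relations = [
        (ilr.findtext("TYPE", default="").strip(), (ilr.text or "").strip())
        for ilr in root.iter("ILR")
    ]
    return Synset(
        synset_id=root.findtext("ID", default="").strip(),
        pos=root.findtext("POS", default="").strip(),
        literals=literals,
        relations=relations,
    )


def has_minimal_content(synset: Synset) -> bool:
    """Check the four minimal components listed in the description."""
    return bool(
        synset.literals and synset.pos and synset.relations and synset.synset_id
    )


synset = parse_synset(SAMPLE)
print(synset)
print(has_minimal_content(synset))
```

A check such as `has_minimal_content` could be run over every record in a file to confirm that each synset has at least one literal and at least one language-internal relation, as the description states.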

Creator(s)
Distributor(s)
Right Holder(s)