Resource: euLEX (Lexical Database for Basque)

Reference euLEX (Lexical Database for Basque)
Date of Submission Jan. 24, 2014, 4:29 p.m.
Status accepted
ISLRN 593-049-611-011-8
Resource Type Lexicon
Media Type Text
Source
Language Basque
Format/MIME Type Plain text
Size 33 MB
Description

euLEX is a general lexicon containing approximately 115,000 entries: 94,000 dictionary entries (lemmas), 12,000 allomorphs, 7,500 verb forms, and about 1,200 dependent morphemes. Every entry includes linguistic information such as morphology and usage.

The lexicon includes both general-purpose entries and terms (each with its corresponding thematic classification). Non-standard entries, marked as such by the Academy of the Basque Language, are also included and linked to the corresponding correct entry.

The lexicon is in XML format and is constantly updated following the latest normalization decisions from the Academy of the Basque Language.
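
As a rough illustration of how an XML lexicon with linked non-standard entries might be consumed, the following Python sketch loads a tiny sample and resolves a non-standard form to its correct entry. The element names, attribute names, and sample forms are hypothetical and chosen only for this example; they are not the actual euLEX schema or data.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the <entry>/<form>/<pos> elements, the "standard" and
# "correct" attributes, and the two forms are invented for illustration.
# They are NOT taken from the real euLEX schema or contents.
SAMPLE = """\
<lexicon>
  <entry id="e1" standard="true">
    <form>etxe</form>
    <pos>noun</pos>
  </entry>
  <entry id="e2" standard="false" correct="e1">
    <form>etche</form>
    <pos>noun</pos>
  </entry>
</lexicon>
"""


def load_entries(xml_text):
    """Parse the XML text into a dict keyed by entry id."""
    root = ET.fromstring(xml_text)
    entries = {}
    for el in root.iter("entry"):
        entries[el.get("id")] = {
            "form": el.findtext("form"),
            "pos": el.findtext("pos"),
            "standard": el.get("standard") == "true",
            "correct": el.get("correct"),  # id of the correct entry, if any
        }
    return entries


def resolve_standard(entries, form):
    """Return the standard form for `form`, or None if it cannot be resolved."""
    for entry in entries.values():
        if entry["form"] == form:
            if entry["standard"]:
                return entry["form"]
            # Follow the link from the non-standard entry to the correct one.
            target = entries.get(entry["correct"])
            return target["form"] if target else None
    return None


entries = load_entries(SAMPLE)
print(resolve_standard(entries, "etche"))
```

A linear scan is enough for a sketch; for a lexicon of this size, an index from form to entry id would be the more practical choice.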

Potential applications of this resource include PoS tagging, lemmatization, and term detection and correction.

This information serves as the basis for the automatic lemmatizer tLEMA and the morphological analyzer developed at UZEI, as well as for the term checker.

Version 1.0
Distributor ELRA