Resource: Arbobanko (Esperanto Treebank)

Reference Arbobanko (Esperanto Treebank)
Date of Submission Nov. 18, 2019, 2:44 p.m.
Status accepted
ISLRN 185-602-618-699-2
Resource Type Primary Text
Media Type Text
Source
Language Esperanto
Format/MIME Type text/sgml, text/xml, text/tab-separated-values
Size 52000 tokens
Access Medium downloadable
Description

The Arbobanko (Esperanto Treebank) is a 52,000-token dependency treebank of Esperanto built from random excerpts of the MONATO news magazine published between 2000 and 2010. Every word is annotated for lemma, part of speech, inflection, compounding and affixation, syntactic function, dependency links, and NER types; nouns and adjectives also carry semantic types, and verbs carry frame categories.

Morphosyntactic and dependency annotation was performed with the EspGram parser and then manually revised. Semantic categories were added in a second round of annotation and were likewise manually revised and disambiguated.

The treebank is distributed in three formats: native Constraint Grammar SGML with token-based tag lines, XML with feature-attribute pairs, and CoNLL tab-separated format.
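
As a rough illustration of working with the tab-separated variant, the following Python sketch reads CoNLL-style text into sentences and lists dependency links. It is only a sketch under stated assumptions: this record does not specify the column layout, so the column names in COLUMNS follow the common CoNLL-X ordering and may not match the actual files. The sample sentence and its tags are invented for the example and are not taken from the treebank.

```python
from typing import Dict, List, Tuple

# ASSUMPTION: CoNLL-X style column order. The real Arbobanko layout is not
# documented in this record and may differ.
COLUMNS = ["id", "form", "lemma", "cpostag", "postag", "feats", "head", "deprel"]

Token = Dict[str, str]


def read_conll(text: str) -> List[List[Token]]:
    """Split tab-separated text into sentences (blank-line delimited)."""
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            # Skip comment lines, if the file has any.
            continue
        fields = line.split("\t")
        current.append(dict(zip(COLUMNS, fields)))
    if current:
        sentences.append(current)
    return sentences


def dependency_pairs(sentence: List[Token]) -> List[Tuple[str, str, str]]:
    """Return (dependent form, relation label, head form) for each token."""
    forms_by_id = {tok["id"]: tok["form"] for tok in sentence}
    pairs = []
    for tok in sentence:
        # A head id with no matching token (e.g. "0") is treated as the root.
        head_form = forms_by_id.get(tok["head"], "ROOT")
        pairs.append((tok["form"], tok["deprel"], head_form))
    return pairs


# Hypothetical sample, NOT from the treebank; tags are placeholders.
SAMPLE = (
    "1\tLa\tla\tART\tART\t_\t2\tDN\n"
    "2\thundo\thundo\tN\tN\tS|NOM\t3\tSUBJ\n"
    "3\tdormas\tdormi\tV\tV\tPR\t0\tROOT\n"
)

if __name__ == "__main__":
    for sent in read_conll(SAMPLE):
        for dependent, relation, head in dependency_pairs(sent):
            print(f"{dependent} --{relation}--> {head}")
```

Before using this on the real data, compare COLUMNS against the header or documentation shipped with the distribution, since the additional annotation layers (NER types, semantic types, verb frames) may occupy extra or differently ordered columns.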

Version 1.0
Distributor ELRA