Danish Propbank

Full Official Name: Danish Propbank
Submission date: April 6, 2017, 3:54 p.m.

The Danish Propbank (DPB) is a multi-layer treebank annotated with both morphosyntactic and semantic information, in particular propositions/frames with VerbNet classes and semantic roles for both arguments and satellites. The corpus is also annotated with 20 Named Entity classes and a 200-category semantic ontology for nouns. The texts are taken from Korpus 2010, compiled by the Society for Danish Language and Literature (http://korpus.dsl.dk/resources.html), and comprise samples of written Danish from a variety of formal and informal sources, such as newspapers, magazines, blogs, chat fora and parliamentary debates. The treebank consists of about 87,000 tokens, with over 12,000 frames and 32,000 role instances. It can be regarded as a semantic sister treebank to the older Arboretum treebank (see ELRA-W0084). The two data sets also complement each other with regard to time periods and text types, together covering three decades of Danish text.

