ISLRN

TGermaCorp

Full Official Name: TGermaCorp

Submission date: March 9, 2017, 6:02 p.m.

TGermaCorp is a digital humanities resource built around German literary texts from the sixteenth century to the present. The primary texts are annotated on four levels. First, parts of speech are tagged according to STTS. Second, each token is assigned its lemma. Third, proper names are classified by the type of their referent (e.g., person or institution). Fourth, clauses, sentences, paragraphs, and headings are explicitly marked. A distinguishing characteristic of TGermaCorp is the composition of its primary sources: the corpus is designed to capture the lexical and morpho-syntactic variety of written German as exhibited in German-language literature. TGermaCorp therefore targets applications and investigations within Digital Humanities, and sits in the low-resource area where computational linguistics and literary studies intersect.
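The four annotation levels described above can be pictured as a small data model. The following is a minimal sketch only: this record does not specify the corpus file format, so the class names, field names, and the example sentence are hypothetical and serve purely as illustration. The tags NE, VVFIN, and NN are STTS tags.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Token:
    """One token with the three token-level annotations (hypothetical layout)."""
    form: str                          # surface form as it appears in the text
    pos: str                           # level 1: STTS part-of-speech tag
    lemma: str                         # level 2: lemma
    entity_type: Optional[str] = None  # level 3: referent type for proper names


@dataclass
class Sentence:
    """Level 4 (structure): a sentence as an ordered list of tokens."""
    tokens: List[Token] = field(default_factory=list)


def lemmas_with_tag(sentence: Sentence, tag: str) -> List[str]:
    """Return the lemmas of all tokens carrying the given STTS tag."""
    return [t.lemma for t in sentence.tokens if t.pos == tag]


# Illustrative sentence, not taken from the corpus: "Goethe schrieb Briefe"
sentence = Sentence(tokens=[
    Token(form="Goethe", pos="NE", lemma="Goethe", entity_type="person"),
    Token(form="schrieb", pos="VVFIN", lemma="schreiben"),
    Token(form="Briefe", pos="NN", lemma="Brief"),
])

proper_name_lemmas = lemmas_with_tag(sentence, "NE")
```

Keeping the proper-name class optional reflects that only proper names receive a referent type, while every token carries a tag and a lemma.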

Creator(s)

Text Technology Lab, Goethe University Frankfurt

Distributor(s)

Text Technology Lab, Goethe University Frankfurt

Right Holder(s)

Text Technology Lab, Goethe University Frankfurt

Alexander Mehler

Status: Accepted

ISLRN:

536-382-801-278-5

Version

v0.2

Source

https://hucompute.org/applications/corpora/

Resource Type

Primary Text

Media Type

Text

Language(s)

German

Access Medium

Zip