|Date of Submission||March 9, 2017, 6:02 p.m.|
|Resource Type||Primary Text|
|Size||244 documents, 8,941 sentences, 157,210 tokens|
TGermaCorp is a digital humanities resource built around German literary texts from the sixteenth century to the present. The primary texts are annotated on four levels. First, parts of speech are tagged according to STTS (the Stuttgart-Tübingen-TagSet). Second, each token is assigned its lemma. Third, proper names are classified according to the kind of referent (e.g., person or institution). Fourth, clauses, sentences, paragraphs, and headings are explicitly marked. A distinguishing characteristic of TGermaCorp is the composition of its primary sources: it is designed to capture the lexical and morpho-syntactic variety of written German as exhibited in German-language literature. TGermaCorp therefore targets applications and investigations within the Digital Humanities and is located in the low-resource intersection between computational linguistics and the study of literature.
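The four annotation levels described above can be pictured as a small data model. The following is only an illustrative sketch: this record does not state the distribution format of TGermaCorp, so the class names, field names, and the example sentence are assumptions made for illustration and are not taken from the corpus.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Token:
    """One token carrying the token-level annotations (hypothetical field names)."""
    form: str                         # surface form as it appears in the text
    pos: str                          # level 1: STTS part-of-speech tag
    lemma: str                        # level 2: lemma
    name_class: Optional[str] = None  # level 3: referent class, proper names only


@dataclass
class Sentence:
    """Level 4: an explicitly marked structural unit (sentence or heading)."""
    tokens: List[Token] = field(default_factory=list)
    is_heading: bool = False


def pos_counts(sentence: Sentence) -> Counter:
    """Count how often each STTS tag occurs in a sentence."""
    return Counter(token.pos for token in sentence.tokens)


# Invented example sentence, not drawn from the corpus.
example = Sentence(tokens=[
    Token("Goethe", "NE", "Goethe", name_class="person"),
    Token("schrieb", "VVFIN", "schreiben"),
    Token("Gedichte", "NN", "Gedicht"),
    Token(".", "$.", "."),
])

print(pos_counts(example))
```

Keeping the proper-name class optional reflects that only proper names receive a referent class, while every token carries a part-of-speech tag and a lemma.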
|Creator||Andy Lücking, Text Technology Lab, Goethe University Frankfurt|
|Distributor||Andy Lücking, Text Technology Lab, Goethe University Frankfurt|
|Rights Holder||Andy Lücking, Text Technology Lab, Goethe University Frankfurt|