Full Official Name: German Political Speeches Corpus
This corpus is a collection of political speeches in German, crawled from the online archives of the German Presidency (Bundespräsident) and the German Chancellery (Bundesregierung). For the Presidency, the corpus contains 1,442 texts comprising 2,392,074 tokens, covering speeches from July 1, 1984 to February 17, 2012. For the Chancellery, the corpus contains 1,831 texts comprising 3,891,588 tokens, covering the period from December 11, 1998 to December 6, 2011. The corpus contains speeches by the Chancellor as well as by other politicians. It is released as Unicode-encoded XML; the files use their own DTD, inspired by the TEI guidelines. Tokenisation, POS tags and lemmas are included.
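
As a rough illustration of the corpus size, the figures quoted above can be combined into overall totals and average speech lengths. The following is a minimal sketch in Python; it uses only the text and token counts stated in this description, and the names `SUBCORPORA` and `summarize` are chosen here for illustration and are not part of the corpus distribution.

```python
# Text and token counts copied from the corpus description.
# The dictionary layout and names are illustrative assumptions.
SUBCORPORA = {
    "Presidency": {"texts": 1442, "tokens": 2392074},
    "Chancellery": {"texts": 1831, "tokens": 3891588},
}


def summarize(subcorpora):
    """Return overall totals and the mean number of tokens per text."""
    total_texts = sum(part["texts"] for part in subcorpora.values())
    total_tokens = sum(part["tokens"] for part in subcorpora.values())
    mean_tokens = {
        name: part["tokens"] / part["texts"]
        for name, part in subcorpora.items()
    }
    return {
        "total_texts": total_texts,
        "total_tokens": total_tokens,
        "mean_tokens": mean_tokens,
    }


summary = summarize(SUBCORPORA)
print("Texts in total: ", summary["total_texts"])
print("Tokens in total:", summary["total_tokens"])
for name, mean in summary["mean_tokens"].items():
    print(f"{name}: about {mean:.0f} tokens per text")
```

Taken together, the two parts amount to 3,273 texts and 6,283,662 tokens. The per-text averages are only indicative, since individual speeches may vary considerably in length.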

