Resource: PAROLE Irish Distributable Corpus

Reference PAROLE Irish Distributable Corpus
Date of Submission Jan. 24, 2014, 4:30 p.m.
Status accepted
ISLRN 628-474-388-133-4
Resource Type Primary Text
Media Type Text
Source
Language Irish
Description

The PAROLE Irish Distributable Corpus consists of over 8 million words, a subset of the Irish Reference corpus of more than 15 million words.

The text is marked up in accordance with the PAROLE encoding standard, which incorporates the Corpus Encoding Standard (CES) and the Text Encoding Initiative (TEI) Guidelines. All files are in SGML format, with a detailed header and the body of the text tagged to paragraph level. The header includes information such as title, author(s), number of words, ownership and publication details, as well as standard codes for the Medium, Topic and Genre categories.
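
As a rough illustration of what paragraph-level tagging allows, the sketch below pulls the text of each paragraph out of an SGML-like string. It rests on assumptions not stated in this record: that paragraphs are marked with a `p` element (as in CES and TEI), and that Python's lenient `html.parser` tokenizer is adequate for the files. It is not a validating SGML parser, and the actual PAROLE element names should be checked against the distributed DTD. The names `ParagraphCollector` and `extract_paragraphs` are hypothetical.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collect the text of each <p> element from SGML-like markup.

    Assumption: paragraphs are tagged <p>; tag names are matched
    case-insensitively because HTMLParser lowercases them.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.paragraphs = []
        self._buffer = None  # None while outside a paragraph

    def _flush(self):
        # Store the paragraph collected so far, with whitespace normalized.
        if self._buffer is not None:
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
        self._buffer = None

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()  # SGML permits omitted end tags
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def close(self):
        super().close()
        self._flush()  # keep a final paragraph that was never closed


def extract_paragraphs(sgml_text):
    """Return the list of paragraph strings found in sgml_text."""
    parser = ParagraphCollector()
    parser.feed(sgml_text)
    parser.close()
    return parser.paragraphs
```

Calling `extract_paragraphs` on the contents of one corpus file would return one string per tagged paragraph, with header text left out as long as the header contains no `p` elements.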

A subset of the Distributable Corpus is morpho-syntactically tagged.

This distribution includes approximately 3,000 manually checked words.

Version 1.0
Distributor ELRA