Resource: CORDIAL-SIN

Reference Syntax-oriented Corpus of Portuguese Dialects (CORDIAL-SIN)
Date of Submission Aug. 28, 2014, 4:37 p.m.
Status accepted
ISLRN 144-935-399-699-8
Resource Type Primary Text
Media Type Text, Audio
Source
Language Portuguese
Format/MIME Type text/txt, audio
Size 640,000 words / c. 70 hours
Access Medium Downloadable online.
Description

The CORDIAL-SIN is an annotated dialect corpus of European Portuguese. The materials for this corpus were drawn from recordings of dialect speech collected by CLUL during fieldwork interviews for linguistic atlases between 1974 and 2004. The informants were elderly, had little formal education, lived in rural areas, and were born and raised in the location of the interview. The CORDIAL-SIN compiles a geographically representative body of selected excerpts of spontaneous and semi-directed speech from these interviews. The corpus contains c. 640,000 words, collected from 42 locations within the continental territory of Portugal and the archipelagos of Madeira and the Azores.
The CORDIAL-SIN is available online and can be downloaded as written data in four formats: two kinds of orthographic transcript (differing in how much detail they mark up for spoken-language phenomena), a part-of-speech (PoS) tagged corpus, and a syntactically annotated corpus.

Version 2010
Creator Ernestina Carrilho, Ana Maria Martins - CLUL
Distributor Ana Maria Martins - CLUL
Rights Holder Ana Maria Martins - CLUL