Resource: CORDIAL-SIN – Treebank

Reference CORDIAL-SIN – Syntax-oriented Corpus of Portuguese Dialects – Treebank
Date of Submission Dec. 17, 2020, 2:18 p.m.
Status accepted
ISLRN 337-389-991-117-2
Resource Type Treebank
Media Type Text
Source
Language Portuguese
Format/MIME Type text
Size 93007 syntactic parse trees; 879561 tokens (with empty categories)
Access Medium Downloadable (open access)
Description

The CORDIAL-SIN–TreeBank is a collection of 93007 syntactic parse trees from the Syntax-oriented Corpus of Portuguese Dialects. CORDIAL-SIN is a corpus of spoken dialectal European Portuguese, developed at the Centro de Linguística da Universidade de Lisboa, that compiles excerpts of spontaneous and semi-directed speech selected from fieldwork interviews carried out in 42 locations within Portuguese territory (see also CORDIAL-SIN in this repository). The syntactic annotation follows the constituency-based system originally developed for the Penn Parsed Corpora of Historical English. It marks constituent boundaries, phrase and clause dependencies, categorial information, grammatical relations, discourse functions, sentence and clause types, some null constituents and certain transformational relations. The annotation guidelines for Portuguese were established in close cooperation with the Tycho Brahe Parsed Corpus of Historical Portuguese team (cf. Portuguese syntactic annotation manual). The annotation uses labeled bracketing representations and is fully searchable with search engines such as CorpusSearch.
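
As a rough illustration of what a labeled bracketing representation looks like and how it can be read programmatically, the following Python sketch parses one Penn-style bracketed tree into nested tuples. The sample tree, its labels (IP, NP, D, N, VB) and the function names are hypothetical and chosen only for illustration; they are not taken from the CORDIAL-SIN files or its annotation manual, and the sketch is not a substitute for CorpusSearch.

```python
import re


def parse_bracketed(text):
    """Parse one labeled-bracketing tree into nested (label, children) tuples.

    Assumption: the input is a single, well-formed tree in Penn-style
    bracketing, e.g. "(IP (NP (N gato)) (VB dorme))".
    """
    # Split into "(", ")" and runs of characters that are neither
    # whitespace nor parentheses.
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def parse_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1
        # A node may open without a label (a bare wrapper bracket).
        label = ""
        if tokens[pos] not in ("(", ")"):
            label = tokens[pos]
            pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(parse_node())
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1  # consume ")"
        return (label, children)

    tree = parse_node()
    if pos != len(tokens):
        raise ValueError("trailing material after the tree")
    return tree


def leaves(tree):
    """Return the words of the tree in left-to-right order."""
    _, children = tree
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


def count_label(tree, target):
    """Count the nodes whose label equals `target`."""
    label, children = tree
    total = 1 if label == target else 0
    for child in children:
        if isinstance(child, tuple):
            total += count_label(child, target)
    return total


# Hypothetical sample tree; the labels are illustrative only.
SAMPLE = "(IP (NP (D o) (N gato)) (VB dorme))"
tree = parse_bracketed(SAMPLE)
print(leaves(tree))
print(count_label(tree, "NP"))
```

The nested-tuple form keeps the sketch dependency-free. Real treebank files may contain extra conventions (identifiers, empty categories, wrapper brackets) that this sketch does not attempt to interpret.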

Version v.1
Creator Catarina Magro - Centro de Linguística da Universidade de Lisboa, Ernestina Carrilho - Centro de Linguística da Universidade de Lisboa, Ana Maria Martins - Centro de Linguística da Universidade de Lisboa
Distributor Catarina Magro - Centro de Linguística da Universidade de Lisboa, Ernestina Carrilho - Centro de Linguística da Universidade de Lisboa, Ana Maria Martins - Centro de Linguística da Universidade de Lisboa
Rights Holder Catarina Magro - Centro de Linguística da Universidade de Lisboa, Ernestina Carrilho - Centro de Linguística da Universidade de Lisboa, Ana Maria Martins - Centro de Linguística da Universidade de Lisboa
Relation superset of "CORDIAL-SIN – Syntax-oriented Corpus of Portuguese Dialects; ISLRN 144-935-399-699-8"