Resource: Basque FDB-1060 database (SpeechDat-like)

Reference: Basque FDB-1060 database (SpeechDat-like)
Date of Submission: Jan. 24, 2014, 4:17 p.m.
Status: accepted
ISLRN: 087-294-618-123-6
Resource Type: Primary Text
Media Type: Audio
Source:
Language: Basque
Description:

The Basque FDB-1060 database contains recordings of 1,060 Basque speakers (480 male, 580 female) made over the fixed telephone network. The database is distributed on 4 CDs and complies with the common specifications created in the SpeechDat project.

Speech samples are stored as sequences of 8-bit A-law samples at a sampling rate of 8 kHz. Each prompted utterance is stored in a separate file. Each signal file is accompanied by an ASCII SAM label file that contains the relevant descriptive information.
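
As an illustration of the sample format described above, the following is a minimal sketch of decoding 8-bit A-law bytes to 16-bit linear PCM using the standard G.711 expansion. It assumes the signal files are raw, headerless A-law data (the record does not say so explicitly); the function names and the file paths in the usage comment are hypothetical and are not taken from the database documentation.

```python
import struct
import wave


def alaw_to_linear(byte: int) -> int:
    """Expand one 8-bit A-law code word to a signed 16-bit linear sample (G.711)."""
    a = byte ^ 0x55              # undo the even-bit inversion applied by the encoder
    t = (a & 0x0F) << 4          # 4-bit mantissa
    seg = (a & 0x70) >> 4        # 3-bit segment (exponent)
    if seg == 0:
        t += 8
    else:
        t += 0x108
        if seg > 1:
            t <<= seg - 1
    # In A-law, a set sign bit marks a positive sample.
    return t if a & 0x80 else -t


def decode_alaw(data: bytes) -> list[int]:
    """Decode a whole buffer of A-law bytes."""
    return [alaw_to_linear(b) for b in data]


def alaw_file_to_wav(src_path: str, dst_path: str, rate: int = 8000) -> None:
    """Convert a raw A-law file (assumed headerless, mono) to a 16-bit PCM WAV file."""
    with open(src_path, "rb") as f:
        samples = decode_alaw(f.read())
    with wave.open(dst_path, "wb") as w:
        w.setnchannels(1)        # assumption: single-channel telephone speech
        w.setsampwidth(2)        # 16-bit linear output
        w.setframerate(rate)     # 8 kHz, as stated in the description
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))


# Hypothetical usage; the file names are placeholders:
# alaw_file_to_wav("utterance.alaw", "utterance.wav")
```

If the files turn out to carry a header, the header bytes would need to be skipped before decoding; the accompanying SAM label file is the place to check for such details.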

Each speaker uttered the following items:

* 6 common application words
* 1 sequence of isolated digits
* 4 digit strings: prompt sheet number, telephone number, credit card number, PIN code
* 3 dates: spontaneous date (birth date), prompted date (word style), relative and general date expression
* 1 application word phrase
* 1 isolated digit
* 3 spelled words: spontaneous spelling of own forename, spelled directory city name, spelled artificial words
* 1 money amount
* 1 natural number
* 5 directory assistance names: forename (spontaneous), city of origin (spontaneous), country name (most frequent city), most frequent company/agency name, forename & surname
* 2 spontaneous yes/no questions
* 9 phonetically rich sentences
* 2 time phrases: time of day (spontaneous), time phrase
* 4 phonetically rich words
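
The item counts in the list above can be tallied with a short sketch like the one below. The per-category counts are copied from the list; the file total is only a nominal figure, assuming every speaker produced every item and that each item yields exactly one signal file, which the record does not guarantee.

```python
# Item counts per speaker, copied from the list above.
ITEMS_PER_SPEAKER = {
    "common application words": 6,
    "sequence of isolated digits": 1,
    "digit strings": 4,
    "dates": 3,
    "application word phrase": 1,
    "isolated digit": 1,
    "spelled words": 3,
    "money amount": 1,
    "natural number": 1,
    "directory assistance names": 5,
    "spontaneous yes/no questions": 2,
    "phonetically rich sentences": 9,
    "time phrases": 2,
    "phonetically rich words": 4,
}

SPEAKERS = 1060  # number of speakers stated in the description

items_per_speaker = sum(ITEMS_PER_SPEAKER.values())

# Nominal count only: assumes no missing or rejected utterances.
nominal_files = items_per_speaker * SPEAKERS

print(f"items per speaker: {items_per_speaker}")
print(f"nominal signal files: {nominal_files}")
```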

The age distribution is as follows: 8 speakers are under 16, 474 are between 16 and 30, 320 are between 31 and 45, 236 are between 46 and 60, and 13 are over 60. The age of the remaining 9 speakers was not determined.
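
As a quick consistency check, the sketch below confirms that the age groups and the gender split each add up to the stated 1,060 speakers, and prints the share of each age group. The figures are taken from this record; the dictionary keys are illustrative labels only.

```python
TOTAL_SPEAKERS = 1060

# Figures as stated in the description.
GENDER = {"male": 480, "female": 580}
AGE = {
    "under 16": 8,
    "16-30": 474,
    "31-45": 320,
    "46-60": 236,
    "over 60": 13,
    "not determined": 9,
}

# Both breakdowns should cover every speaker exactly once.
gender_total = sum(GENDER.values())
age_total = sum(AGE.values())
assert gender_total == TOTAL_SPEAKERS
assert age_total == TOTAL_SPEAKERS

# Share of each age group, as a percentage of all speakers.
for group, count in AGE.items():
    print(f"{group:>15}: {count:4d} ({100 * count / TOTAL_SPEAKERS:.1f}%)")
```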

A pronunciation lexicon with a phonemic transcription in SAMPA is also included.

Version: 1.0
Distributor: ELRA