Resource: Greek SpeechDat(II) FDB-5000

Reference Greek SpeechDat(II) FDB-5000
Date of Submission Jan. 24, 2014, 4:29 p.m.
Status accepted
ISLRN 421-018-015-047-2
Resource Type Primary Text
Media Type Audio
Source
Language Greek, Modern (1453-)
Description

The Greek SpeechDat(II) FDB-5000 database contains the recordings of 5,000 Greek speakers (2,405 males, 2,595 females) recorded over the Greek fixed telephone network. The FDB-5000 database is partitioned into 25 CDs in ISO 9660 format.

Speech samples are stored as sequences of 8-bit, 8 kHz A-law values. Each prompted utterance is stored in a separate file. Each signal file is accompanied by an ASCII SAM label file which contains the relevant descriptive information.
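As an illustration of the storage format described above, the following is a minimal Python sketch of how 8-bit A-law bytes can be expanded to 16-bit linear PCM using the standard ITU-T G.711 expansion. It assumes that a signal file is a headerless stream of A-law bytes sampled at 8 kHz; the function names are hypothetical and are not part of the database distribution.

```python
SAMPLE_RATE_HZ = 8000  # 8 kHz, as stated in the description


def alaw_to_linear(byte: int) -> int:
    """Expand one 8-bit A-law code to a signed 16-bit linear sample (G.711)."""
    a = byte ^ 0x55                # undo the even-bit inversion used by A-law
    t = (a & 0x0F) << 4            # 4-bit mantissa
    seg = (a & 0x70) >> 4          # 3-bit segment (exponent)
    if seg == 0:
        t += 8
    elif seg == 1:
        t += 0x108
    else:
        t = (t + 0x108) << (seg - 1)
    return t if a & 0x80 else -t   # sign bit set means positive


def decode_alaw(data: bytes) -> list[int]:
    """Decode a raw A-law byte string (assumed headerless) to linear samples."""
    return [alaw_to_linear(b) for b in data]


def duration_seconds(data: bytes) -> float:
    """Duration of a raw A-law byte string: one byte per sample at 8 kHz."""
    return len(data) / SAMPLE_RATE_HZ
```

Reading a file would then amount to passing its bytes to decode_alaw, for example decode_alaw(open(path, "rb").read()), where path is a placeholder for a signal file on one of the CDs.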

This speech database was validated by SPEX (the Netherlands) to assess its compliance with the SpeechDat format and content specifications.

Each speaker uttered the following items:

* 2 isolated digits
* 1 sequence of 10 isolated digits
* 7 connected digits (1 prompt sheet number of 5+ digits, 1 telephone number of 9/11 digits, 1 credit card number of 14/16 digits, 1 PIN code of 6 digits, 1 long number greater than 999999, 1 decimal number, 1 age)
* 3 dates (1 spontaneous date e.g. birthday, 1 word style prompted date, 1 relative and general date expression)
* 1 word spotting phrase using an embedded application word
* 3 application words
* 3 spelled words (1 spontaneous name e.g. own forename, 1 city name, 1 real/artificial word for coverage)
* 1 currency money amount
* 1 natural number
* 7 directory assistance names (1 name e.g. forename, 1 city of birth/growing up, 1 full name from a set of 150 SDB full names, 1 of the most frequent cities, 1 of the most frequent companies/agencies, 1 city/region of call, 1 profession)
* 4 yes/no questions
* 1 fuzzy yes/no question that could have either yes/no or something else as an answer
* 9 phonetically rich sentences
* 2 time phrases (1 spontaneous time of day, 1 word style time phrase)
* 4 isolated words
* 1 male/female
* 1 telephone model
* 1 environment of call
* 5 words broken into syllables

The following age distribution has been obtained: 512 speakers are under 16, 2,555 speakers are between 16 and 30, 1,199 speakers are between 31 and 45, 653 speakers are between 46 and 60, 74 speakers are over 60, and the age of 7 speakers is unknown.
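The age figures above can be checked against the stated total of 5,000 speakers. The short Python sketch below is only an illustrative consistency check: the counts are copied from this description, while the variable and function names are hypothetical.

```python
# Speaker counts per age band, copied from the description above.
AGE_COUNTS = {
    "under 16": 512,
    "16-30": 2555,
    "31-45": 1199,
    "46-60": 653,
    "over 60": 74,
    "unknown": 7,
}

# Speaker counts per gender, copied from the description above.
GENDER_COUNTS = {"male": 2405, "female": 2595}

TOTAL_SPEAKERS = 5000


def shares(counts: dict[str, int]) -> dict[str, float]:
    """Return each category's share of the total, in percent."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


if __name__ == "__main__":
    # Both breakdowns should account for every speaker.
    print(sum(AGE_COUNTS.values()) == TOTAL_SPEAKERS)
    print(sum(GENDER_COUNTS.values()) == TOTAL_SPEAKERS)
    for band, pct in shares(AGE_COUNTS).items():
        print(f"{band}: {pct:.2f}%")
```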

A pronunciation lexicon with a phonemic transcription in SAMPA is also included.

Version 1.0
Distributor ELRA