Resource: British English SpeechDat(II) FDB-4000

Reference British English SpeechDat(II) FDB-4000
Date of Submission Jan. 24, 2014, 4:22 p.m.
Status accepted
ISLRN 575-262-304-348-7
Resource Type Primary Text
Media Type Audio
Source
Language English
Description

The British English SpeechDat(II) FDB-4000 database contains recordings of 4,000 British English speakers (1,968 males, 2,032 females) made over the British fixed telephone network. The database is distributed on 20 CDs.

Speech is stored as 8-bit A-law samples at a sampling rate of 8 kHz. Each prompted utterance is stored in a separate file. Each signal file is accompanied by an ASCII SAM label file which contains the relevant descriptive information.
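As an illustration of the storage format, the following minimal Python sketch expands 8-bit A-law bytes to 16-bit linear PCM using the standard G.711 expansion. It assumes the signal files are headerless raw A-law, which this record does not state explicitly; the function names are hypothetical and not part of any SpeechDat tooling.

```python
SAMPLE_RATE_HZ = 8000  # sampling rate given in the description


def alaw_to_linear(a_val: int) -> int:
    """Expand one 8-bit A-law code to a signed 16-bit linear sample (G.711)."""
    a_val ^= 0x55                      # undo the even-bit inversion
    t = (a_val & 0x0F) << 4            # 4-bit mantissa
    seg = (a_val & 0x70) >> 4          # 3-bit segment (exponent)
    if seg == 0:
        t += 8
    elif seg == 1:
        t += 0x108
    else:
        t = (t + 0x108) << (seg - 1)
    return t if a_val & 0x80 else -t   # sign bit set means positive


# Precompute all 256 codes once; decoding is then a table lookup.
_ALAW_TABLE = [alaw_to_linear(code) for code in range(256)]


def decode_alaw(data: bytes) -> list[int]:
    """Decode raw A-law bytes (assumed headerless) to linear samples."""
    return [_ALAW_TABLE[b] for b in data]


def duration_seconds(data: bytes) -> float:
    """One byte per sample at 8 kHz, so duration is len / 8000."""
    return len(data) / SAMPLE_RATE_HZ


if __name__ == "__main__":
    demo = bytes([0xD5, 0x55, 0xAA, 0x2A])  # synthetic bytes, not corpus data
    print(decode_alaw(demo), duration_seconds(demo))
```

With one byte per sample, a file's duration in seconds is simply its size in bytes divided by 8,000, provided the headerless assumption holds.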

This speech database was validated by SPEX (the Netherlands) to assess its compliance with the SpeechDat format and content specifications.

Each speaker uttered the following items:

* 1 isolated single digit
* 1 sequence of 10 isolated digits
* 4 connected digits (1 sheet number, 6 digits; 1 telephone number, 9/11 digits; 1 credit card number, 16 digits; 1 PIN code, 6 digits)
* 1 spontaneous phone number
* 1 currency amount
* 1 natural number
* 3 dates (1 spontaneous e.g. birthday, 1 prompted date, 1 relative or general date expression)
* 2 time phrases (1 spontaneous time of day, 1 word-style time phrase)
* 3 spelled words (1 spontaneous e.g. own forename, 1 city name, 1 real word for coverage)
* 5 directory assistance names (1 spontaneous e.g. own forename, 1 city of birth/growing up, 1 frequent city name, 1 frequent company name, 1 common forename and surname)
* 2 yes/no questions (1 predominantly "yes" question, 1 predominantly "no" question)
* 3 application words
* 1 keyword phrase using an embedded application word
* 4 phonetically rich words
* 9 phonetically rich sentences

The age distribution is as follows: 1,242 speakers are between 16 and 30, 1,321 speakers are between 31 and 45, 1,298 speakers are between 46 and 60, and the age of 139 speakers is unknown.
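The speaker counts quoted in this record can be cross-checked with a short Python sketch. The figures are copied from the description above; the variable and function names are illustrative only.

```python
TOTAL_SPEAKERS = 4000

# Counts as stated in the description.
gender_counts = {"male": 1968, "female": 2032}
age_counts = {"16-30": 1242, "31-45": 1321, "46-60": 1298, "unknown": 139}


def shares(counts: dict[str, int]) -> dict[str, float]:
    """Return each group's share of the total, in percent (unrounded)."""
    total = sum(counts.values())
    return {group: 100.0 * n / total for group, n in counts.items()}


# Both breakdowns should account for every speaker.
assert sum(gender_counts.values()) == TOTAL_SPEAKERS
assert sum(age_counts.values()) == TOTAL_SPEAKERS

if __name__ == "__main__":
    for group, pct in shares(age_counts).items():
        print(f"{group}: {pct:.1f}%")
```

Both breakdowns sum to the stated 4,000 speakers, so the gender and age figures are mutually consistent.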

A pronunciation lexicon with a phonemic transcription in SAMPA is also included.

Version 1.0
Distributor ELRA