Resource: Danish SpeechDat(M) database - DB2
|Reference||Danish SpeechDat(M) database - DB2|
|Date of Submission||Jan. 24, 2014, 4:22 p.m.|
|Resource Type||Primary Text|
The (polyphone-like) Danish SpeechDat(M) database contains the recordings of 1,523 Danish speakers from 11 regions.
Speech samples are stored as sequences of 8-bit A-law values sampled at 8 kHz. Each prompted utterance is stored in a separate file, and the associated label files are stored in SAM file format.
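As an illustration of the storage format, the following minimal sketch expands 8-bit A-law codes to 16-bit linear PCM using the standard ITU-T G.711 expansion. It assumes the signal files are raw, headerless byte streams; the description does not confirm this. The function names are hypothetical and not part of any tool shipped with the database.

```python
SAMPLE_RATE_HZ = 8000  # sampling rate stated in the database description


def alaw_byte_to_pcm16(code: int) -> int:
    """Expand one 8-bit A-law code to a 16-bit linear PCM value (ITU-T G.711)."""
    code ^= 0x55                      # undo the alternate-bit inversion
    mantissa = (code & 0x0F) << 4     # 4-bit mantissa, shifted into place
    segment = (code & 0x70) >> 4      # 3-bit segment (exponent)
    if segment == 0:
        magnitude = mantissa + 8
    else:
        magnitude = (mantissa + 0x108) << (segment - 1)
    # In A-law the sign bit is 1 for positive samples.
    return magnitude if code & 0x80 else -magnitude


def decode_alaw(raw: bytes) -> list[int]:
    """Decode a raw A-law byte string (assumed headerless) to PCM samples."""
    return [alaw_byte_to_pcm16(b) for b in raw]


def duration_seconds(raw: bytes) -> float:
    """One byte per sample, so duration is byte count over sample rate."""
    return len(raw) / SAMPLE_RATE_HZ


# Four hand-picked A-law codes: smallest and largest magnitudes of each sign.
demo = bytes([0xD5, 0x55, 0xAA, 0x2A])
print(decode_alaw(demo))  # [8, -8, 32256, -32256]
```

At one byte per sample, one second of speech occupies 8,000 bytes, so `duration_seconds` reduces to a division by the sampling rate.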
Each signal file is accompanied by an ASCII SAM label file which contains the relevant descriptive information. The database was validated by SPEX (the Netherlands) to assess its compliance with the SpeechDat format and content specifications.
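A label file of this kind can be read with a few lines of code. The sketch below assumes that each line has the form `MNEMONIC: value`, which is the usual layout of SAM label files, and it collects repeated mnemonics into lists. The sample text is invented for illustration and is not taken from the database; the function name is hypothetical.

```python
def parse_sam_label(text: str) -> dict[str, list[str]]:
    """Parse 'MNEMONIC: value' lines into a dict of value lists.

    Assumes one field per line, split at the first colon; lines without
    a colon are skipped.
    """
    fields: dict[str, list[str]] = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields.setdefault(key.strip(), []).append(value.strip())
    return fields


# Invented sample, for illustration only (not a real file from the corpus).
sample = (
    "LHD: SAM, 6.0\n"
    "SAM: 8000\n"
    "SSB: 8\n"
    "QNT: A-LAW\n"
    "LBO: 0,12159,24319,nine\n"
)

label = parse_sam_label(sample)
print(label["SAM"], label["QNT"])
```

Keeping every value in a list avoids silently dropping a field if a mnemonic occurs more than once in a file.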
The lexicon is provided as a tab-delimited ASCII file containing an alphabetically ordered list of the distinct lexical items occurring in the database. Each entry contains a frequency count and the corresponding pronunciation information.
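A tab-delimited lexicon of this shape is straightforward to load. The sketch below assumes the column order is lexical item, frequency count, then one or more pronunciation fields; the description does not state the order, so this should be checked against the actual file. The entries in the sample are invented placeholders, not real lexicon content.

```python
def parse_lexicon(text: str) -> dict[str, tuple[int, list[str]]]:
    """Map each lexical item to (frequency count, list of pronunciations).

    Assumed column order: item <TAB> count <TAB> pronunciation [<TAB> ...].
    Lines with fewer than three fields are skipped.
    """
    entries: dict[str, tuple[int, list[str]]] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        parts = line.split("\t")
        if len(parts) < 3:
            continue
        word, count, *pronunciations = parts
        entries[word] = (int(count), pronunciations)
    return entries


# Invented placeholder entries, for illustration only.
sample = "ja\t12\tj a\nnej\t7\tn a j\n"

lexicon = parse_lexicon(sample)
print(lexicon["ja"])  # (12, ['j a'])
```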
The complete Danish SpeechDat database is distributed on 5 CD-ROMs. The first three CD-ROMs contain the application-oriented subset. The last two CD-ROMs contain the phonetically rich sentences.
Each speaker uttered the following items:
* 5 semi-spontaneous application word phrases
Speakers are divided into 5 age groups: under 16, 16-30, 31-45, 46-60, and over 60. 78% of the speakers are between 16 and 60 years old.
A pronunciation lexicon with a phonemic transcription in SAMPA is also included.