Resource: BITS Logatome Synthesis Corpus – BITS-LG

Reference BITS Logatome Synthesis Corpus – BITS-LG
Date of Submission Jan. 24, 2014, 4:22 p.m.
Status accepted
ISLRN 887-235-135-658-7
Resource Type Primary Text
Media Type Audio
Language German
Size 187 minutes

BITS stands for "BAS Infrastructures for Technical Speech Processing" and was funded by the German Ministry of Science and Education during 2003-2005. The BITS synthesis corpus consists of two parts: a set of logatome recordings for controlled diphone synthesis (ELRA-S0217) and a set of sentence recordings for unit selection techniques (ELRA-S0224).

This corpus contains 11,036 recordings of logatomes spoken by 4 professional German speakers. The logatomes cover all German diphone combinations as well as the most prominent German - French - English combinations (each speaker had at least foreign-language competence in English).

The data is stored on 4 DVDs. Each DVD contains the recordings, annotation files and metadata files of one of the four professional speakers, together with the complete corpus documentation. Each speaker was recorded in an insulated room with low reverberation.

Each logatome was recorded in three channels: close microphone, large membrane microphone and laryngographic signal. All diphones are segmented and labelled into phonemic units.

Total number of recordings: 11,036
Total duration: 187 minutes
Format: audio in WAV (48 kHz, 16 bit); annotation in Praat TextGrid and BAS Partitur Format (BPF)
Segmentation: extended German SAM-PA
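
For users who want to check delivered audio against the stated format, the following is a minimal sketch using Python's standard `wave` module. The helper names `describe_wav` and `matches_stated_format` are hypothetical and not part of the corpus distribution. The record does not say whether the three channels are stored in one file or in separate files, so the sketch only reports the channel count and does not test it.

```python
import wave


def describe_wav(path):
    """Read basic header fields from a WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_s": wav.getnframes() / rate,
        }


def matches_stated_format(info):
    """Check the two audio properties stated in this record: 48 kHz, 16 bit."""
    return info["sample_rate_hz"] == 48000 and info["bits_per_sample"] == 16
```

Summing `duration_s` over all files of the four speakers gives a figure that can be compared with the stated total duration of 187 minutes.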

Version 1.0
Distributor ELRA