Mandarin Chinese Speech Recognition Corpus (desktop) - digit string (119 people)

Full Official Name: Mandarin Chinese Speech Recognition Corpus (desktop) - digit string (119 people)
Submission date: Jan. 24, 2014, 4:30 p.m.

This corpus comprises 3,570 speech files uttered by 119 speakers of different dialects, ages, and educational levels, recorded over 3 channels (Mic 1: SHURE Beta53; Mic 2: AKG C4000b; Mic 3: Labtec Axis 002). The database comprises 1,500 digit strings in total. Speech samples are stored as 16-bit, 48 kHz WAV files, amounting to 7.54 hours of speech per channel. The total size of the data is 7.28 GB. Text files are stored in Unicode format. All data have been proofread manually. The corpus is intended for testing speech recognition systems, including telephone natural speech recognition systems.
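
As a rough consistency check on the figures above, the following minimal sketch estimates the uncompressed audio size from the stated duration, sample rate, and bit depth. It assumes each channel is stored as mono PCM and ignores WAV header overhead; neither point is stated in the description, and the function name is purely illustrative.

```python
def pcm_size_bytes(hours, sample_rate_hz, bits_per_sample, channels=1):
    """Approximate size of uncompressed PCM audio, ignoring file headers."""
    seconds = hours * 3600
    bytes_per_sample = bits_per_sample // 8
    return seconds * sample_rate_hz * bytes_per_sample * channels

# Figures taken from the description: 7.54 hours per channel,
# 48 kHz, 16-bit, 3 recording channels (one per microphone).
# Assumption: each microphone channel is a separate mono recording.
per_channel_bytes = pcm_size_bytes(7.54, 48_000, 16)
total_bytes = 3 * per_channel_bytes

total_gib = total_bytes / 2**30   # binary gigabytes
total_gb = total_bytes / 1e9      # decimal gigabytes

# Average number of speech files per speaker.
files_per_speaker = 3570 // 119

print(f"{total_gib:.2f} GiB, {total_gb:.2f} GB, {files_per_speaker} files per speaker")
```

Under these assumptions the estimate comes to about 7.28 when measured in binary gigabytes (2^30 bytes), which matches the stated total and suggests the figure refers to bytes rather than bits.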

Creator(s)
Distributor(s)
Right Holder(s)