Resource: Large Farsdat

Date of Submission March 7, 2016, 12:35 p.m.
Status accepted
ISLRN 067-486-870-902-0
Resource Type Primary Text
Media Type Audio
Language Persian

Large Farsdat (L-FARSDAT) is a Persian (Farsi) speech database containing about 73 hours of read speech. The material consists of formal Farsi texts (newspapers) recorded by 100 speakers through a unidirectional desktop microphone. Each speaker read 20-25 pages of text on various subjects, and recording was conducted in a noiseless environment. The average SNR of the desktop microphone is about 28 dB. The sampling rate is 22050 Hz for the whole corpus.
The whole database is segmented and labelled at the word and sentence levels with byte-count alignment, and each word is transcribed using the 29 standard Persian phonemes.
Three additional labels mark silence (sil), breathy voice (br) and non-speech sounds (ns).
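
Because the alignment is given as byte counts rather than timestamps, a user will typically need to convert offsets to seconds. The following is a minimal sketch of that conversion, not the distributor's tooling. Only the 22050 Hz sampling rate and the sil label come from the description above; the 16-bit mono sample format, the header size parameter, the label tuple layout and the example word are all assumptions made for illustration and should be checked against the corpus documentation.

```python
SAMPLE_RATE_HZ = 22050   # stated in the corpus description
BYTES_PER_SAMPLE = 2     # assumption: 16-bit PCM (not stated in the record)
NUM_CHANNELS = 1         # assumption: mono (not stated in the record)


def byte_offset_to_seconds(byte_offset: int, header_bytes: int = 0) -> float:
    """Convert a byte offset in an audio file to a time in seconds.

    header_bytes is a hypothetical parameter for files that carry a
    fixed-size header before the sample data (0 for raw PCM).
    """
    frame_bytes = BYTES_PER_SAMPLE * NUM_CHANNELS
    payload = byte_offset - header_bytes
    if payload < 0 or payload % frame_bytes != 0:
        raise ValueError("offset does not fall on a sample boundary")
    return payload / frame_bytes / SAMPLE_RATE_HZ


# Hypothetical label entries: (label, start_byte, end_byte).
# The real label file layout may differ.
labels = [("sil", 0, 44100), ("salam", 44100, 88200)]

for label, start, end in labels:
    start_s = byte_offset_to_seconds(start)
    end_s = byte_offset_to_seconds(end)
    print(f"{label}: {start_s:.3f} s to {end_s:.3f} s")
```

Under these assumptions one second of audio occupies 44100 bytes, so the two hypothetical segments span 0 to 1 s and 1 to 2 s. If the files use a different sample width or channel count, only the two assumed constants need to change.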

Version 1.0
Distributor ELRA