Resource: Arabic Speech Corpus
|Reference||Arabic Speech Corpus|
|Date of Submission||Aug. 19, 2016, 10:57 a.m.|
This speech corpus was developed as part of PhD work carried out by Nawar Halabi at the University of Southampton. It was recorded in a professional studio with a Neumann TLM 103 studio microphone by a single male speaker of South Levantine Arabic (Damascene accent). The transcript was collected from “Aljazeera Learn” (Aljazeera 2015), a language-learning website chosen because its text is fully diacritised, which makes it easier to phonetise. The transcript was split into utterances at punctuation marks to make the recording sessions easier for the speaker. Speech synthesised from this corpus has produced a high-quality, natural-sounding voice. The corpus contains 1813 utterances with a total duration of 3.7 hours.
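The punctuation-based splitting described above can be illustrated with a minimal Python sketch. It is not the script used to build the corpus: the delimiter set (Latin and Arabic sentence punctuation) and the function names `split_into_utterances` and `mean_utterance_seconds` are assumptions made for illustration. The second helper only restates the figures given above as an average utterance length.

```python
import re

# Assumed delimiter set: Latin punctuation plus the Arabic question mark
# (U+061F), Arabic comma (U+060C) and Arabic semicolon (U+061B).
# The corpus description does not say which marks were actually used.
_SPLIT_RE = re.compile(r"[.!?,;:\u061F\u060C\u061B]+")


def split_into_utterances(text):
    """Split text on runs of punctuation and drop empty pieces."""
    pieces = _SPLIT_RE.split(text)
    return [piece.strip() for piece in pieces if piece.strip()]


def mean_utterance_seconds(total_hours, utterance_count):
    """Average utterance length in seconds."""
    return total_hours * 3600 / utterance_count


if __name__ == "__main__":
    sample = "مرحبا، كيف حالك؟ أنا بخير."
    for utterance in split_into_utterances(sample):
        print(utterance)
    # 3.7 hours over 1813 utterances is a little over 7 seconds each.
    print(round(mean_utterance_seconds(3.7, 1813), 2))
```

Splitting on every full stop also breaks decimal numbers and abbreviations, so a real pipeline would likely need extra rules for those cases.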
This package corresponds to version 2.0 of the corpus and includes:
Arabic Speech Corpus by Nawar Halabi is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.