Submission date: May 26, 2017, 4:20 p.m.

This is a phonetic lexicon of 21,560 Pashto tokens with their phonetic transcriptions in IPA. It covers the major dialect of the TRAD Pashto Broadcast News Speech Corpus (ELRA-S0381), from which the most frequent words were extracted. The pronunciation dictionary for these words was prepared manually by a native Pashto speaker (Yusufzai dialect) using the IPA Pashto phoneme set. Pashto is an Indo-Iranian language spoken by the Pashtun people, mainly in Pakistan and Afghanistan. A pronunciation dictionary plays a pivotal role in both ASR and TTS systems: the more accurate it is, the better the system performs. This pronunciation dictionary was produced by ELDA as an additional dataset to accompany several corpora produced within the PEA TRAD project, supported by the French Ministry of Defence (DGA).

