Resource: GlobalPhone Multilingual Model Package

Reference GlobalPhone Multilingual Model Package
Date of Submission Oct. 9, 2018, 11:53 a.m.
Status accepted
ISLRN 204-945-263-927-6
Resource Type Primary Text
Media Type Audio
Source
Language Arabic, Bulgarian, Chinese, Croatian, French, German, Hausa, Japanese, Korean, Polish, Portuguese, Russian, Spanish, Swahili (macrolanguage), Swedish, Tamil, Thai, Turkish, Ukrainian, Vietnamese
Size 22 hours
Access Medium Downloadable
Description

The GlobalPhone Multilingual Model Package contains about 22 hours of transcribed read speech from native speakers of 22 languages. The data are sampled from the following GlobalPhone Speech and Text Data resources available in the ELRA Catalogue: Arabic (ELRA-S0192), Bulgarian (ELRA-S0319), Chinese-Mandarin (ELRA-S0193), Chinese-Shanghai (ELRA-S0194), Croatian (ELRA-S0195), Czech (ELRA-S0196), French (ELRA-S0197), German (ELRA-S0198), Hausa (ELRA-S0347), Japanese (ELRA-S0199), Korean (ELRA-S0200), Polish (ELRA-S0320), Portuguese (Brazilian) (ELRA-S0201), Russian (ELRA-S0202), Spanish (Latin America) (ELRA-S0203), Swahili (ELRA-S0375), Swedish (ELRA-S0204), Tamil (ELRA-S0205), Thai (ELRA-S0321), Turkish (ELRA-S0206), Ukrainian (ELRA-S0377), and Vietnamese (ELRA-S0322).

The GlobalPhone Multilingual Model Package covers about 1 hour of transcribed speech from 10 speakers (5 male, 5 female) for each of the 22 languages listed above. With 220 speakers in total, this comes to an average of about 6 minutes, or about 41 utterances, per speaker. The package is designed for various tasks in multilingual speech processing research and development, such as (1) multilingual acoustic modeling, (2) multilingual speech synthesis, (3) automatic dictionary generation in multiple languages, and (4) multilingual speech processing with low resources.

Version 1.0
Distributor ELRA