Resource: Collins Multilingual database (MLD) - WordBank with audio files
|Reference||Collins Multilingual database (MLD) - WordBank with audio files|
|Date of Submission||July 18, 2016, 5:50 p.m.|
|Language||Arabic; Chinese; Croatian; Czech; Danish; Dutch, Flemish; English; Finnish; French; German; Greek, Modern (1453-); Italian; Japanese; Korean; Norwegian; Polish; Portuguese; Russian; Spanish, Castilian; Swedish; Thai; Turkish; Vietnamese|
The Collins Multilingual database (MLD) covers real-life, everyday vocabulary. It consists of a multilingual lexicon in 32 languages (the WordBank, see ELRA-T0376) and a multilingual set of sentences in 28 languages (the PhraseBank, see ELRA-T0377).
This version includes the corresponding audio files for 26 of the 32 languages available in the Collins MLD WordBank: Arabic, Chinese, Croatian, Czech, Danish, Dutch, American English, British English, Finnish, French, German, Greek, Italian, Japanese, Korean, Norwegian, Polish, Portuguese (Iberian), Portuguese (Brazilian), Russian, Spanish (Iberian), Spanish (Latin American), Swedish, Thai, Turkish, Vietnamese.
The WordBank contains 10,000 words for each language, annotated in XML with part of speech, gender, irregular forms, and disambiguating information for homographs. An additional dataset of 10,000 headwords is included for 12 languages (Chinese, American and British English, French, German, Italian, Japanese, Korean, Iberian and Brazilian Portuguese, Iberian and Latin American Spanish).
The full database contains 10,000 audio files for each language (26 languages), and 10,000 additional audio files corresponding to the 10,000 additional headwords in 12 languages.
Audio was recorded by native speakers.
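As a rough size check, the counts above can be added up. The sketch below assumes that the 10,000 additional audio files apply to each of the 12 languages with extra headwords, which the description does not state explicitly; the constant and function names are illustrative only.

```python
# Counts taken from the description above.
BASE_LANGUAGES = 26        # languages with audio for the core WordBank
EXTENDED_LANGUAGES = 12    # languages with an additional headword set
FILES_PER_SET = 10_000     # audio files per language per set


def total_audio_files(base_langs=BASE_LANGUAGES,
                      ext_langs=EXTENDED_LANGUAGES,
                      per_set=FILES_PER_SET):
    """Return the total file count, ASSUMING the additional set is
    10,000 files for each of the extended languages."""
    core = base_langs * per_set
    additional = ext_langs * per_set
    return core + additional


if __name__ == "__main__":
    print(total_audio_files())
```

Under that reading the database would hold 260,000 core files plus 120,000 additional files; if the additional 10,000 files were instead shared across the 12 languages, the total would be correspondingly smaller.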