Resource: TRAD Pashto-French Parallel corpus of transcribed Broadcast News Speech - Test data

Reference TRAD Pashto-French Parallel corpus of transcribed Broadcast News Speech - Test data
Date of Submission April 6, 2016, 4:51 p.m.
Status accepted
ISLRN 547-897-479-723-3
Resource Type Primary Text
Media Type Text
Source
Language French, Pushto
Size 1.12 Mb
Description

This parallel corpus contains 10,000 Pashto words translated into French by two different translators. The source texts come from three broadcast news transcriptions of the TRAD Pashto Broadcast News Speech Corpus (ELRA-S0381). These texts are VOA Ashna TV programs recorded on 15/01/2011, 18/01/2011 and 19/01/2011. The translations differ from those provided in the TRAD Pashto-French Parallel corpus of transcribed Broadcast News Speech - Training set (ELRA-W0093).

The content has also been translated into English (see ELRA-W0095 TRAD Pashto-English Parallel corpus of transcribed Broadcast News Speech).

Pashto is an Indo-Iranian language spoken by the Pashtun people, mainly in Pakistan and Afghanistan.

This corpus was produced by ELDA within the PEA TRAD project supported by the French Ministry of Defence (DGA). It was used as a test set for an internal MT evaluation campaign.

Version 1.0
Distributor ELRA