MEDAR Evaluation Package

Full Official Name: MEDAR Evaluation Package
Submission date: Jan. 24, 2014, 4:30 p.m.

The MEDAR Evaluation Package was produced within the MEDAR project (MEDiterranean ARabic language and speech technology), which was supported by the European Commission's ICT programme and ran from February 1st, 2008 until July 31st, 2010. The project addressed international cooperation between the European Union and the Mediterranean region on Speech and Language Technologies (SLT) for Arabic.

This evaluation package aims to enable the evaluation of SLT/MT (Machine Translation) systems on English-to-Arabic translation tasks. It consists of two SMT baseline systems and all the resources needed to evaluate machine translation in the English-to-Arabic direction. The package reflects the outcome of the dry-run and evaluation campaign carried out in February and July 2010. It contains the training and test data, the reference translations, documentation, and tools for scoring system output.

Tools are split into four categories:
- An alignment package including Hunalign, Champollion Tool Kit (CTK) and formatting scripts
- Evaluation metrics for scoring MT output against reference translations: BLEU/NIST and WER
- Formatting scripts that convert XML files to raw text while keeping the tag information, so that the conversion can be reversed
- Two MT baseline systems that use MOSES (see for more information about MOSES)

The data comprise:
- the results of the morphosyntactic disambiguation and of the sentence and word alignment on the English-Arabic parallel corpus of the dry-run
- the source corpora and their reference translations used for the MEDAR dry-run and evaluation campaign
- the monolingual and parallel training data for training MT systems
- the judges' assessments from the human evaluations of the dry-run and the campaign

The full package is stored on 1 DVD.

Right Holder(s)