EASy Evaluation Package

Full Official Name: EASy Evaluation Package
Submission date: Jan. 24, 2014, 4:22 p.m.

The EASy Evaluation Package was produced within the French national project EASy (Evaluation of syntactic parsers of French), as part of the Technolangue programme funded by the French Ministry of Research and New Technologies (MRNT). The project made it possible to carry out a campaign for the evaluation of syntactic parsers of French. This package includes the material that was used or produced during the EASy evaluation campaign: resources, protocols, scoring tools, results of the campaign, etc. The aim of these evaluation packages is to enable external players to evaluate their own systems and compare their results with those obtained during the campaign itself.

The campaign is divided into two actions:
1) Evaluation of constituent annotation: evaluating the ability of parsers with respect to the type of corpus (e.g. literature, conversation transcription, parliamentary speech, questions for information retrieval tools).
2) Evaluation of dependency relation annotation: evaluating the ability of parsers with respect to the relations between constituents or words.

The EASy evaluation package contains the following data and tools:
1) A collection of syntactically tagged French texts gathered over 6 domains (about one million words):
- medicine: 100,000 words, including 5,000 annotated words,
- literature: 150,000 words, including 15,000 annotated words,
- emails: 2,250 anonymised personal emails (121,000 words),
- general: 250,000 words, including 24,000 annotated words, extracted from Le Monde newspaper, reports from the French Senate and the European Assembly (MLCC, MultiLingual Corpora for Co-operation, catalogue ref: ELRA-W0023),
- speech: 10 passages of transcribed dialogues from the Spoken French corpus (8,000 annotated words),
- questions: corpus of 137,000 words, extracted from the TREC and AMARYLLIS campaigns, including 5,000 annotated words.
2) PASTK++: a set of evaluation tools for constituents and relations. It includes a version of the EASy campaign tools that were modified during the PASSAGE campaign (which followed the EASy campaign).
3) Visualization tools for constituents and relations.

A description of the project (in French) is available at the following address: http://www.technolangue.net/article.php3?id_article=198

Right Holder(s)