Resource: MAURDOR Evaluation Package
|Reference||MAURDOR Evaluation Package|
|Date of Submission||Feb. 26, 2015, 5:52 p.m.|
|Resource Type||Primary Text|
|Language||Arabic, English, French|
The MAURDOR project evaluates systems for the automatic processing of written documents. The collected documents are scans of printed, typewritten, or handwritten material.
To obtain images for evaluating automatic analysis systems, 10,000 original documents were collected and annotated (5,000 in French, 2,500 in English, and 2,500 in Arabic). This package contains 8,129 of those 10,000 documents.
Each of the 8,129 documents belongs to one of the following five categories:
Once collected, the documents were annotated manually. This human analysis serves as the reference, known as ground truth, for training and evaluating automatic processing systems.
Annotations aim to highlight the following information:
The MAURDOR evaluation campaign provides a common framework for reporting the current performance of systems for the automatic processing of digital documents. This package contains the material provided to the campaign participants:
The documents are provided in TIFF format and the annotations are provided in XML format.
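As a rough illustration of working with paired TIFF and XML files, the Python sketch below matches each image to an annotation file with the same base name and counts the XML element tags it finds. It assumes that images and annotations share base names, which this description does not confirm. The function names `pair_documents` and `tag_histogram` are hypothetical, and no particular annotation schema is assumed.

```python
from collections import Counter
from pathlib import Path
import xml.etree.ElementTree as ET


def pair_documents(root):
    """Pair each TIFF image with the XML file that shares its base name.

    Assumption: image and annotation use the same base name
    (e.g. doc1.tif and doc1.xml). Files without a partner are skipped.
    """
    images, annotations = {}, {}
    for path in Path(root).rglob("*"):
        suffix = path.suffix.lower()
        if suffix in {".tif", ".tiff"}:
            images[path.stem] = path
        elif suffix == ".xml":
            annotations[path.stem] = path
    shared = sorted(images.keys() & annotations.keys())
    return {stem: (images[stem], annotations[stem]) for stem in shared}


def tag_histogram(xml_path):
    """Count element tags in one annotation file without assuming a schema."""
    return Counter(element.tag for element in ET.parse(xml_path).iter())
```

Calling `pair_documents` on the directory where the package was unpacked returns a dictionary keyed by base name. Passing the XML path of any entry to `tag_histogram` gives a first view of which elements the annotations actually contain before any schema-specific parsing is written.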
This evaluation package is intended to let external parties evaluate their own systems and compare their results with those obtained during the campaign itself.