Text corpus of "Le Monde"

Full Official Name: Text corpus of "Le Monde"
Submission date: Jan. 24, 2014, 4:31 p.m.

Electronic archiving of "Le Monde" articles started on 1 January 1987. Some 200 articles are added every day, and as of October 1997 the database contained more than 500,000 articles, making it the largest archive of its kind among French daily newspapers.

Years 1987 to 2002 are available in ASCII text format; years 2003 to 2007 are available in XML format. Each month amounts to some 10 MB of data (circa 120 MB per year).

Word counts are given for 2005 onwards:
- 2005: 19 million words
- 2006: 17 million words
- 2007: 21 million words

Data from 1987 to 2007 are available through ELRA.

Creator(s)
Distributor(s)
Right Holder(s)