DiaLEX – Egyptian (DiaLEX-EA)

Full Official Name: DiaLEX – Egyptian (DiaLEX-EA)
Submission date: Dec. 4, 2023, 5:13 p.m.

The Egyptian Arabic Full-Form Lexicon (DiaLEX-EA) is a comprehensive computational lexicon covering the Egyptian Arabic dialect. Featuring over 78,000,000 forms for 31,000 lemmas, this full-form lexicon provides exhaustive treatment of all inflected forms.

DiaLEX-EA has several features that make it ideally suited to support natural language processing applications for Egyptian Arabic, especially morphological analysis and speech technology, including:

1. Extremely comprehensive coverage: over 78 million entries.
2. Comprehensive treatment of all inflected forms, enclitics, proclitics, case endings, declensions, and conjugated forms.
3. Full and accurate diacriticization (vocalization), essential for speech technology.
4. Extensive coverage of variants, which is necessary since dialects don't have a standard orthography.

Please note: Phonetic transcriptions, IPA and/or SAMPA, fine-tuned to the licensor’s specifications, are available upon request.

Quantity and size: 75,204,644 lines / 11,217 MB (11.0 GB)
File format: flat TSV text files

Samples and a specifications document are available upon request.
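Because the lexicon is delivered as flat TSV text files of roughly 11 GB, loading it fully into memory is unlikely to be practical on most machines. The following is a minimal sketch of streaming such a file row by row in Python. The column layout is not described in this entry, so each row is treated as an opaque list of string fields; the helper names `iter_rows` and `count_rows`, the UTF-8 encoding, and the file name in the usage note are assumptions for illustration, not part of the DiaLEX-EA specification.

```python
import csv
import io
from typing import Iterable, Iterator, List


def iter_rows(lines: Iterable[str]) -> Iterator[List[str]]:
    """Lazily yield each tab-separated line as a list of string fields."""
    # QUOTE_NONE treats quote characters as ordinary text, which is the
    # safer default for lexical data (assumption: fields are not quoted).
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row:  # skip blank lines
            yield row


def count_rows(lines: Iterable[str]) -> int:
    """Count non-blank rows without holding the data in memory."""
    return sum(1 for _ in iter_rows(lines))


# Small in-memory demonstration with placeholder fields (not real entries).
sample = io.StringIO("a\tb\tc\nd\te\tf\n\n")
rows = list(iter_rows(sample))
```

For a real file, the same generator could be fed an open file handle, for example `open("dialex_ea.tsv", encoding="utf-8", newline="")`, where the file name is hypothetical. The actual columns and encoding should be taken from the specifications document mentioned above.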

Creator(s)
Distributor(s)
Right Holder(s)