Resource: Eleftherotypia Journal Speech database

Reference Eleftherotypia Journal Speech database
Date of Submission Jan. 24, 2014, 4:22 p.m.
Status accepted
ISLRN 006-646-762-561-1
Resource Type Primary Text
Media Type Audio
Source
Language Greek, Modern (1453-)
Description

The Eleftherotypia Speech Database (13 CD-ROMs) consists of read speech collected for the development of continuous speech recognition systems for the Greek language. All recorded sentences were selected from extracts of the Eleftherotypia journal text corpus and provide a vocabulary of about 40,000 words. The database contains over 32,000 utterances (approximately 72 hours of speech material from 120 different speakers, male and female).

Detailed orthographic transcription files are also included in the distribution. The transcriptions mark each utterance's orthography as well as several speech and non-speech events (e.g. mispronunciations, truncations, noise).

Recordings were made in three different environments: a soundproof room, a quiet environment and an office environment. Two different microphones were used: a desk microphone and a head-mounted close-talking microphone. The waveform files are in NIST format. Waveforms are encoded as PCM at a 16,000 Hz sampling rate, 2 bytes per sample.
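
As an illustration of the waveform format described above, the following Python sketch reads the text header of a NIST file and derives the duration of a recording from it. It assumes the files follow the common NIST SPHERE layout (a "NIST_1A" line, a header-size line, then "name -type value" fields terminated by "end_head"); the field names used here (sample_rate, sample_count, sample_n_bytes) are the customary SPHERE names and are not confirmed by this record. The sample header is synthetic and was not taken from the database.

```python
def parse_sphere_header(raw: bytes) -> dict:
    """Parse the ASCII header of a NIST SPHERE file (assumed layout)."""
    # The first 16 bytes hold the magic line and the header size line.
    first, second = raw[:16].split(b"\n")[:2]
    if first.strip() != b"NIST_1A":
        raise ValueError("not a NIST SPHERE header")
    header_size = int(second.strip())

    fields = {"header_size": header_size}
    text = raw[:header_size].decode("ascii", errors="replace")
    for line in text.split("\n")[2:]:
        line = line.strip()
        if line == "end_head":
            break
        if not line or line.startswith(";"):
            continue  # skip blank lines and comments
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # ignore malformed lines
        name, ftype, value = parts
        if ftype == "-i":
            fields[name] = int(value)
        elif ftype == "-r":
            fields[name] = float(value)
        else:  # "-sN": string of length N
            fields[name] = value
    return fields


def duration_seconds(fields: dict) -> float:
    """Duration in seconds = number of samples / sampling rate."""
    return fields["sample_count"] / fields["sample_rate"]


# Synthetic header for illustration only (values are hypothetical,
# chosen to match the 16,000 Hz, 2 bytes per sample description).
SAMPLE_HEADER = (
    b"NIST_1A\n   1024\n"
    b"channel_count -i 1\n"
    b"sample_rate -i 16000\n"
    b"sample_n_bytes -i 2\n"
    b"sample_count -i 48000\n"
    b"sample_coding -s3 pcm\n"
    b"end_head\n"
).ljust(1024, b" ")

if __name__ == "__main__":
    info = parse_sphere_header(SAMPLE_HEADER)
    print(info["sample_rate"], info["sample_n_bytes"], duration_seconds(info))
```

For a real file, the first 1024 bytes (or however many the header-size line states) would be read in binary mode and passed to parse_sphere_header; the audio samples start immediately after the header.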

Version 1.0
Distributor ELRA