Resource: Nepali Spoken Corpus

Reference Nepali Spoken Corpus
Date of Submission June 10, 2014, 10:28 a.m.
Status accepted
ISLRN 688-800-566-571-0
Resource Type Primary Text
Media Type Audio
Source
Language Nepali
Description

OLAC identifier: oai:catalogue.elra.info:ELRA-S0368
Desktop/Microphone

The Nepali Spoken Corpus is one of the 3 resources that constitute the Nepali National Corpus. The Nepali National Corpus was produced in 2006 in the framework of the project Bhasha Sanchar ("language communication"), also known as Nelralec, for Nepali Language Resources and Localization for Education and Communication, funded by the EU Asia IT&C programme, reference number ASIE/2004/091-777.

The design of the Nepali Spoken Corpus (NSC) is based on the Göteborg Spoken Language Corpus (GSLC). The data are taken from spoken Nepali used in different social activities. The basic assumption of the NSC is that spoken language differs from written language and that, like written language, it has different genres.

The NSC contains audio recordings of different social activities, made in their natural settings as far as possible, with phonologically transcribed and annotated texts and information about the participants. A total of 17 types of activity were recorded. The total duration of the recorded material is 31 hours and 26 minutes.

The description of the Nepali Spoken Corpus is provided below:
Recorded activity types: 17
Recorded activity occurrences (files): 115
Total time (duration): 31 hours 26 minutes
Total transcribed words (assumed): 260,000
Total transcribed files: 115
Completely checked: 115

As can be seen above, 115 activity occurrences have been recorded, belonging to 17 activity types. For instance, the activity type "shopping" has four recorded occurrences and the activity type "discussion" has 16 recorded occurrences.

Version 1.0