Resource: Original Short-Message Data Collation I in Chinese (PinYin)

Reference Original Short-Message Data Collation I in Chinese (PinYin)
Date of Submission Jan. 24, 2014, 4:30 p.m.
Status accepted
ISLRN 910-780-238-099-2
Resource Type Primary Text
Media Type Text
Source
Language Chinese
Description

This corpus comprises 5,891,275 characters in 265,262 short messages (SMS): 51,568 from radio/TV stations and 213,694 from daily life. This subset contains the original messages together with their PinYin transcription.
All data, including the PinYin transcription, have been proofread manually.

Version 1.0
Distributor ELRA