Resource: Original Short-Message Data Collation I in Chinese (named entities)

Reference Original Short-Message Data Collation I in Chinese (named entities)
Date of Submission Jan. 24, 2014, 4:30 p.m.
Status accepted
ISLRN 169-161-744-054-8
Resource Type Primary Text
Media Type Text
Source
Language Chinese
Description

This corpus comprises 5,891,275 characters, corresponding to 51,568 short messages (SMS) from radio/TV stations and 213,694 daily-life short messages. This subset contains the original messages together with their named entities.
All data have been proofread and tagged manually.

Version 1.0
Distributor ELRA