ISLRN

HAVIC MED Training Data -- Videos, Metadata and Annotation

Full Official Name: HAVIC MED Training Data -- Videos, Metadata and Annotation

Submission date: Dec. 16, 2021, 5:05 p.m.

<h3>Introduction</h3>
<p>HAVIC MED Training Data -- Videos, Metadata and Annotation was developed by the Linguistic Data Consortium (LDC) and comprises approximately 2,100 hours of user-generated videos with annotation and metadata.</p>
<p>To advance multimodal event detection and related technologies, LDC developed, in collaboration with <a href="https://www.nist.gov/">NIST</a> (the National Institute of Standards and Technology), a large, heterogeneous, annotated multimodal corpus for <a href="https://www.ldc.upenn.edu/collaborations/past-projects/havic">HAVIC</a> (the Heterogeneous Audio Visual Internet Collection) that was used in the NIST-sponsored <a href="https://www.nist.gov/itl/iad/mig/trecvid-multimedia-event-detection-evaluation-track">MED</a> (Multimedia Event Detection) task for several years. HAVIC MED Training Data is a subset of that corpus: a collection of event and background videos for the HAVIC project, originally released to support the <a href="https://www.nist.gov/itl/iad/mig/trecvid-multimedia-event-detection-2011-evaluation">2011</a>, <a href="https://www.nist.gov/itl/iad/mig/trecvid-multimedia-event-detection-2012-evaluation">2012</a>, <a href="https://www.nist.gov/itl/iad/mig/med-2013-evaluation">2013</a>, <a href="https://www.nist.gov/itl/iad/mig/med-2014-evaluation">2014</a>, and <a href="https://www.nist.gov/itl/iad/mig/med-2015-evaluation">2015</a> Multimedia Event Detection tasks.</p>
<h3>Data</h3>
<p>The data consists of videos representing various events (event videos) and videos completely unrelated to events (background videos), harvested by a large team of human annotators. Each event video was manually annotated with a set of judgments describing its event properties and other salient features. Background videos were labeled with topic and genre categories.</p>
<p>All video files are in .mp4 format (H.264), with varying bit rates and levels of audio fidelity and video resolution. Metadata and annotation for the videos are stored in a .tsv file.</p>
<h3>Samples</h3>
<p>Please view this <a href="desc/addenda/LDC2021V01.mp4">video sample</a> and <a href="desc/addenda/LDC2021V01.txt">annotation sample</a>.</p>
<h3>Updates</h3>
<p>None at this time.</p>
<h3>Additional Licensing Instructions</h3>
<p>This 'members-only' corpus is available to current members. Contact <a href="mailto:ldc@ldc.upenn.edu">ldc@ldc.upenn.edu</a> for information about becoming a member.</p>

Creator(s)

Amanda Morris

Stephanie Strassel

Xuansong Li

Brian Antonishek

Jonathan G. Fiscus

Distributor(s)

Linguistic Data Consortium

Right Holder(s)

Status: Accepted

ISLRN:

265-481-756-640-8

Version

1.0

Source

https://catalog.ldc.upenn.edu/LDC2021V01

Resource Type

Primary Text

Media Type

Movingimage

Text

Language(s)

English

Access Medium

Web Download