Resource: HAVIC MED Training Data -- Videos, Metadata and Annotation

Reference HAVIC MED Training Data -- Videos, Metadata and Annotation
Date of Submission Dec. 16, 2021, 5:05 p.m.
Status accepted
ISLRN 265-481-756-640-8
Resource Type Primary Text
Media Type Text, MovingImage
Source
Language English
Format/MIME Type video/mp4, text/plain
Size 776568701 KB
Access Medium Web Download
Description

<h3>Introduction</h3>
<p>HAVIC MED Training Data -- Videos, Metadata and Annotation was developed by the Linguistic Data Consortium (LDC) and comprises approximately 2,100 hours of user-generated videos with annotation and metadata.</p>
<p>To advance multimodal event detection and related technologies, LDC collaborated with <a href="https://www.nist.gov/">NIST</a> (the National Institute of Standards and Technology) to develop a large, heterogeneous, annotated multimodal corpus for <a href="https://www.ldc.upenn.edu/collaborations/past-projects/havic">HAVIC</a> (the Heterogeneous Audio Visual Internet Collection) that was used in the NIST-sponsored <a href="https://www.nist.gov/itl/iad/mig/trecvid-multimedia-event-detection-evaluation-track">MED</a> (Multimedia Event Detection) task for several years. HAVIC MED Training Data is a subset of that corpus: a collection of event and background videos originally released to support the <a href="https://www.nist.gov/itl/iad/mig/trecvid-multimedia-event-detection-2011-evaluation">2011</a>, <a href="https://www.nist.gov/itl/iad/mig/trecvid-multimedia-event-detection-2012-evaluation">2012</a>, <a href="https://www.nist.gov/itl/iad/mig/med-2013-evaluation">2013</a>, <a href="https://www.nist.gov/itl/iad/mig/med-2014-evaluation">2014</a>, and <a href="https://www.nist.gov/itl/iad/mig/med-2015-evaluation">2015</a> Multimedia Event Detection tasks.</p>
<h3>Data</h3>
<p>The data consists of videos representing various events (event videos) and videos completely unrelated to events (background videos) harvested by a large team of human annotators. Each event video was manually annotated with a set of judgments describing its event properties and other salient features. Background videos were labeled with topic and genre categories.</p>
<p>All video files are in .mp4 format (H.264), with varying bit rates and levels of audio fidelity and video resolution. Metadata and annotation for the videos are stored in a .tsv file.</p>
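<p>As a rough illustration of working with the tab-separated metadata and annotation file, the sketch below reads such a file into a list of rows and tallies one column. It assumes the file has a header row and is UTF-8 encoded; this record does not state the file name or the column names, so the function names <code>load_annotations</code> and <code>count_by</code> and any column passed to them are hypothetical.</p>

```python
import csv
from pathlib import Path


def load_annotations(tsv_path):
    """Read a tab-separated annotation file into a list of dicts.

    Assumption: the first row is a header. The real column names are
    not given in this record, so none are hard-coded here.
    """
    with Path(tsv_path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return list(reader)


def count_by(rows, column):
    """Count rows per distinct value of one column (hypothetical helper)."""
    counts = {}
    for row in rows:
        # Rows lacking the column are grouped under the empty string.
        key = row.get(column, "")
        counts[key] = counts.get(key, 0) + 1
    return counts
```

<p>If the file turned out to have, say, a column separating event videos from background videos, <code>count_by(rows, that_column)</code> would give the number of videos in each group; the actual column to use has to be taken from the corpus documentation.</p>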
<h3>Samples</h3>
<p>Please view this <a href="desc/addenda/LDC2021V01.mp4">video sample</a> and <a href="desc/addenda/LDC2021V01.txt">annotation sample</a>.</p>
<h3>Updates</h3>
<p>None at this time.</p>
<h3>Additional Licensing Instructions</h3>
<p>This 'members-only' corpus is available to current members. Contact <a href="mailto:ldc@ldc.upenn.edu">ldc@ldc.upenn.edu</a> for information about becoming a member.</p>

Version 1.0
Creator Stephanie Strassel, Amanda Morris, Xuansong Li, Brian Antonishek, Jonathan G. Fiscus
Distributor Linguistic Data Consortium
Rights Holder Portions © 2011-2016 YouTube, LLC, © 2011-2016, 2021 Trustees of the University of Pennsylvania