KAIROS Phase 1 Quizlet

Full Official Name: KAIROS Phase 1 Quizlet
Submission date: Aug. 12, 2025, 5:10 p.m.

**Introduction**

KAIROS Phase 1 Quizlet (LDC2025T11) was developed by the Linguistic Data Consortium (LDC). It contains English and Spanish text, video and image data and annotations used for pre-evaluation research and system development during Phase 1 of the DARPA KAIROS program.

KAIROS Quizlets were a series of narrowly defined tasks designed to explore specific evaluation objectives, enabling KAIROS system developers to exercise individual system components on a small data set prior to the full program evaluation. This corpus contains the complete set of Quizlet data used in Phase 1, which focused on two real-world complex events (CEs) within the Improvised Explosive Device bombing scenario: CE1001 (2018 Caracas drone attack) and CE1002 (Utah High School backpack bombing).

The DARPA KAIROS (Knowledge-directed Artificial Intelligence Reasoning Over Schemas) program aimed to build technology capable of understanding and reasoning about complex real-world events in order to provide actionable insights to end users. KAIROS systems utilized formal event representations in the form of schema libraries that specified the steps, preconditions and constraints for an open set of complex events; schemas were then used in combination with event extraction to characterize and make predictions about real-world events in a large multilingual, multimedia corpus.

**Data**

Four quizlets were developed in Phase 1. In addition to the source documents, this release contains the contents of Quizlet 3 (graph G annotation generated with manual annotation) and Quizlet 4 (source documents, manual annotation, updated graph G). Quizlet 1 (evaluation task introduction) did not require data or annotation and is not included in this release. Quizlet 2 (schema generation and instantiation) used source documents but did not include annotation.

Source data was collected from the web by LDC.
Thirty root web pages were collected and processed, yielding 29 text data files, 216 image files and 5 video files. Annotation steps included labeling scenario-relevant events and relations for each document to develop a structured representation of temporally ordered events, relations and arguments.

Source data is presented in the following formats: .gif, .jpg, .ltf, .mp4, .png, .psm, and .svg. Annotations are presented as tab-separated files (.tab) for temporal ordering, relations, events, and arguments.

Software tools are also included in this release. The tools recreate original source data from the processed XML material.

- ltf2rsd.perl -- convert ltf.xml files to rsd.txt (raw-source-data)
- ltfzip2rsd.perl -- extract and convert ltf.xml files from zip archives

**Directory Structure**

Please see file.tbl for a complete file list as well as checksums for this publication.

- data/ - Contains the corpus data subdivided by annotation and source.
- docs/ - Contains additional documentation, guidelines, DTDs for XML formats, and a file table.
- tools/ - Contains the tools mentioned above.

**Sponsorship**

KAIROS was sponsored by the Air Force Research Laboratory (AFRL) and the Defense Advanced Research Projects Agency (DARPA) under Contract No. HR0011-19-S-0014.

**Updates**

No updates at this time.
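As an illustration of what recreating raw source data from the processed XML involves, the following is a minimal Python sketch of an ltf.xml to rsd.txt style conversion. It is not the ltf2rsd.perl tool shipped in tools/, and it rests on assumptions that should be checked against the DTDs in docs/: that each `SEG` element carries an integer `start_char` attribute and an `ORIGINAL_TEXT` child, and that gaps between segments can be filled with newlines. The function name `ltf_to_rsd` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def ltf_to_rsd(ltf_xml: str) -> str:
    """Rebuild raw text from an LTF-style XML string.

    Assumption: every SEG element has an integer start_char attribute
    and an ORIGINAL_TEXT child holding that segment's raw text.
    Assumption: gaps between segments are filled with newlines.
    """
    root = ET.fromstring(ltf_xml)
    pieces = []
    position = 0  # number of characters emitted so far
    for seg in root.iter("SEG"):
        start = int(seg.get("start_char"))
        text = seg.findtext("ORIGINAL_TEXT", default="")
        if start > position:
            # Pad the gap so later offsets still line up with the raw text.
            pieces.append("\n" * (start - position))
            position = start
        pieces.append(text)
        position += len(text)
    return "".join(pieces)


# Hypothetical two-segment document, for illustration only.
sample = """<LCTL_TEXT>
  <DOC id="example_doc">
    <TEXT>
      <SEG id="segment-0" start_char="0" end_char="10">
        <ORIGINAL_TEXT>First line.</ORIGINAL_TEXT>
      </SEG>
      <SEG id="segment-1" start_char="13" end_char="24">
        <ORIGINAL_TEXT>Second line.</ORIGINAL_TEXT>
      </SEG>
    </TEXT>
  </DOC>
</LCTL_TEXT>"""

rsd = ltf_to_rsd(sample)
print(rsd)
```

Padding to each segment's `start_char` keeps the character offsets of the rebuilt text aligned with the offsets recorded in the XML, which matters if annotations refer to text by offset.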
