KAIROS Phase 1 Evaluation Source Data, Annotation, Assessment

Full Official Name: KAIROS Phase 1 Evaluation Source Data, Annotation, Assessment
Submission date: June 9, 2026, 5:44 p.m.

**Introduction**

KAIROS Phase 1 Evaluation Source Data, Annotation, and Assessment (LDC2026T07) was developed by the Linguistic Data Consortium (LDC). It contains the English and Spanish source data (text, video and images), manual annotations, reference knowledge graphs, the system output assessed during the evaluation, and human assessment results from the Phase 1 evaluation of the DARPA KAIROS Program.

The Phase 1 evaluation focused on the improvised explosive bombing scenario with nine complex events (CEs) and two surprise complex events in the mass shooting scenario:

- ce1005: Sydney Aeroplane Bomb Plot, Australia, 2017
- ce1006: Stockholm Bombings, Sweden, 2010
- ce1007: Manchester Arena Bombing, England, 2017
- ce1008: Taxi Detonation, Canada, 2016
- ce1009: Spokane Bombing Attempt, Washington, 2011
- ce1010: Derry Bombing, Northern Ireland, 2019
- ce1011: Bogotá Police Academy Car Bombing, Colombia, January 2019
- ce1012: Kansas City Hospital Bombing, Missouri, 2020
- ce1013: Attempted bombing in Moses Lake, Washington, 2018
- ce1020: El Paso Walmart Shooting, Texas, 2019
- ce1021: Orlando nightclub shooting, Florida, 2016

The DARPA KAIROS (Knowledge-directed Artificial Intelligence Reasoning Over Schemas) program aimed to build technology capable of understanding and reasoning about complex real-world events in order to provide actionable insights to end users. KAIROS systems used formal event representations in the form of schema libraries that specified the steps, preconditions and constraints for an open set of complex events. Schemas were then used in combination with event extraction to characterize and make predictions about real-world events in a large multilingual, multimedia corpus. Each KAIROS evaluation focused on a real-world scenario and several real-world complex events within that scenario, along with the possibility of surprise complex events in different but related scenarios.

**Data**

Source data was collected from the web by LDC.
A total of 139 root web pages were collected and processed, yielding 131 text data files, 1176 image files, and 27 video files. The evaluation source data for each complex event was an input data set consisting of 10-15 documents that included multimodal English and Spanish event-relevant and off-topic distractor documents.

Manual annotation and assessment of event-relevant documents for 10 complex events are included in this release. Scenario-relevant events and relations were labeled for each document to develop a structured representation of temporally ordered events, relations and arguments that expressed the scenario-relevant events in each complex event. A reference knowledge graph (Graph G) was developed for each event; systems were expected to match the Graph G with a given schema library. Assessment data includes human assessment judgments and the system output that was manually assessed for the end-to-end evaluation task.

Source data is presented in various formats: .gif, .jpg, .ltf, .mp4, .png, .psm, and .svg. Annotations are presented as tab-separated files (.tab). Graph G data is presented in JSON format and in human-readable Excel (.xlsx) files. System output is presented in JSON format and as tab-separated files.
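
As an informal illustration of working with the file formats listed above, the following Python sketch tallies files by extension and loads tab-separated and JSON files using only the standard library. The function names are hypothetical, and the sketch assumes nothing about the release's directory layout, the .tab column definitions, or the JSON structure, all of which are specified by the corpus documentation rather than here.

```python
import csv
import json
from collections import Counter
from pathlib import Path


def count_by_extension(root):
    """Count files under root, keyed by lowercase extension (e.g. '.ltf')."""
    return Counter(
        p.suffix.lower() for p in Path(root).rglob("*") if p.is_file()
    )


def read_tab(path):
    """Read a tab-separated file into a list of rows (lists of strings).

    Assumption: fields are not quoted. Whether the first row is a header
    depends on the corpus documentation, so no row is treated specially.
    """
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE))


def read_json(path):
    """Load a JSON file (e.g. Graph G or system output) as Python objects."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

Calling `count_by_extension` on a local copy of the release gives a quick check against the file counts reported above, keeping in mind that text, image, and video files are each spread across more than one extension.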
