Resource: NRC Emotion Lexicon - Revised version

Reference NRC Emotion Lexicon - Revised version
Date of Submission Sept. 23, 2021, 10:05 a.m.
Status accepted
ISLRN 007-544-786-822-8
Resource Type Lexicon
Media Type Text
Source
Language English
Format/MIME Type csv
Size 5,916 entries
Description

The NRC Emotion Lexicon was originally built by Saif M. Mohammad and Peter D. Turney through crowdsourcing. It was created to support emotion analysis, because the emotion lexicons available at the time were smaller. To address this, the authors assembled a large collection of terms from sources such as the Macquarie Thesaurus, the General Inquirer, and the WordNet Affect Lexicon. These terms were then annotated by human workers on Amazon's Mechanical Turk, after which the NRC Lexicon was validated and distributed.
Close inspection of the NRC Emotion Lexicon revealed a large number of troubling entries, in which words that should be emotionally neutral in most contexts, with no affect (e.g., lesbian, stone, mountain), are associated with emotional labels that are inaccurate, nonsensical, pejorative, or, at best, highly contingent and context-dependent (e.g., lesbian labeled as DISGUST and SADNESS, stone as ANGER, or mountain as ANTICIPATION).
The revised NRC Emotion Lexicon consists of 5,916 entries and results from the work described in Zad et al. (2021), "Hell Hath No Fury? Correcting Bias in the NRC Emotion Lexicon", published at WOAH, the 5th Workshop on Online Abuse and Harms.
The root of the main archive contains a Readme file that explains the archive contents. The Java code inside the archive can be imported directly into the Eclipse IDE as a project and used to reproduce the results of the paper; the code compiles with Java 1.8. The main archive also contains the final camera-ready copy of the paper for reference. The code and data are released under the terms of CC-BY-NC 4.0. (2021-08-05).

Version 1.0
Distributor ELRA