Detecting Sentiment and Emotion toward Disasters in Low Resource Languages
ABSTRACT When a disaster occurs, online posts in text and video, phone messages, and even newscasts express distress, fear, and anger toward the disaster itself or toward those who might address its consequences, such as local and national governments or foreign aid workers. These sources carry important information about where the most urgent problems are occurring and what those problems are. However, they are often difficult to triage because of their volume and lack of specificity. They pose a special challenge for aid workers who do not speak the language of those who need help, especially when bilingual informants are few and the language of those in distress has few computational resources. We are part of a large DARPA effort to develop tools and techniques that can support such aid workers very quickly by leveraging methods and resources already collected for other, High Resource Languages. Our particular goal is to develop methods to identify sentiment and emotion in spoken language for Low Resource Languages.
Our effort to date involves two basic approaches: 1) training classifiers to detect sentiment and emotion in High Resource Languages (HRLs) such as English and Mandarin, which have relatively large amounts of data labeled with emotions such as anger, fear, and stress, and using these classifiers either directly or after adaptation with a small amount of labeled data in the Low Resource Language (LRL) of interest; and 2) employing a sentiment detection system trained on HRL text and adapted to the LRL using a bilingual lexicon to label transcripts of LRL speech. The resulting labels are then assigned to the aligned speech, which is used to train a speech classifier for positive/negative sentiment. We will describe experiments using both approaches, as well as experiments classifying news broadcasts that contain information about disasters.
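As a rough illustration of the second approach, the sketch below projects sentiment labels from transcripts onto their aligned speech segments. It is a deliberate simplification and not the system described in the talk: a toy polarity lexicon stands in for the sentiment detector trained on HRL text, and the LRL tokens, the lexicon entries, and the names `HRL_POLARITY`, `BILINGUAL_LEXICON`, `label_transcript`, and `project_labels` are all invented for this example. Dropping segments with no clear polarity is also an assumption.

```python
from typing import Dict, List, Optional, Tuple

# Toy HRL (English) polarity lexicon; a stand-in for a trained HRL text classifier.
HRL_POLARITY: Dict[str, int] = {"flood": -1, "fear": -1, "safe": 1, "help": 1}

# Toy bilingual lexicon mapping invented LRL tokens to HRL words (hypothetical data).
BILINGUAL_LEXICON: Dict[str, str] = {
    "zuba": "flood",
    "miko": "fear",
    "tavi": "safe",
    "rens": "help",
}


def label_transcript(transcript: str) -> Optional[str]:
    """Label one LRL transcript by summing the polarity of its translated words.

    Returns "positive", "negative", or None when the score is zero
    (no known sentiment-bearing words, or the polarities cancel out).
    """
    score = 0
    for token in transcript.lower().split():
        hrl_word = BILINGUAL_LEXICON.get(token)
        if hrl_word is not None:
            score += HRL_POLARITY.get(hrl_word, 0)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return None


def project_labels(segments: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Attach transcript-derived labels to aligned audio segment IDs.

    Each input item is (audio_id, transcript). Segments without a clear
    label are dropped (an assumption made here for simplicity).
    """
    labeled = []
    for audio_id, transcript in segments:
        label = label_transcript(transcript)
        if label is not None:
            labeled.append((audio_id, label))
    return labeled


if __name__ == "__main__":
    segments = [("utt1", "zuba miko"), ("utt2", "tavi"), ("utt3", "xx yy")]
    print(project_labels(segments))
```

The resulting (audio ID, label) pairs would then serve as training targets for a speech classifier that operates on acoustic features of the aligned audio; that training step is outside the scope of this sketch.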
BIO Julia Hirschberg is the Percy K. and Vida L. W. Hudson Professor and Chair of Computer Science at Columbia University. She previously worked at Bell Laboratories and AT&T Labs, where she created the HCI Research Department. She served on the Association for Computational Linguistics executive board (1993-2003) and the International Speech Communication Association board (1999-2007; 2005-7 as president), and has served on the International Conference on Spoken Language Processing board since 1996. She has been editor of Computational Linguistics and Speech Communication, is a fellow of AAAI, ISCA, ACL, ACM, and IEEE, and is a member of the National Academy of Engineering. She has received the IEEE James L. Flanagan Speech and Audio Processing Award and the ISCA Medal for Scientific Achievement. She currently serves on the IEEE Speech and Language Processing Technical Committee, is co-chair of the CRA-W Board, and has worked for diversity for many years at AT&T and Columbia. She works on spoken language processing and NLP, studying text-to-speech synthesis, spoken dialogue systems, entrainment in conversation, detection of deceptive and emotional speech, hedging behavior, and linguistic code-switching (language mixing).
FACULTY HOST Carolyn Rose