News Release

USC at the NAACL ’24 Conference

Notable research includes work on legal AI, LLM manipulation by bad actors, and assessing argument quality

Business Announcement

University of Southern California

At the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), held June 16-21, 2024, in Mexico City, Mexico, researchers from the USC Viterbi School of Engineering are presenting work across research tracks including Ethics, Bias, and Fairness; Discourse and Pragmatics; Multilinguality and Language Diversity; and more.

Run by NAACL, which provides a regional focus for members of the Association for Computational Linguistics (ACL) in North, Central, and South America, the annual conference is one of the premier venues for natural language processing research. NAACL 2024 accepted 565 of 2,434 submissions, for an acceptance rate of 23.2%.

Research Spotlights 2024

Can generated language lead to healthier online conversations?
The most common approaches to moderation of online conversations have been deleting comments or banning users. These methods are simple and easy to scale, but they are also harsh measures that may push the affected users towards echo chambers, exacerbating societal polarization. An alternative approach is conversational moderation, where moderators provide contextualized feedback by conversing with the users over multiple turns to address their problematic behaviors. However, as one may expect, it’s time-consuming and taxing for the moderators.

Justin Cho, a research assistant and PhD student at USC’s Information Sciences Institute (ISI), said, “My collaborators and I wanted to see whether we could use language models to generate response recommendations to relieve the burden of human moderators in doing conversational moderation.” In their paper, Can Language Model Moderators Improve the Health of Online Discourse?, Cho and his co-authors examine the role of conversational moderation in online communities and propose using advanced language generation technology to assist human moderators. They define moderation effectiveness, offer a framework for evaluation, and find that while language models can identify toxic behavior, they struggle to influence users positively.

Using an age-old study hack to help models remember information learned in different languages
Continual learning is a method of training machine learning models incrementally, using data samples only once as they arrive. Cross-lingual continual learning (CCL) uses emerging data from new languages to do this. However, a big problem with CCL is the plasticity-stability dilemma: the model forgets what it previously learned when it encounters new languages. After experiencing this first-hand, ISI’s Meryem M’hamdi looked at how humans retain information: “I contemplated cognitive strategies conducive to long-term retention when revisiting information. This led me to delve into spaced repetition, a method wherein learners strategically review previously learned concepts before they might forget them.”

In their paper, Leitner-Guided Memory Replay for Cross-lingual Continual Learning, M’hamdi and co-author Jonathan May, a research associate professor in the Thomas Lord Department of Computer Science, used a method called Leitner queuing to decide which old data the model should revisit during training. Leitner queuing is a structured approach to learning that ensures easier concepts are revisited less frequently than more challenging ones. The team repurposed it to determine which informative examples to repeat and which spurious data to deprioritize. “Our strategy alternates between learning new difficult examples along with reinforcement of easier ones,” said M’hamdi. Their experiments show that this method reduces forgetting and maintains accuracy across different languages and tasks compared to other approaches.
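
For readers curious how Leitner queuing works mechanically, here is a minimal Python sketch of the classic scheme: examples the model handles correctly are promoted to higher queues and revisited less often, while mistakes send an example back to the first queue. The class name, queue count, and sampling weights below are invented for illustration; the paper’s exact replay policy may differ.

```python
import random
from collections import defaultdict

NUM_QUEUES = 5  # queue 0 holds the hardest, most frequently revisited examples

class LeitnerScheduler:
    """Illustrative Leitner queuing: promote an example on a correct
    prediction, demote it to queue 0 on a mistake, and replay examples
    from lower (harder) queues more often."""

    def __init__(self):
        self.queue_of = defaultdict(int)  # example id -> current queue index

    def update(self, example_id, was_correct):
        if was_correct:
            # Promote toward the top queue, where items are reviewed rarely.
            self.queue_of[example_id] = min(self.queue_of[example_id] + 1,
                                            NUM_QUEUES - 1)
        else:
            # A mistake sends the example back to the most-reviewed queue.
            self.queue_of[example_id] = 0

    def sample_replay(self, k):
        """Sample k old examples to replay, weighting harder (lower-queue)
        examples more heavily so they are revisited more often."""
        ids = list(self.queue_of)
        if not ids:
            return []
        weights = [NUM_QUEUES - self.queue_of[i] for i in ids]
        return random.choices(ids, weights=weights, k=k)
```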

Decoding legal texts with AI
Legal AI has made significant strides in recent years but still struggles with basic legal concepts like when a law applies, who it affects, and what it does. Addressing these challenges, ISI researchers introduced a new method called span-and-relation parsing and created a dataset named LegalDiscourse to enhance AI’s understanding of legal texts.

In their paper, LegalDiscourse: Interpreting When Laws Apply and To Whom, the team demonstrates the practical application of their schema by creating a web application for journalists. They collected over 100,000 laws from 52 U.S. states and territories and applied their trained models to 6,000 laws using U.S. Census population data. This led to journalistic investigations into trends such as the increase in liquor licenses following population growth and the decrease in the number of applicable laws under different census undercount projections.

Hey, Robot, bring me my favorite mug
In “Which One? Leveraging Context Between Objects and Multiple Views for Language Grounding,” the researchers explore how training a machine learning model to represent 3D volume and shape can help it correctly identify objects based on descriptive language. For example, distinguishing “the mug with the thin handle” from two similar mugs that only differ by handle thickness. While this is mainly an academic exercise, accurately linking language to objects using visual information is crucial for more autonomous robots, which often lack sophisticated language capabilities in their current deployments.

An LLM never forgets?
Large language models (LLMs) like ChatGPT can memorize long sequences of web text, leading to potential copyright infringement. For instance, The New York Times recently sued OpenAI for training models on copyrighted articles, with ChatGPT transcripts indicating the model had memorized paragraphs of the newspaper’s articles verbatim.

The paper “Do Localization Methods Actually Localize Memorized Data in LLMs? A Tale of Two Benchmarks” investigates how models memorize information. According to the researchers, their two new benchmarks successfully pinpointed memorized data to specific “neurons” in the model. “This gives us hope that future work can rely on these localization methods, for instance, to make a model ‘forget’ something that it had previously memorized,” said co-author Robin Jia. Causing models to forget information could also help ensure they do not retain sensitive information.

Is the devil in the details?
Over-specification in language models refers to the inclusion of excessive, unnecessary, or overly detailed information in the training data. This can lead to several issues, particularly in the model’s ability to generalize from the training data to new, unseen examples.

“On Retrieval Augmentation and the Limitations of Language Model Training” analyzes how a technique called kNN augmentation improves language models’ performance. The researchers created a synthetic dataset to show that over-specification in training data prevents standard language models from generalizing in over-specified scenarios, while kNN-augmented LMs perform better. 
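
To give a mechanical picture of kNN augmentation, below is a toy numpy sketch of the kNN-LM-style interpolation (after Khandelwal et al., 2020) that such retrieval-augmented LMs build on: the base model’s next-token distribution is mixed with a distribution formed from the nearest stored contexts. All array names, shapes, and parameter values here are illustrative assumptions, not the paper’s experimental setup.

```python
import numpy as np

def knn_lm_next_token(p_lm, hidden, keys, values, vocab_size,
                      k=8, lam=0.25, temp=1.0):
    """Toy kNN-LM interpolation: mix the base LM's next-token
    distribution with one induced by the k nearest stored contexts.

    p_lm:   base LM next-token distribution, shape (vocab_size,)
    hidden: context vector for the current step, shape (d,)
    keys:   stored context vectors, shape (n, d)
    values: next-token id observed after each stored context, shape (n,)
    """
    # Distance from the current context vector to every stored context.
    dists = np.linalg.norm(keys - hidden, axis=1)
    nearest = np.argsort(dists)[:k]

    # Convert distances to normalized weights over the k neighbors...
    w = np.exp(-dists[nearest] / temp)
    w /= w.sum()

    # ...and place that mass on the tokens those neighbors were followed by.
    p_knn = np.zeros(vocab_size)
    np.add.at(p_knn, values[nearest], w)

    # Interpolate: lam controls how much the retrieval component matters.
    return lam * p_knn + (1.0 - lam) * p_lm
```

One intuition for the finding above: because the retrieval component can fall back on stored contexts at test time, it is less hostage to spurious, over-specified patterns absorbed during training.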

Shall I compare thee?
Humans learn through comparisons, analogies, and metaphors. For instance, “steel is stronger and heavier than Styrofoam” is an essential component of our world knowledge. But how good is AI at processing comparative knowledge? “NeuroComparatives: Neuro-Symbolic Distillation of Comparative Knowledge” finds that large language models are excellent comparators but sometimes make significant mistakes. The team’s algorithms help filter out many of these mistakes. “Somewhat surprisingly, we find that our algorithms can also help smaller language models elicit comparative knowledge effectively, without any additional training,” said co-author Swabha Swayamdipta. “Our work produces a large collection of comparative statements, NeuroComparatives, which we show are useful for commonsense reasoning tasks for AI.”

AI model: meet infographics 
“Our goal is to build AI models that can accurately and efficiently read infographics, which contain a mixture of text, numbers, tables, plots, and other visual cues,” said Jesse Thomason, co-author of the paper “Efficient End-to-End Visual Document Understanding with Rationale Distillation.”

To make the system efficient, the researchers set out to use a small neural network, rather than a larger one like GPT-4, and to avoid relying on expensive tools like optical character recognition (OCR). The team created a new method that trains a small neural network to process infographics accurately without such tools. This could help anyone who wants to use AI to automate the processing of documents that include not only text but also visual components.

Complete list of accepted USC papers below:
Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection
Jun Yan, Vikas Yadav, Shiyang Li, Lichang Chen, Zheng Tang, Hai Wang, Vijay Srinivasan, Xiang Ren, Hongxia Jin

Can Language Model Moderators Improve the Health of Online Discourse?
Hyundong Cho, Shuai Liu, Taiwei Shi, Darpan Jain, Basem Rizk, Yuyang Huang, Zixun Lu, Nuan Wen, Jonathan Gratch, Emilio Ferrara, Jonathan May

Capturing Perspectives of Crowdsourced Annotators in Subjective Learning Tasks
Negar Mokhberian, Myrl G. Marmarelis, Frederic R. Hopp, Valerio Basile, Fred Morstatter, Kristina Lerman

Contextualizing Argument Quality Assessment with Relevant Knowledge
Darshan Deshpande, Zhivar Sourati, Filip Ilievski, Fred Morstatter

Do Localization Methods Actually Localize Memorized Data in LLMs? A Tale of Two Benchmarks
Ting-Yun Chang, Jesse Thomason, Robin Jia

Efficient End-to-End Visual Document Understanding with Rationale Distillation
Wang Zhu, Alekh Agarwal, Mandar Joshi, Robin Jia, Jesse Thomason, Kristina Toutanova

Instruction-following Evaluation through Verbalizer Manipulation
Shiyang Li, Jun Yan, Hai Wang, Zheng Tang, Xiang Ren, Vijay Srinivasan, Hongxia Jin

LegalDiscourse: Interpreting When Laws Apply and To Whom
Alexander Spangher, Zihan Xue, Te-Lin Wu, Mark Hansen, Jonathan May

Leitner-Guided Memory Replay for Cross-lingual Continual Learning
Meryem M’hamdi, Jonathan May

LLM-Assisted Rule Based Machine Translation for Low/No-Resource Languages
Jared R. Coleman, Bhaskar Krishnamachari, Ruben Rosales, Khalil Iskarous

NeuroComparatives: Neuro-Symbolic Distillation of Comparative Knowledge
Phillip Howard, Junlin Wang, Vasudev Lal, Gadi Singer, Yejin Choi, Swabha Swayamdipta

On Retrieval Augmentation and the Limitations of Language Model Training
Ting-Rui Chiang, Xinyan Velocity Yu, Joshua Robinson, Ollie Liu, Isabelle Lee, Dani Yogatama

Reinforced Multiple Instance Selection for Speaker Attribute Prediction
Alireza Salkhordeh Ziabari, Ali Omrani, Parsa Hejabi, Preni Golazizian, Brendan Kennedy, Payam Piray, Morteza Dehghani

Which One? Leveraging Context Between Objects and Multiple Views for Language Grounding 
Chancharik Mitra, Abrar Anwar, Rodolfo Corona, Dan Klein, Trevor Darrell, Jesse Thomason

