Article Highlight | 7-Sep-2023

Paradigm shift in natural language processing

Beijing Zhongke Journal Publishing Co. Ltd.

In the scope of this paper, a paradigm is a general modeling framework or a distinct set of methodologies for solving a class of tasks. For instance, sequence labeling is a mainstream paradigm for named entity recognition (NER). Different paradigms usually require different formats of input and output, and therefore depend heavily on how the tasks are annotated. In the past years, modeling for most NLP tasks has converged to several mainstream paradigms, as summarized in this paper:

Class, Matching, SeqLab, MRC, Seq2Seq, Seq2ASeq, and (M)LM.


Though the paradigm for many tasks has converged and dominated for a long time, recent work has shown that models under some paradigms also generalize well on tasks conventionally handled by other paradigms. For example, the MRC and Seq2Seq paradigms can achieve state-of-the-art performance on NER tasks, which were previously formalized in the sequence labeling (SeqLab) paradigm. Such methods typically first convert the dataset into the form required by the new paradigm, and then use a model under the new paradigm to solve the task. In recent years, similar methods that reformulate one natural language processing (NLP) task as another have achieved great success and gained increasing attention in the community. After the emergence of pre-trained language models (PTMs), paradigm shifts have been observed in an increasing number of tasks. Combined with the power of these PTMs, some paradigms have shown great potential to unify diverse NLP tasks. One of these potential unified paradigms, (M)LM (also referred to as prompt-based tuning), has made rapid progress recently, making it possible to employ a single PTM as a universal solver for various understanding and generation tasks.
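To make this kind of reformulation concrete, the following minimal sketch (not taken from the paper; all function and field names are hypothetical) shows how a single NER example annotated for sequence labeling might be recast into the input/output formats expected by the MRC and (M)LM paradigms:

# A hypothetical sketch of paradigm shift as data reformulation: the same NER
# example, originally annotated for sequence labeling (SeqLab), is rewritten
# into the formats expected by the MRC and (M)LM paradigms.

seqlab_example = {
    "tokens": ["Barack", "Obama", "visited", "Paris", "."],
    "tags":   ["B-PER",  "I-PER",  "O",       "B-LOC", "O"],
}

def to_mrc(example, entity_type, question):
    """Recast NER as machine reading comprehension (MRC): the entity type
    becomes a natural-language question and the answers are the token spans
    carrying the matching BIO tags."""
    spans, current = [], []
    for token, tag in zip(example["tokens"], example["tags"]):
        if tag == f"B-{entity_type}":
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif tag == f"I-{entity_type}" and current:
            current.append(token)
        else:
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return {"context": " ".join(example["tokens"]),
            "question": question,
            "answers": spans}

def to_cloze_prompt(example, span):
    """Recast NER as masked language modeling ((M)LM): a candidate span is
    wrapped in a cloze template and the model fills [MASK] with a label word
    such as 'person' or 'location'."""
    return " ".join(example["tokens"]) + f" {span} is a [MASK] entity."

print(to_mrc(seqlab_example, "PER", "Which persons are mentioned in the text?"))
print(to_cloze_prompt(seqlab_example, "Barack Obama"))

The same annotated data thus serves several paradigms; only its surface form changes, which is what allows a model built for one paradigm to be reused on tasks originally defined under another.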


Despite their success, these paradigm shifts, scattered across various NLP tasks, have not been systematically reviewed and analyzed. In this paper, the researchers attempt to summarize recent advances and trends in this line of research, namely paradigm shift or paradigm transfer.


This paper is organized as follows. Section 2 gives formal definitions of the seven paradigms and introduces their representative tasks and models. Section 3 reviews recent paradigm shifts that have happened in different NLP tasks. Section 4 discusses the designs and challenges of several highlighted paradigms that have great potential to unify most existing NLP tasks. Section 5 concludes with a brief discussion of recent trends and future directions.


Section 2 briefly introduces seven paradigms that are widely used in NLP tasks, together with their corresponding tasks and models: Class, Matching, SeqLab, MRC, Seq2Seq, Seq2ASeq, and (M)LM. These paradigms have demonstrated strong dominance in many mainstream NLP tasks.


In Section 3, the researchers review the paradigm shifts that occur in different NLP tasks: text classification, natural language inference, named entity recognition, aspect-based sentiment analysis, relation extraction, text summarization, and parsing. The researchers also identify trends in these paradigm shifts. They find that: 1) The frequency of paradigm shifts has been increasing in recent years, especially after the emergence of pre-trained language models (PTMs); therefore, to fully utilize the power of these PTMs, a better way is to reformulate various NLP tasks into the paradigms that PTMs are good at. 2) More and more NLP tasks have shifted from traditional paradigms to paradigms that are more general and flexible.


Section 4 discusses the following general paradigms that have the potential to unify diverse NLP tasks: (M)LM, Matching, MRC, and Seq2Seq. The first part covers prompts, verbalizers, and parameter-efficient prompt tuning in (M)LM; the second part covers domain adaptation, label descriptions, and a comparison with prompt-based learning for Matching; the third part gives a brief introduction to MRC and a comparison with prompt-based learning; and the last part gives a brief introduction to Seq2Seq and a comparison with the other paradigms.
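As a concrete illustration of the prompt and verbalizer components mentioned above, the minimal sketch below (not from the paper; the template, label words, and probabilities are invented for illustration) shows how a sentiment classification input could be wrapped in a cloze prompt and how label-word probabilities at the [MASK] position could be mapped back to class labels:

# A hypothetical prompt template and verbalizer for sentiment classification
# under the (M)LM paradigm. A real system would score the [MASK] position
# with a pre-trained masked language model; here made-up probabilities stand
# in for that step.

TEMPLATE = "{text} All in all, it was [MASK]."

# Verbalizer: each task label is tied to label words whose (M)LM probability
# at the [MASK] position scores that label.
VERBALIZER = {
    "positive": ["great", "good"],
    "negative": ["terrible", "bad"],
}

def build_prompt(text):
    """Wrap a raw input in the cloze template expected by the (M)LM paradigm."""
    return TEMPLATE.format(text=text)

def verbalize(mask_word_probs):
    """Sum the probabilities of each label's words and return the best label."""
    scores = {label: sum(mask_word_probs.get(w, 0.0) for w in words)
              for label, words in VERBALIZER.items()}
    return max(scores, key=scores.get)

prompt = build_prompt("The plot was thin but the acting saved the film.")
fake_probs = {"great": 0.31, "good": 0.22, "terrible": 0.08, "bad": 0.11}
print(prompt)
print(verbalize(fake_probs))   # -> "positive"

Parameter-efficient prompt tuning, also covered in this part, would keep the PTM frozen and learn continuous prompt embeddings in place of a hand-written template such as the one above.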


Section 5 concludes the paper. Recently, prompt-based tuning, which reformulates NLP tasks as (M)LM tasks, has exploded in popularity; prompt-based methods can achieve considerable performance with much less training data. In contrast, the other potential unified paradigms, i.e., Matching, MRC, and Seq2Seq, are under-explored in the context of pre-training. One of the main reasons is that these paradigms require large-scale annotated data for pre-training, and Seq2Seq in particular is notorious for being data-hungry. Nevertheless, these paradigms have their advantages over (M)LM: Matching requires less engineering, MRC is more interpretable, and Seq2Seq is more flexible for handling complicated tasks. Moreover, by combining them with self-supervised pre-training, or by further pre-training on annotated data with an existing language model as initialization, these paradigms can achieve performance that is competitive with, or even better than, (M)LM. Therefore, the researchers argue that more attention should be paid to exploring more powerful entailment (Matching), MRC, or Seq2Seq models through pre-training or other alternative techniques.


See the article:

Paradigm Shift in Natural Language Processing

http://doi.org/10.1007/s11633-022-1331-6

Disclaimer: AAAS and EurekAlert! are not responsible for the accuracy of news releases posted to EurekAlert! by contributing institutions or for the use of any information through the EurekAlert system.