Article Highlight | 30-Sep-2024

𝒴-tuning: an efficient tuning paradigm for large-scale pre-trained models via label representation learning

Higher Education Press

Large-scale pre-trained models (PTMs) have been extensively utilized as backbone models for numerous downstream natural language processing tasks. Recently, various lightweight tuning paradigms have emerged that achieve performance comparable to full fine-tuning in a more parameter-efficient manner. Nonetheless, these methods still require the computation and storage of gradients, leading to high training costs.
To address these issues, a research team led by Xipeng Qiu from Fudan University published their latest findings on 15 August 2024 in Frontiers of Computer Science, co-published by Higher Education Press and Springer Nature.

The team proposed a new tuning paradigm, 𝒴-Tuning, for large-scale pre-trained language models. The paradigm learns dense representations for the labels 𝒴 defined in a given task and aligns them with the fixed feature representations generated by a frozen pre-trained model. By avoiding the computation of the text encoder's gradients during the training phase, 𝒴-Tuning is not only parameter-efficient but also training-efficient.

Illustration of 𝒴-Tuning and other tuning paradigms.
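For readers who want a concrete picture, the minimal sketch below illustrates the general idea of learning label representations against a frozen backbone. It makes several assumptions not stated in the release: a frozen Hugging Face encoder stands in for the pre-trained model, a single cross-attention layer serves as the label-feature alignment module, and a dot-product matching score produces the logits. The class name and hyperparameters are illustrative, not the authors' exact architecture.

```python
# Sketch of the label-representation idea behind Y-Tuning (assumptions: a frozen
# Hugging Face encoder, cross-attention as the alignment module, and a dot-product
# matching score; the paper's exact architecture may differ).
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class YTuningSketch(nn.Module):
    def __init__(self, encoder_name="bert-base-uncased", num_labels=2, dim=768):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        for p in self.encoder.parameters():
            p.requires_grad = False  # freeze the text encoder: no backbone gradients
        self.label_emb = nn.Embedding(num_labels, dim)  # dense label representations (trainable)
        self.align = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)

    def forward(self, input_ids, attention_mask):
        with torch.no_grad():  # fixed features from the frozen pre-trained model
            feats = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        label_reps = self.label_emb.weight.unsqueeze(0).expand(feats.size(0), -1, -1)
        # label representations attend to the fixed text features
        aligned, _ = self.align(label_reps, feats, feats,
                                key_padding_mask=~attention_mask.bool())
        # matching score between each label and its aligned representation
        logits = (aligned * label_reps).sum(-1)
        return logits  # shape: [batch, num_labels]

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = YTuningSketch()
batch = tok(["a great movie", "a dull movie"], return_tensors="pt", padding=True)
logits = model(batch["input_ids"], batch["attention_mask"])
loss = nn.CrossEntropyLoss()(logits, torch.tensor([1, 0]))
loss.backward()  # gradients flow only through the small label and alignment modules
```

Because the backbone's features are computed without gradients, only the small label-embedding and alignment modules are updated, which is the source of the training-efficiency claim.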
𝒴-Tuning achieves performance close to that of full fine-tuning while significantly increasing training speed. Experimental results demonstrate that for DeBERTa-XXL, which has 1.6 billion parameters, 𝒴-Tuning reaches over 96% of full fine-tuning performance on the GLUE benchmark with only 2% tunable parameters, while substantially reducing training costs. Furthermore, 𝒴-Tuning exhibits better model robustness than the baselines, as the label-matching mechanism it employs is less sensitive to feature perturbations.

Future research could explore enhancing tuning performance by incorporating label prior information and utilizing reparameterization techniques to accelerate inference. This would enable a more comprehensive understanding and application of the 𝒴-Tuning paradigm in various natural language processing tasks.
DOI: 10.1007/s11704-023-3131-8
