Image: Post-LLM roadmap. Credit: Fei Wu et al.
A recent paper published in the journal Engineering delves into the future of artificial intelligence (AI) beyond large language models (LLMs). LLMs have made remarkable progress in multimodal tasks, yet they face limitations such as outdated information, hallucinations, inefficiency, and a lack of interpretability. To address these issues, the researchers explore three key directions: knowledge empowerment, model collaboration, and model co-evolution.
Knowledge empowerment aims to integrate external knowledge into LLMs. This can be achieved through various methods, including integrating knowledge into training objectives, instruction tuning, retrieval-augmented inference, and knowledge prompting. For example, some studies design knowledge-aware loss functions during pre-training, while others use retrieval-augmented generation to dynamically fetch relevant knowledge during inference. These techniques enhance LLMs’ factual accuracy, reasoning capabilities, and interpretability.
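To make the retrieval-augmented idea concrete, here is a minimal sketch (not code from the paper; the trigram-hash `embed` is a toy stand-in for a learned text encoder, and `augmented_prompt` marks where a real LLM call would consume the assembled context):

```python
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy embedding: hash character trigrams into a fixed-size vector.
    A real system would use a learned text encoder instead."""
    vec = np.zeros(dim)
    for i in range(len(text) - 2):
        vec[hash(text[i:i + 3]) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k passages whose embeddings are closest to the query."""
    q = embed(query)
    scores = [q @ embed(passage) for passage in corpus]
    top = np.argsort(scores)[::-1][:k]
    return [corpus[i] for i in top]

def augmented_prompt(query: str, corpus: list[str]) -> str:
    """Prepend retrieved knowledge so the model answers from fresh context
    rather than from potentially outdated training data."""
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

corpus = [
    "Solar output peaks near local noon under clear skies.",
    "Wind farms in coastal regions see evening ramp-ups.",
    "LLMs are trained on data with a fixed cutoff date.",
]
print(augmented_prompt("When does solar output peak?", corpus))
```

The key design point is that the knowledge store can be updated at any time without retraining the model, which is what addresses the outdated-information limitation noted above.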
Model collaboration focuses on leveraging the complementary strengths of different models. It includes strategies such as model merging and collaboration among models with different functions. Model merging, which spans model ensembling and model fusion (e.g., mixture of experts), combines multiple models to improve performance. In functional model collaboration, LLMs can act as task managers that coordinate specialized small models. For instance, in image-generation tasks, an LLM can guide specialized models to better satisfy the requirements of a prompt.
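The mixture-of-experts mechanism can be illustrated with a small numerical sketch (illustrative only, not an architecture from the paper; the linear "experts" and gate weights are randomly initialized toys):

```python
import numpy as np

rng = np.random.default_rng(0)

# Three "experts": small linear models with different weights.
experts = [rng.normal(size=(4, 2)) for _ in range(3)]

# A gating network scores each expert for a given input.
gate_w = rng.normal(size=(4, 3))

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route an input through all experts and blend their outputs
    with softmax gate weights."""
    logits = x @ gate_w                    # one score per expert
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()               # softmax over experts
    outputs = np.stack([x @ w for w in experts])  # shape (3, 2)
    return weights @ outputs               # gate-weighted combination

x = rng.normal(size=4)
print(moe_forward(x))
```

In practice the gate is trained jointly with the experts, so different experts come to specialize in different regions of the input space, which is how the complementary strengths mentioned above are exploited.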
Model co-evolution enables multiple models to evolve together. Various techniques have been proposed for different types of heterogeneity: model, task, and data. For model heterogeneity, methods such as parameter sharing, dual knowledge distillation, and hypernetwork-based parameter projection are used. For task heterogeneity, dual learning, adversarial learning, and model merging play important roles. For data heterogeneity, federated learning and out-of-distribution knowledge distillation are key techniques. These methods enhance models' adaptability and their ability to handle diverse tasks.
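For the data-heterogeneity case, the sketch below shows the federated-averaging idea in its simplest form (a generic FedAvg toy, not the paper's specific method; the linear-regression task, data shifts, and learning rate are invented for the demo). Each client trains locally on its own differently distributed data, and only model parameters, never raw data, are sent to the server for averaging:

```python
import numpy as np

rng = np.random.default_rng(1)

def local_train(w, X, y, lr=0.1, steps=5):
    """A few gradient steps of linear regression on one client's private data."""
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

# Two clients whose inputs come from shifted (heterogeneous) distributions.
true_w = np.array([1.0, -2.0, 0.5])
clients = []
for shift in (0.0, 1.0):
    X = rng.normal(loc=shift, size=(32, 3))
    y = X @ true_w + rng.normal(scale=0.1, size=32)
    clients.append((X, y))

w_global = np.zeros(3)
for _ in range(20):
    # Each client refines the global model locally; raw data never leaves.
    local_ws = [local_train(w_global.copy(), X, y) for X, y in clients]
    # Server aggregates by simple parameter averaging (FedAvg).
    w_global = np.mean(local_ws, axis=0)

print(w_global)  # approaches [1.0, -2.0, 0.5]
```

The averaging step is what lets models trained on disjoint, non-identically distributed data co-evolve toward a shared model without centralizing the data itself.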
The post-LLM advancements have far-reaching impacts. In science, they help in hypothesis development by incorporating domain-specific knowledge. For example, in meteorology, AI models integrated with domain knowledge can improve renewable energy forecasting. In engineering, they assist in problem formulation and solving. In society, they can be applied in areas like healthcare and traffic management.
Looking ahead, the paper also points out several future research directions, including embodied AI, brain-like AI, non-transformer foundation models, and LLM-involved model generation. These areas hold great potential for further advancing AI capabilities. As AI continues to evolve, the integration of knowledge, collaboration, and co-evolution will be crucial in building more robust, efficient, and intelligent AI systems.
The paper, “Knowledge-Empowered, Collaborative, and Co-Evolving AI Models: The Post-LLM Roadmap,” was authored by Fei Wu, Tao Shen, Thomas Bäck, Jingyuan Chen, Gang Huang, Yaochu Jin, Kun Kuang, Mengze Li, Cewu Lu, Jiaxu Miao, Yongwei Wang, Ying Wei, Fan Wu, Junchi Yan, Hongxia Yang, Yi Yang, Shengyu Zhang, Zhou Zhao, Yueting Zhuang, and Yunhe Pan. The full text of the open-access paper is available at https://doi.org/10.1016/j.eng.2024.12.008. For more information about Engineering, follow us on X (https://twitter.com/EngineeringJrnl) and like us on Facebook (https://www.facebook.com/EngineeringJrnl).
Journal
Engineering
Article Title
Knowledge-Empowered, Collaborative, and Co-Evolving AI Models: The Post-LLM Roadmap
Article Publication Date
19-Dec-2024