image: Several prominent foundation models are employed to enhance our understanding of high-throughput biological data, followed by a discussion on the application of prediction and generation models across various downstream tasks in the field of bioinformatics.
Credit: ©Science China Press
This study is led by Prof. Wang (School of Computer Science and Engineering, Central South University). The research team has identified recent advancements in bioinformatics foundation models (FMs) that are applied across a range of downstream tasks, including genomics, transcriptomics, proteomics, drug discovery, and single-cell analysis. Their objective is to assist scientists in selecting suitable FMs for bioinformatics based on four model categories: language FMs, vision FMs, graph FMs, and multimodal FMs. Beyond enhancing our understanding of molecular landscapes, AI technology can provide both theoretical and practical foundations for ongoing innovation in the field of molecular biology.
Lab Director Jianxin Wang provided an analysis of bioinformatics FMs that can be trained using both supervised and unsupervised learning techniques for applications addressing fundamental biological challenges as well as integrated biological issues. They highlighted recent advancements in bioinformatics foundation models, emphasizing their versatility as essential tools in the field.
The team provided a comprehensive summary of several prominent foundation models utilized to enhance the understanding of high-throughput biological data. This was followed by an in-depth discussion on the application of prediction and generation models across various downstream tasks within bioinformatics. Their discourse emphasized key aspects such as biological databases, training strategies, hyperparameter configurations, and relevant biological applications.
By tracing the evolutionary process of bioinformatics feature mapping, they show how a revised model addresses the limitations and shortcomings of the original model. “Taking advantage of the latest bioinformatics FM, one can achieve unprecedented accuracy, realize an integrated AI model, and perform richer downstream analysis,” Prof. Wang says.
“Taking the classic biological problem ‘protein three-dimensional structure reconstruction’ as a representative demonstration, DeepMind has developed three iterations of an artificial intelligence system over the past five years,” Prof. Wang says.
The researchers articulated their insights regarding the promising trajectory of bioinformatics FMs. They drew upon their experiences with model pre-training frameworks, selection of benchmarking methods, white-box approaches and interpretability, as well as the evaluation of model hallucinations.
See the article:
Foundation Models in Bioinformatics
https://doi.org/10.1093/nsr/nwaf028
Journal
National Science Review