The National Institute of Informatics (NII, Director-General: KUROHASHI Sadao, Chiyoda-ku, Tokyo, Japan) established the Research and Development Center for Large Language Models (LLMC, Director: NII Director-General KUROHASHI Sadao) on April 1, 2024. The LLMC conducts research and development of large language models (LLMs).
In May 2023, NII set up the LLM Research Group (LLM-jp), which brings together a wide range of participants from domestic research institutes, private enterprises, and other organizations. Since then, NII has continued to advance the research and development of open generative AI. The new center has now been established within NII to implement the "R&D Hub Aimed at Ensuring Transparency and Reliability of Generative AI Models" project of the Ministry of Education, Culture, Sports, Science and Technology. With this structure in place, NII is ready to help up-and-coming AI researchers concentrate on the research and development of LLMs.
LLMC is first developing an LLM with 175 billion parameters, a scale equivalent to GPT-3, with the goal of completing it around the summer of 2024. We are also promoting research on open LLMs that are proficient in Japanese, including advanced R&D to ensure the transparency and reliability of LLMs. Through these activities, we will accumulate knowledge and experience that will contribute to the evolution of AI and, ultimately, to the creation of revolutionary innovations for the future.
Large language models (LLMs) are increasingly used across industries. As foundation models, they have the potential to change existing industrial bases drastically, and they are also expected to serve as a knowledge base indispensable for a broad range of science and technology research. At present, however, the training corpora of the major LLMs are not disclosed to the public, and the models and their behavior remain a black box. Consequently, many problems remain, such as hallucination and bias. In addition, the training data of the major LLMs is predominantly English, so their ability to understand and generate Japanese content is relatively low. In Japan, few LLMs on the scale of 100 billion parameters have been developed, and this lack of research examples delays the accumulation of knowledge about LLM development.
The LLM Research Group (LLM-jp), led by NII, developed and released its first LLM, with 13 billion parameters, in October 2023, and the release has contributed to LLM development in Japan. NII has now established LLMC as the base for implementing the "R&D Hub Aimed at Ensuring Transparency and Reliability of Generative AI Models" project of the Ministry of Education, Culture, Sports, Science and Technology, setting up a system to promote R&D that ensures the transparency and reliability of generative AI models (Fig.1). Building on the knowledge of LLM development obtained through LLM-jp, we are creating a knowledge hub where researchers and engineers can cooperate, as well as an environment that nurtures R&D capabilities related to generative AI models.
LLMC conducts the following R&D activities.
1. Build LLMs for R&D
LLMC makes LLMs fully open to researchers and prepares the corpus data, computing environments, and evaluation benchmarks needed for R&D.
2. Ensure the transparency and reliability of LLMs
LLMC ensures the transparency and reliability of generative AI by elucidating generative AI behavioral principles and developing technologies to control the impact of data alteration, data bias, etc.
3. Make LLMs highly sophisticated
LLMC pursues R&D, such as domain adaptation and making models more lightweight, that aids the development of generative AI models.
We will continue to advance the R&D activities of LLM-jp. We plan to release an LLM on the scale of 175 billion parameters (equivalent to GPT-3) around this summer. By applying our research results to experimental LLMs, we seek to establish a method for creating transparent and reliable generative AI models. At the same time, we will accumulate knowledge and experience that contribute to the evolution of AI and to future innovation.
[Overview of the Center]
Name: Research and Development Center for Large Language Models
Director: KUROHASHI Sadao (Director-General, National Institute of Informatics / Program-Specific Professor, Kyoto University)
Vice Director: AIZAWA Akiko (Vice Director-General, National Institute of Informatics / Professor, Digital Content and Media Sciences Research Division)
TAKEDA Koichi (Project Professor, National Institute of Informatics)
Many researchers from universities, private companies, and other institutions will join our R&D activities through the LLM Research Group (LLM-jp). We will continue to strengthen our research organization. To learn more about us, please visit our website.
Link: https://www.nii.ac.jp/research/centers/llmc/
Comment from KUROHASHI Sadao, LLMC Director
“In May 2023, the National Institute of Informatics established the LLM Research Group (LLM-jp), which anyone who agrees with our philosophy can join. We have also disclosed all of our models, development data, tools, technical documents, and other materials, including the development processes, discussions, and even failures. Thanks to participation from various universities and companies, the number of participants in LLM-jp has exceeded 1,000. The recognition of our activities has led to the establishment of LLMC. We will prepare the necessary computational resources and strive to elucidate the principles of generative AI and establish methods for developing LLMs. We hope our new research center will become a place that talented and energetic young researchers can join, and the hub of LLM research and development in Japan. We also want to build a framework for international cooperation on research into open LLMs.”
###
About the National Institute of Informatics (NII)
NII is Japan's only academic research institute dedicated to the new discipline of informatics. Its mission is to "create future value" in informatics. NII conducts both long-term basic research and practical research aimed at solving social problems in a wide range of informatics research fields, from fundamental theories to the latest topics, such as artificial intelligence, big data, the Internet of Things, and information security.
As an inter-university research institute, NII builds and operates the academic information infrastructure essential for the research and educational activities of the entire academic community (including the Science Information Network), and develops services such as those that enable the provision of academic content and service platforms.
https://www.nii.ac.jp/en/
About the Research Organization of Information and Systems (ROIS)
ROIS is the parent organization of four national institutes (the National Institute of Polar Research, the National Institute of Informatics, the Institute of Statistical Mathematics, and the National Institute of Genetics) and the Joint Support-Center for Data Science Research. ROIS's mission is to promote integrated, cutting-edge research that goes beyond the barriers between these institutions, in addition to facilitating their research activities as inter-university research institutes.