image: Non-iterative Crystal Structure Prediction with ShotgunCSP
Credit: The Institute of Statistical Mathematics
Overview
A research team from the Institute of Statistical Mathematics and Panasonic Holdings Corporation has developed a machine learning algorithm, ShotgunCSP, that enables fast and accurate prediction of crystal structures from material compositions. The algorithm achieved world-leading performance in crystal structure prediction benchmarks.
Crystal structure prediction seeks to identify the stable or metastable crystal structures that a given chemical compound adopts under specific conditions. Traditionally, this process relies on iterative energy evaluations using time-consuming first-principles calculations, solving an energy minimization problem to find stable atomic configurations. This challenge has been central to materials science since the early 20th century. Recently, advances in computational technology and generative AI have enabled new approaches in this field. However, for large-scale or complex molecular systems, exhaustively exploring the vast phase space demands enormous computational resources, and the problem remains unresolved.
The team discovered that leveraging machine learning algorithms allows for highly accurate predictions of the symmetry patterns inherent in stable crystal structures. By employing these predictors to drastically reduce the search space, they eliminated the need for iterative first-principles calculations. This simplified approach demonstrated that even for large and complex systems, stable structures could be predicted with remarkably high accuracy and efficiency.
This groundbreaking achievement was published in npj Computational Materials on December 20, 2024.
Research Outcomes
Crystals are solids formed by atoms or molecules arranged periodically and are used in semiconductors, pharmaceuticals, batteries, and many other applications. The structure of a crystal has a significant impact on the material's properties. In the process of material development, the synthesis of materials requires considerable time and effort, making techniques for predicting crystal structures in advance extremely important. Predicting energetically stable or metastable crystal structures from chemical compositions has been a longstanding challenge in materials science. In principle, crystal structures can be determined by solving energy minimization problems within the atomic configuration space, with energy evaluations typically performed using first-principles calculations based on density functional theory.
Crystal structure prediction (CSP) is typically addressed by combining first-principles calculations with optimization algorithms. For example, genetic algorithms are often employed to iteratively modify atomic configurations, moving down the energy landscape in search of global or local minima. However, these conventional approaches must relax a large number of candidate structures through first-principles calculations at each step, resulting in exceptionally high computational costs. This limitation becomes particularly severe for large-scale systems containing 30–40 or more atoms per unit cell, where existing methods have significant difficulty resolving crystal structures accurately. Recent benchmark studies have revealed that current CSP algorithms can correctly predict fewer than 50% of all crystal systems 1, 2), highlighting significant limitations in their performance.
The research team focused on developing a non-iterative CSP algorithm that eliminates the need for repeated first-principles calculations (Figure 1). First, they used machine learning to construct an energy predictor that approximates the energies obtained from first-principles calculations. By applying transfer learning, they found that a highly accurate energy predictor could be built from only a small amount of training data. Next, they used a newly developed crystal structure generator to create promising virtual crystal structures. The energy predictor was then used to narrow these down to the candidates most likely to lead to stable structures. Finally, they relaxed the selected candidates with first-principles calculations and took the relaxed structure with the lowest energy as the predicted stable structure. The algorithm was named ShotgunCSP, after the image of a shotgun blast that spreads across a wide area, with only the hits then being analyzed carefully.
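The generate, screen, and relax sequence described above can be illustrated with a toy sketch. Nothing here comes from the ShotgunCSP code: a "structure" is reduced to a single number, and `true_energy`, `surrogate_energy`, `relax`, and `shotgun_search` are hypothetical stand-ins for the first-principles energy, the machine-learned predictor, the first-principles relaxation, and the overall workflow.

```python
import math
import random

def true_energy(x):
    # Hypothetical stand-in for an expensive first-principles energy
    # (a tilted double well with its global minimum near x = -1.04).
    return (x * x - 1.0) ** 2 + 0.3 * x

def surrogate_energy(x):
    # Hypothetical stand-in for the machine-learned energy predictor:
    # cheap to evaluate, but deliberately imperfect.
    return true_energy(x) + 0.1 * math.sin(5.0 * x)

def relax(x, steps=200, lr=0.01):
    # Hypothetical stand-in for a first-principles relaxation:
    # plain gradient descent on the true energy.
    for _ in range(steps):
        grad = 4.0 * x * (x * x - 1.0) + 0.3
        x -= lr * grad
    return x

def shotgun_search(n_candidates=1000, n_keep=10, seed=0):
    rng = random.Random(seed)
    # 1) Generate many virtual candidates in a single shot (no iteration).
    candidates = [rng.uniform(-2.0, 2.0) for _ in range(n_candidates)]
    # 2) Screen all of them with the cheap surrogate and keep a shortlist.
    shortlist = sorted(candidates, key=surrogate_energy)[:n_keep]
    # 3) Spend the expensive relaxation only on the shortlist.
    relaxed = [relax(x) for x in shortlist]
    # 4) Report the relaxed candidate with the lowest true energy.
    best = min(relaxed, key=true_energy)
    return best, true_energy(best)
```

Calling `shotgun_search()` evaluates the surrogate 1,000 times but runs the expensive relaxation only 10 times, which is the cost structure the non-iterative approach relies on.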
A key component of ShotgunCSP is the crystal structure generator. Because the structural space of large-scale systems is vast, efficiently narrowing the search space is crucial. The team discovered that machine learning could predict the symmetry of the stable structure (its space group and Wyckoff positions) for any given composition with exceptionally high accuracy. This breakthrough enabled an efficient reduction of the search space, significantly lowering computational costs while maintaining high-precision predictions.
Space groups are mathematical frameworks that characterize the symmetry of crystals. Each one is a set of geometric operations (such as translation, rotation, inversion, and reflection) that map the atomic arrangement in a crystal lattice onto itself. All crystals are classified into 230 distinct space groups. The research team demonstrated that a model trained on a crystal structure database could narrow the possible space groups of the stable structure to roughly the top 30 candidates, a shortlist that almost always contains the true space group for a given composition.
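The idea that a symmetry operation leaves the atomic arrangement unchanged can be checked directly. The sketch below is illustrative only: it assumes atoms are given in fractional coordinates, represents an operation as an integer rotation matrix plus a translation, and uses hypothetical helper names (`apply_op`, `is_symmetry`) and a made-up two-atom cell.

```python
from fractions import Fraction as F

def apply_op(rotation, translation, point):
    # x' = R x + t in fractional coordinates, wrapped back into [0, 1)
    # because positions that differ by a lattice translation are equivalent.
    return tuple(
        F(sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]) % 1
        for i in range(3)
    )

def is_symmetry(rotation, translation, atoms):
    # An operation is a symmetry if it maps the set of
    # (element, position) pairs exactly onto itself.
    mapped = {(el, apply_op(rotation, translation, p)) for el, p in atoms}
    return mapped == set(atoms)

# Made-up cell: two atoms of the same (hypothetical) element "X".
atoms = {
    ("X", (F(1, 10), F(2, 10), F(3, 10))),
    ("X", (F(9, 10), F(8, 10), F(7, 10))),
}

inversion = [[-1, 0, 0], [0, -1, 0], [0, 0, -1]]    # (x, y, z) -> (-x, -y, -z)
twofold_z = [[-1, 0, 0], [0, -1, 0], [0, 0, 1]]     # (x, y, z) -> (-x, -y, z)
no_shift = (0, 0, 0)

has_inversion = is_symmetry(inversion, no_shift, atoms)   # True
has_twofold = is_symmetry(twofold_z, no_shift, atoms)     # False
```

Exact fractions are used instead of floats so that the set comparison is not affected by rounding. A space group is, in effect, the full list of operations for which such a check succeeds.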
Wyckoff positions describe the degrees of freedom that the symmetry operations of a specific space group leave open for atomic configurations. Each atom is assigned a Wyckoff label, and atoms displaced according to the rules of that label preserve the original symmetry. The team showed that machine learning could efficiently narrow down the assignment of Wyckoff labels to each atom in any given composition.
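To see why fixing the symmetry shrinks the search space, consider that once a space group is chosen, the atoms of each element must be distributed over Wyckoff positions whose multiplicities add up to that element's count in the unit cell. The sketch below enumerates such assignments by brute force; it is not the team's machine-learned predictor. The table `WYCKOFF` is hypothetical (it does not reproduce any real space group), and `letter_combos` and `assignments` are illustrative names.

```python
from itertools import product

# Hypothetical table: letter -> (multiplicity, has_free_parameter).
# A position without a free parameter is a fixed site, usable only once.
WYCKOFF = {
    "a": (1, False),
    "b": (1, False),
    "c": (3, False),
    "d": (3, False),
    "e": (6, True),
}

def letter_combos(n_atoms, letters, table):
    """All multisets of letters whose multiplicities sum to n_atoms."""
    if n_atoms == 0:
        return [()]
    if not letters:
        return []
    head, rest = letters[0], letters[1:]
    mult, free = table[head]
    max_uses = n_atoms // mult if free else min(1, n_atoms // mult)
    combos = []
    for uses in range(max_uses + 1):
        for tail in letter_combos(n_atoms - uses * mult, rest, table):
            combos.append((head,) * uses + tail)
    return combos

def assignments(composition, table=WYCKOFF):
    """Enumerate element -> Wyckoff-letter assignments for a unit cell."""
    letters = sorted(table)
    elements = sorted(composition)
    per_element = [letter_combos(composition[el], letters, table) for el in elements]
    results = []
    for choice in product(*per_element):
        fixed = [l for combo in choice for l in combo if not table[l][1]]
        # Two elements cannot occupy the same fixed site.
        if len(fixed) == len(set(fixed)):
            results.append(dict(zip(elements, choice)))
    return results

# A five-atom cell with atom counts 1, 1, and 3 for elements A, B, and C.
options = assignments({"A": 1, "B": 1, "C": 3})
```

For this toy table the continuous problem of placing five atoms collapses to four discrete label assignments, each with few or no free coordinates left to optimize. A predictor that ranks such assignments reduces the candidates further still.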
By utilizing these symmetry predictors, the search space for crystal systems can be dramatically reduced, leading to a significant improvement in the accuracy of CSP. According to large-scale performance evaluations conducted in this study, ShotgunCSP is capable of accurately predicting approximately 80% of all crystal systems. Its performance far exceeds that of the elemental-substitution-based CSP algorithm CSPML 2), which was previously developed by the team and held the top rank in recent benchmarks 1).
Future Outlook
CSP algorithms are foundational technologies that accelerate the development of new materials and scientific discoveries. Identifying the stable structures of materials can drive significant advances in the exploration of high-temperature superconductors, battery materials, catalysts, thermoelectric materials, pharmaceutical molecules, and even material structures under extreme conditions such as high temperature and pressure. The research team significantly improved the prediction performance of CSP algorithms by discovering a novel approach, distinct from traditional methods, in which machine learning is used to narrow down the crystal symmetry of stable phases. Additionally, ShotgunCSP has a simple algorithmic design that is well suited to parallel computing, and further performance improvements are expected as the computations are scaled up.
Acknowledgments
This work was partially supported by the Japan Society for the Promotion of Science (JSPS: 19H05820, 19H01132, 23K16955) and the Japan Science and Technology Agency (JST: JPMJCR19I3, JPMJCR22O3, JPMJCR2332).
References
1) Wei et al., CSPBench: a benchmark and critical evaluation of crystal structure prediction. arXiv preprint arXiv:2407.00733 (2024). DOI: 10.48550/arXiv.2407.00733
2) Kusaba et al., Crystal structure prediction with machine learning-based element substitution. Computational Materials Science 211, 111496 (2022). DOI: 10.1016/j.commatsci.2022.111496
###
About The Institute of Statistical Mathematics (ISM)
The Institute of Statistical Mathematics (ISM) is part of Japan's Research Organization of Information and Systems (ROIS). With more than 80 years of history, the institute is an internationally renowned facility for research on statistical mathematics, including the comprehensive evaluation of earthquake data in Japan and other parts of the world. ISM comprises three departments (the Department of Statistical Modeling, the Department of Statistical Data, and the Department of Statistical Inference and Mathematics) as well as several key data and research centers. Through the efforts of its research departments and centers, ISM aims to continuously facilitate cutting-edge research collaboration with universities, research institutions, and industries both in Japan and in other countries.
About the Research Organization of Information and Systems (ROIS)
ROIS is the parent organization of four national institutes (the National Institute of Polar Research, the National Institute of Informatics, the Institute of Statistical Mathematics, and the National Institute of Genetics) and the Joint Support-Center for Data Science Research. ROIS's mission is to promote integrated, cutting-edge research that goes beyond the barriers between these institutions, in addition to facilitating their research activities as inter-university research institutes.
Journal
npj Computational Materials
Article Title
Shotgun crystal structure prediction using machine-learned formation energies
Article Publication Date
20-Dec-2024