News Release

Researchers Achieve One Teraflop Performance With Supercomputer Simulation Of Magnetism

Peer-Reviewed Publication

DOE/Lawrence Berkeley National Laboratory

BERKELEY, CA -- A team of scientists from two national laboratories reached a supercomputing milestone this weekend, getting their simulation of metallic magnetism to run at 1.002 Teraflops -- more than one trillion calculations per second.

The achievement, reached using a 1,480-processor Cray T3E supercomputer at the manufacturer's facility in Minnesota, caps an already remarkable scaling up of the code to run on increasingly powerful massively parallel supercomputers. Over the summer, the team of scientists at Oak Ridge National Laboratory working with the National Energy Research Scientific Computing Center (NERSC) at the Lawrence Berkeley National Laboratory performed a 1,024-atom first-principles simulation of metallic magnetism in iron which ran at 657 Gigaflops (billions of calculations per second) on a 1,024-processor Cray/SGI T3E supercomputer. This success made the team, which also includes collaborators at the Pittsburgh Supercomputing Center and the University of Bristol (UK), finalists for the Gordon Bell Prize, awarded annually to honor the best achievement in high-performance computing.

Funded as one of the U.S. Department of Energy's Grand Challenges, the group developed the computer code to provide a better microscopic understanding of metallic magnetism, which has applications in fields ranging from computer data storage to power generation and utilization.

Presented this year at SC98, the annual conference of high-performance computing and networking, the Gordon Bell Prize recognizes the best accomplishment in high-performance computing. The Oak Ridge-NERSC group was nominated in the category for highest computer speed using a real-world application. The winner of this year's prize will be announced during the conference on Thursday, Nov. 12, in Orlando, Fla.

Although parallel supercomputers are the world's fastest computers -- capable of performing hundreds of billions of calculations per second -- realizing their potential often requires writing complex computer codes as well as reformulating the scientific approach to problems so that the codes scale up efficiently on these types of machines.

In developing this code for parallel computers, the researchers were forced to rethink their formulation of the basic physical phenomena. The code was originally developed with Intel Paragon machines at ORNL's Center for Computational Science (CCS) in mind and has exhibited linear scale-up to 1,024 processors on an Intel XPS-150.

"One of the goals of this project is to address critical materials problems on the microstructural scale to better understand the properties of real materials. A major focus of our research is to establish the relationship between technical magnetic properties and microstructure based on fundamental physical principles," said Malcolm Stocks, a scientist in Oak Ridge's Metals and Ceramics Division and leader of the project. "The capability to design magnetic materials with specific and well-defined properties is an essential component of the nation's technological future."

In May and June of this year, the research team ran successively larger calculations on a series of bigger and more powerful Cray supercomputers. After the simulation code attained a speed of 276 Gflops on the Cray T3E-900 512-processor supercomputer at NERSC, the group arranged for use of an even faster T3E-1200 at Cray Research Inc. and achieved 329 Gflops. They were then given dedicated time on a T3E-600 1,024-processor machine at the NASA Goddard Space Flight Center, which allowed them to perform crucial code development work and testing before the final run at 657 Gflops on a T3E-1200 1,024-processor machine at a U.S. government site.
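To put these figures side by side, the sustained rate per processor can be worked out from the totals quoted in this release. The short Python sketch below is illustrative only: the function and variable names are made up for this example, and only runs whose processor counts are stated in the release are included (the 329 Gflops run is left out because its processor count is not given).

```python
# Totals quoted in the release: (description, sustained Gflops, processors).
# 1.002 Teraflops is written here as 1002 Gflops.
runs = [
    ("T3E-900, 512 processors", 276.0, 512),
    ("T3E-1200, 1,024 processors", 657.0, 1024),
    ("T3E, 1,480 processors", 1002.0, 1480),
]


def gflops_per_processor(total_gflops, processors):
    """Average sustained rate per processor for one run."""
    return total_gflops / processors


# Hypothetical helper table: description -> Gflops per processor.
per_processor = {
    label: gflops_per_processor(total, count) for label, total, count in runs
}

for label, rate in per_processor.items():
    print(f"{label}: {rate:.3f} Gflops per processor")
```

The runs used different T3E models, so these per-processor rates reflect differences in hardware as well as in the code and should not be read as a scaling-efficiency measurement.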

"These increases in the performance levels demonstrate both the power and the capabilities of parallel computers -- a code can be scaled up so that it not only runs faster but allows us to study larger systems and new phenomena that cannot be studied on smaller machines," said Andrew Canning, a physicist in NERSC's Scientific Computing Group who worked with the Oak Ridge team on this project.

The Gordon Bell Award work was part of a larger Department of Energy Grand Challenge Project on Materials, Methods, Microstructure and Magnetism between ORNL, Ames Laboratory (Iowa), Brookhaven National Laboratory, NERSC and the Center for Computational Science and the Computer Science and Mathematics Divisions at ORNL.

"As the Department of Energy's national facility for computational science, we see this achievement by the Grand Challenge team as a major breakthrough in high-performance computing," said NERSC Division Director Horst Simon. "Unlike other recently published records, this is a real application running on an operational production machine and delivering real scientific results. NERSC is proud to have been a partner in this effort."

NERSC (www.nersc.gov) provides high-performance computing services to DOE's Energy Research programs at national laboratories, universities, and industry. Berkeley Lab (www.lbl.gov) conducts unclassified research and is managed by the University of California.

SCIENTIFIC BACKGROUND

Developing a microscopic understanding of metallic magnets has proven to be an abiding scientific challenge. The difficulty originates in the itinerant nature of the electrons that give rise to the magnetic moment, which are the same electrons that give rise to metallic cohesion (bonding). It is this dual behavior of the electrons that precludes the use of simple (Heisenberg) models.

The performance runs were carried out during the development of a new theory of non-equilibrium states in magnets. The new constrained local moment (CLM) theory places a recent proposal for first-principles Spin Dynamics (SD) from a group at Ames Laboratory on firm theoretical foundations. In SD, non-equilibrium 'local moments' (for example, in magnets above the Curie temperature, or in the presence of an external field) evolve from one time step to the next according to a classical equation of motion. As originally formulated, SD had fundamental problems. These stem from the fact that the instantaneous magnetization states being evolved were not properly defined within the Local Spin Density Approximation to Density Functional Theory (LSDA), the framework of most modern quantum simulations of materials. (Interestingly, this year's Nobel Prize in Chemistry was awarded to Professor Walter Kohn for originating Density Functional Theory.)

The CLM theory properly formulates SD within constrained density functional theory. Local constraining fields are introduced, the purpose of which is to force the local moments to point in the directions required at a particular time step of SD. A general algorithm for finding the constraining fields has been developed. The existence of CLM states has been demonstrated by performing calculations for large (up to 1,024-atom) unit cell disordered local moment models of iron above its Curie temperature. In this model the magnetic moments associated with individual Fe atoms are constrained to point in a set of orientations that are chosen using a random number generator. This state can be thought of as being prototypical of the state of magnetic order at a particular step in a finite temperature SD simulation of paramagnetic Fe. These calculations represent significant progress towards the goal of full implementation of SD and a first-principles theory of the finite temperature and non-equilibrium properties of magnetic materials.
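The random choice of moment orientations in the disordered local moment model can be pictured with a small Python sketch. This is not the team's code. It assumes the orientations are drawn uniformly over the unit sphere, which the release does not specify, and the function names are hypothetical.

```python
import math
import random


def random_moment_directions(n_atoms, seed=0):
    """Return n_atoms unit vectors drawn uniformly on the sphere.

    Assumption: a uniform distribution of directions. The release only
    says the orientations are chosen with a random number generator.
    """
    rng = random.Random(seed)
    directions = []
    for _ in range(n_atoms):
        # Uniform z and uniform azimuth give a uniform point on the sphere.
        z = rng.uniform(-1.0, 1.0)
        phi = rng.uniform(0.0, 2.0 * math.pi)
        r = math.sqrt(max(0.0, 1.0 - z * z))
        directions.append((r * math.cos(phi), r * math.sin(phi), z))
    return directions


def net_moment(directions):
    """Average of the unit vectors: near zero for a disordered state."""
    n = len(directions)
    return tuple(sum(d[i] for d in directions) / n for i in range(3))


# One direction per Fe atom in a 1,024-atom cell, as in the largest run.
moments = random_moment_directions(1024)
average = net_moment(moments)
print("atoms:", len(moments))
print("average moment direction per atom:", average)
```

With many atoms the average direction comes out close to zero, which matches the picture of a paramagnetic state with no net magnetization. In the CLM approach described above, constraining fields would then be found that hold each moment along its assigned direction; that step is not shown here.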

The work was performed by: Balazs Ujfalussy, Xindong Wang, Xiaoguang Zhang, Donald M. C. Nicholson, William A. Shelton and G. Malcolm Stocks, Oak Ridge National Laboratory; Andrew Canning, NERSC, Lawrence Berkeley National Laboratory; Yang Wang, Pittsburgh Supercomputing Center; and B. L. Gyorffy, H. H. Wills Physics Laboratory, UK.

###


