Feature Story | 12-Mar-2025

Summit Supercomputer Draws Molecular Blueprint for Repairing Damaged DNA

Researchers use one of the world’s most powerful supercomputers to simulate the inner workings of cellular machinery that restores DNA and prevents deadly diseases

DOE/Oak Ridge National Laboratory

Sunburns and aging skin are obvious effects of exposure to harmful UV rays, tobacco smoke and other carcinogens. But the effects aren’t just skin deep. Inside the body, DNA is literally being torn apart.

Understanding how the body heals and protects itself from DNA damage is vital for treating genetic disorders and life-threatening diseases such as cancer. But despite numerous studies and medical advances, much about the molecular mechanisms of DNA repair remains a mystery.

For the past several years, researchers at Georgia State University have tapped into the Summit supercomputer at the Department of Energy’s Oak Ridge National Laboratory to study an elaborate molecular pathway called nucleotide excision repair, or NER. NER relies on an array of highly dynamic protein complexes to cut out, or excise, damaged DNA with surgical precision.

In their latest study, published in Nature Communications, the team built a computer model of a critical NER component called the pre-incision complex, or PInC. PInC plays a key role in regulating DNA repair processes in the later stages of the NER pathway. Decoding NER’s sophisticated sequence of events and the role of PInC in the pathway could provide key insights into developing novel treatments and preventing conditions that lead to premature aging and certain types of cancer.

“We’re interested in the way cells repair their genetic material,” said lead investigator Ivaylo Ivanov, a chemistry professor at Georgia State University. “NER is a versatile pathway that repairs all kinds of different DNA damage using a three-stage process that relies on delicately balanced molecular machinery. Unfortunately, harmful mutations can develop that interfere with this machinery and cause severe human diseases.”

“Yet, the effects of genetic mutations can be strikingly different depending on their positions within the repair complexes. In some cases, mutations result in patients having UV light sensitivity and an extreme cancer predisposition. In other cases, they cause abnormal development and premature aging,” he said. “Why that happens is not completely understood at the molecular level. That’s the mystery our computer modeling efforts aim to unravel.”

The three acts of repair

NER unfolds in three distinct stages: recognition, verification and repair. Each stage requires different groups of proteins to perform specific functions, much as a trauma team relies on different specialists to treat injured patients in the emergency room. In that way, the NER machinery can adapt and change its shape depending on the task at hand.

In the first stage, the NER protein XPC (xeroderma pigmentosum group C) acts like a first responder that locates the site of the damaged DNA, or lesion, and then twists the DNA helix to make the damage accessible. XPC then calls in other repair proteins to help initiate the second stage, called damage verification, or lesion scanning.

Here, the NER protein machinery shifts into its next shape. As XPC steps back, the protein complex called transcription factor IIH, or TFIIH (pronounced T-F-2-H), moves into position. TFIIH further unwinds the section of DNA and scans the newly exposed strand for lesions.

After that, it’s in the hands of the surgeon — the PInC — in the third and final stage of repair.

With the “patient” stabilized and prepped for surgery, the operation to remove the damaged DNA strand can begin. Two enzymes, XPF and XPG (xeroderma pigmentosum groups F and G), position themselves precisely on each side of the lesion and act as molecular scissors to cut out the damaged segment of DNA.

Once the lesion is removed, new DNA is synthesized to fill in the gap left behind. Finally, the DNA backbone is sealed, and the damaged DNA is restored to health.

“What we want to know is how the PInC forms after the lesion scanning phase,” Ivanov said. “How does it control the positioning of the two enzyme subunits that perform the dual incision of the damaged DNA strand? And importantly, is there any cross talk between the two enzymes? Do they sense each other?”

“That matters because once the damaged DNA strand is cleaved, it’s vital that the repair process is completed by filling in that gap,” he added. “Otherwise, it will lead to cell death or to the introduction of double-stranded breaks, which are extremely harmful to the cell.”

Answering those questions required the researchers to solve the structure of the PInC. In biology, understanding protein structure is essential for understanding the behavior or function of protein assemblies. The shapes, sizes and interactions of proteins determine how they fit together to form large biomolecular assemblies.

“We integrated the structural model of PInC using data from a variety of biophysical techniques, notably cryo-electron microscopy,” Ivanov said. “But in the end, the computation is what puts everything together.”

Much like the pieces of a jigsaw puzzle, the PInC model had to be assembled from known structures of constituent proteins, and all the individual pieces had to be put together in 3D. However, many of the PInC components had no known experimental structures.

To overcome this challenge, the researchers used a neural network-based model called AlphaFold2 to predict the unknown structures and the interfaces between the proteins that hold PInC together.

Summit’s final simulations

“Computationally, once you assemble the PInC, molecular dynamics simulations of the complex become relatively straightforward, especially on large supercomputers like Summit,” Ivanov said.

Nanoscale Molecular Dynamics, or NAMD, is a molecular dynamics code designed specifically for supercomputers and used to simulate the movements and interactions of large biomolecular systems that contain millions of atoms. Using NAMD, the research team ran extensive simulations. The number-crunching power of the 200-petaflop Summit supercomputer, capable of performing 200,000 trillion calculations per second, was essential in unraveling the functional dynamics of the PInC on a timescale of microseconds.
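For readers curious about what a molecular dynamics code does at each step, here is a minimal Python sketch. It is not NAMD and not the team's setup: it is a toy system of two particles joined by a spring, advanced with the velocity Verlet scheme, a standard integrator in this field. All parameters are illustrative, and the 2-femtosecond timestep used in the final line is an assumption (a common choice in biomolecular simulation, not a figure from the study).

```python
# Toy 1D molecular dynamics: two particles joined by a harmonic spring,
# integrated with velocity Verlet. All values are illustrative assumptions.

def spring_forces(x, k=1.0, r0=1.0):
    """Forces on two particles connected by a harmonic bond of rest length r0."""
    stretch = (x[1] - x[0]) - r0
    f = k * stretch  # a stretched bond pulls the two particles together
    return [f, -f]

def potential_energy(x, k=1.0, r0=1.0):
    stretch = (x[1] - x[0]) - r0
    return 0.5 * k * stretch * stretch

def kinetic_energy(v, m):
    return sum(0.5 * mi * vi * vi for mi, vi in zip(m, v))

def velocity_verlet(x, v, m, dt, n_steps):
    """Advance positions x and velocities v by n_steps steps of size dt."""
    x, v = list(x), list(v)
    f = spring_forces(x)
    for _ in range(n_steps):
        # Half-step velocity update using the current forces
        v = [vi + 0.5 * dt * fi / mi for vi, fi, mi in zip(v, f, m)]
        # Full-step position update
        x = [xi + dt * vi for xi, vi in zip(x, v)]
        # Recompute forces at the new positions, then finish the velocity update
        f = spring_forces(x)
        v = [vi + 0.5 * dt * fi / mi for vi, fi, mi in zip(v, f, m)]
    return x, v

masses = [1.0, 1.0]
x0 = [0.0, 1.5]  # bond starts stretched by 0.5
v0 = [0.0, 0.0]

e_start = potential_energy(x0) + kinetic_energy(v0, masses)
x1, v1 = velocity_verlet(x0, v0, masses, dt=0.01, n_steps=10_000)
e_end = potential_energy(x1) + kinetic_energy(v1, masses)
print("energy drift:", abs(e_end - e_start))

# Steps needed to cover one microsecond, ASSUMING a 2-femtosecond timestep
steps_per_microsecond = round(1e-6 / 2e-15)
print("steps per microsecond:", steps_per_microsecond)
```

Under that assumed timestep, a single microsecond takes 500 million force evaluations, each covering millions of atoms in a system the size of the PInC, which gives a rough sense of why simulations of this kind call for a machine like Summit.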

“The simulations showed us a lot about the complex nature of the PInC machinery. It showed us how these different components move together as modules and the subdivision of this complex into dynamic communities, which form the moving parts of this machine,” Ivanov said.
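The idea of parts that "move together as modules" can be illustrated with a small Python sketch. This is a simplified stand-in, not the analysis used in the study: it takes made-up motion traces for four hypothetical residues, measures how strongly each pair moves in step using Pearson correlation, and groups residues that are linked by strong correlation. The toy data, the 0.6 threshold and the grouping rule are all assumptions chosen for illustration.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length motion traces."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def correlated_groups(series, threshold=0.6):
    """Group trace indices connected by pairwise correlation above threshold."""
    n = len(series)
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if pearson(series[i], series[j]) > threshold:
                neighbors[i].add(j)
                neighbors[j].add(i)
    # Collect connected components with a depth-first search
    groups, seen = [], set()
    for start in range(n):
        if start in seen:
            continue
        stack, group = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            group.append(node)
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        groups.append(sorted(group))
    return groups

# Hypothetical toy data: residues 0 and 1 follow one slow motion,
# residues 2 and 3 follow a different, faster one.
n_frames = 200
t = [2 * math.pi * i / n_frames for i in range(n_frames)]
toy_motions = [
    [math.sin(x) for x in t],
    [0.8 * math.sin(x) + 0.1 * math.sin(5 * x) for x in t],
    [math.sin(3 * x) for x in t],
    [1.2 * math.sin(3 * x) + 0.1 * math.sin(7 * x) for x in t],
]

groups = correlated_groups(toy_motions)
print(groups)
```

In this toy case the residues fall into two groups, each moving as a unit. Analyses of real simulations typically apply more sophisticated community-detection methods to networks with thousands of residues, but the underlying question is the same: which parts of the machine move together, and where the boundaries between them lie.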

The findings are significant in that mutations in XPF and XPG can lead to severe human genetic disorders. They include xeroderma pigmentosum, which is a condition that makes people more susceptible to skin cancer, and Cockayne syndrome, which can affect human growth and development, lead to impaired hearing and vision, and speed up the aging process.

“Simulations allow us to zero in on these important regions because mutations that interfere with the function of the NER complex often occur at community interfaces, which are the most dynamic regions of the machine,” Ivanov said. “Now we have a much better understanding of how and from where these disorders manifest.”

Most of the molecular dynamics simulations were performed on Summit. However, after six years of production, Summit was retired in November 2024.

Looking ahead, Ivanov and his team plan to use Summit’s successor, Frontier, the exascale-class supercomputer that debuted as the world’s most powerful supercomputer when it came online in 2022.

Their work on Frontier will involve examining transcription-coupled NER, which is a DNA repair process that fixes damage in actively transcribed genes to ensure that essential proteins can continue being made.

In addition to Ivanov, the research team includes Jina Yu, Chunli Yan, Tanmoy Paul and Lucas Brewer at Georgia State University; Susan E. Tsutakawa and John A. Tainer at Lawrence Berkeley National Laboratory; Chi-Lin Tsai at the University of Texas MD Anderson Cancer Center; and Samir M. Hamdan at King Abdullah University of Science and Technology.

Related news and publications

New Insights Into a Shapeshifting Protein Complex

Jina Yu et al., “Dynamic conformational switching underlies TFIIH function in transcription and DNA repair and impacts genetic diseases,” Nature Communications 14 (2023), https://doi.org/10.1038/s41467-023-38416-6.

Frontier is managed and operated by ORNL’s Oak Ridge Leadership Computing Facility, a DOE Office of Science user facility. The OLCF also managed and operated Summit until its decommissioning in November 2024.

UT-Battelle manages ORNL for DOE’s Office of Science, the single largest supporter of basic research in the physical sciences in the United States. The Office of Science is working to address some of the most pressing challenges of our time. For more information, please visit https://energy.gov/science.
