News Release

Accelerating drug discovery with the CCDC, AWS, and Intel

Curated data set of protein structures from the Protein Data Bank with predicted hydrogen positions now available

Reports and Proceedings

CCDC - Cambridge Crystallographic Data Centre

Image: A protein structure.

Credit: The CCDC.

Drawing on the combined computing power of Amazon Web Services (AWS) and Intel, the CCDC has released a curated data set of protein structures from the Protein Data Bank (PDB) with predicted hydrogen positions, which it describes as a potentially significant advancement in drug discovery. The data set is now available for download. This project was supported by an Intel RISE Technology Initiative contribution.

Historically, collaborations with the pharmaceutical industry have enabled the development of reliable methods for interpreting interactions within protein binding sites, but these methods relied on proprietary information. Repeating these studies with PDB structures presented a challenge because the structures lack hydrogen positions in the water networks within the proteins. Reliable predictions require databases of augmented protein structures in which hydrogen positions are assigned.

Generating this information is computationally intensive because multiple possible models must be considered. The combined power of Intel and AWS allowed the CCDC to overcome this challenge. The CCDC generated a comprehensive snapshot of protein cavities in the PDB, identifying potential binding sites for small molecules, with accurately predicted hydrogen positions for all components.

Key Benefits

  • Accessibility: This data set is freely available, enabling widespread use in drug discovery research and development.
  • Efficiency: Precomputed hydrogen positions save researchers valuable time and resources by eliminating the need for redundant computations.
  • Environmental Impact: Reducing the necessity for repeated computations lowers the environmental footprint of large-scale computational tasks.

“We were delighted to partner with AWS and Intel on this project to provide another valuable structural science resource to enhance the drug discovery process in the pharmaceutical industry. The output from the project now being free to all further emphasizes our commitment to FAIR data and our consideration of the environmental impact of repeated computation.”

Dr Juergen Harter, CCDC CEO

“With the power of Intel and AWS, we’ve presented researchers with predictions of protonation states in important protein structures, potentially saving hundreds of thousands of hours of life sciences research time across the globe.”

Jason Cole, Senior Research Fellow, CCDC

Download the Protonated PDB Files

Researchers and developers in the field of drug discovery can download the protonated PDB files from the CCDC download page. This initiative democratizes access to critical data, empowering scientific advancement regardless of access to extensive computational resources.

