News Release

Insilico IMGAIA recap: how generative AI applications could make the world better through accelerated scientific research

Business Announcement

InSilico Medicine

Key Questions Addressed:

  • At Insilico, what are generative AI tools capable of, and where do the advantages lie?
  • How do the AI tools handle data, when it is constantly changing and potentially confidential?
  • How far are the updated applications away from market access or trial versions?
  • How does Insilico handle current generative AI problems, including hallucination and copyright concerns?

 

At the recent Insilico Medicine Generative AI Action (IMGAIA) webinar, the technology and strategic updates of Insilico Medicine (“Insilico”), a leading global clinical-stage generative artificial intelligence (AI)-driven drug discovery company, attracted more than 500 attendees from across the globe. Following the announcement of the Generative AI for Sustainability Consortium, key project leaders provided demos and insights into Biology42: PandaOmics Box hardware for confidential computing, Precious-3 GPT for virtual data generation and biomedical research, and Science42: DORA for drafting scientific documents.

For those who missed the virtual launch, the whole screencast is available here, and the Q&A highlights have been curated as follows. If you are interested in trial versions, please contact BD@insilicomedicine.com.

 

Section 1: At Insilico, what are generative AI tools capable of, and where do the advantages lie?

Q: Can Precious GPT analyze gut microbiome data?

A: Currently, microbiome data is out of scope. But we can add that data type in the future, maybe in Precious-4 GPT, or a later version of Precious-3 GPT. It's actually not that hard. We just wanted to start with the data types that are directly interlinked, like methylation, transcriptomics, and proteomics. We will be adding additional omics data types.

 

Q: If I paste a document with embedded citations from Zotero or Mendeley, do the citations automatically get picked up by DORA?

A: Currently, you can use a bibliography directly in DORA by using the "insert citation" option. You can upload your document manually, and an automated file upload feature is being developed. This will allow DORA to recognize citations from Word files. 

The platform also focuses on converting citations from different journals and will include new features with weekly updates. The inline editing option in DORA makes it easy to add references and text directly in the browser. For more details on upcoming features, you can visit dora.insilico.com and check the recent blog post.

 

Q: Is there public information regarding the backend and model architecture of Generative Biologics, as well as the validation experiments that were conducted?

A: A whitepaper will soon be published detailing the architecture of Generative Biologics, which is somewhat similar to Chemistry42, with multiple generative models optimizing the reward function to capture how the biologics bind to the targets and the interaction properties.

Unlike Chemistry42, Generative Biologics will initially be accessible only to platform partners for trials and validation experiments. Insilico has also been carrying out internal validations throughout the process. In small molecule chemistry, the generative systems have already shown success with a candidate drug in Phase II. In biologics, we decided to take it slower and ensure that the community validates the tool.

Interested partners with validation capabilities can apply for free access to collaborate. The tool will soon be available for sale, and the first two paying customers have been acquired, although their deals encompass more than just Generative Biologics.

 

Section 2: How do the AI tools handle data, when it is constantly changing and potentially confidential?

Q: How often will the references database be updated for DORA? 

A: For references, the process has been streamlined. Automated updates happen weekly. For some other databases that require more harmonization and compute, the cadence is generally monthly. Depending on the module you use inside DORA, we have connections to PandaOmics and Precious-3 GPT.

In short, updates arrive on a weekly or monthly cadence.

 

Section 3: How far are the updated applications away from market access or trial versions?

Q: What is the cost of the PandaOmics Box?

A: PandaOmics Box comes in various configurations and versions tailored for different uses, including hospital applications, local paper drafting, omics data analysis, and target discovery. The price therefore generally ranges from 50K to 100K, which is much higher than a typical desktop system.

It does not require Wi-Fi. It offers significant computational power, akin to a large industrial cloud-based platform, but is designed for on-premise use. It supports an open API for scripting and ensures data protection by conducting all computations locally on a secure on-chip system developed with Intel, making it ideal for users prioritizing data security and privacy.

 

Section 4: How does Insilico handle current generative AI problems, including hallucination and copyright concerns?

Q: How do we handle hallucinations with our large language models?

A: We use RAG (retrieval-augmented generation). That's the baseline.

To produce the text in any document, the model refers to the source materials. The agents, which use algorithms beyond RAG, read the papers relevant to the research subject. They also analyze knowledge graphs and the main data sources from PandaOmics.

But basically, we don't leave a lot of room for hallucinations there. The output is grounded in all of the references we provide inside DORA. The same goes for PandaOmics and Chemistry42. We try to provide as much structured, reliable information as possible for the context.
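For readers unfamiliar with RAG, the idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Insilico's implementation: the corpus entries, the bag-of-words cosine scoring, and the function names `retrieve` and `build_grounded_prompt` are all hypothetical. It shows the two steps the answer relies on: first fetch source passages relevant to the request, then hand the model only those passages as context.

```python
import math
import re
from collections import Counter

# Hypothetical, made-up source passages standing in for a reference database.
CORPUS = [
    {"id": "ref-1", "text": "Candidate target associated with fibrosis in lung tissue"},
    {"id": "ref-2", "text": "Methylation clocks estimate biological age"},
    {"id": "ref-3", "text": "Proteomics profiling of tumor samples"},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, corpus, k=2):
    """Return up to k passages that share vocabulary with the query, best first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(doc["text"]))), doc) for doc in corpus]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def build_grounded_prompt(query, passages):
    """Build a prompt restricted to retrieved sources; None if nothing was found."""
    if not passages:
        return None  # refuse to generate without supporting sources
    sources = "\n".join(f"[{p['id']}] {p['text']}" for p in passages)
    return (
        "Answer using only the sources below and cite them by id.\n"
        f"{sources}\nQuestion: {query}"
    )


passages = retrieve("fibrosis target", CORPUS, k=1)
prompt = build_grounded_prompt("fibrosis target", passages)
```

Returning `None` when no passage matches is one possible design choice: the generation step is skipped rather than left to answer from the model's own memory.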

 

Q: Some scientific journals check the text of submitted publications for the use of AI and generally consider such articles not entirely original. How is DORA responding to this trend?

A: With that, we've actually talked to several major publishing houses, and they allow for the use of generative AI in paper writing, as long as the research is original and it's properly referenced. To help with that, DORA automatically adds itself to materials and methods, and to all the disclosures. While top-level journals generally accept AI-generated content with proper disclosure, direct submission from DORA is not recommended.

DORA stands for Draft Outline Research Assistant. Its main goal is to generate drafts for scientific documents, while also providing you with a window to access many of the tools at Insilico, the AI-driven drug discovery company.

The template that we've created has a workflow with multiple agents performing many, many different tasks, so you can actually try to query it for a list of targets, or implicate specific proteins in diseases of interest. It's not just a paper writer.
 

Q: How was the validation done to make sure that it indeed interpreted the paper correctly and has not hallucinated?

A: DORA uses original references from the reference database to arrive at the reference you like. It will also give you a lot of original references as options, with different parameters such as the number of citations, the publication year, and the journal name.

You can also click on the individual links and go to the paper and see if it's real. And also go through the abstract and see if you want it to be included. But usually those are original references and they are not derived out of the large language model. They are picked and prioritized by the large language model for you to include.
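The distinction drawn here, between references a model picks from a database and references a model invents, can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the `Reference` fields, the placeholder records, and the function names are hypothetical and do not describe DORA's actual data model. The filter parameters mirror the ones mentioned above (number of citations, publication year, journal name).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    ref_id: str
    title: str
    journal: str
    year: int
    citations: int


# Placeholder records, invented for this example only.
DATABASE = {
    "ref-a": Reference("ref-a", "Placeholder paper A", "Journal A", 2021, 120),
    "ref-b": Reference("ref-b", "Placeholder paper B", "Journal B", 2018, 300),
    "ref-c": Reference("ref-c", "Placeholder paper C", "Journal A", 2023, 15),
}


def candidate_references(database, min_year=None, min_citations=0, journal=None):
    """Filter stored references by the user's parameters, most cited first."""
    hits = [
        ref
        for ref in database.values()
        if ref.citations >= min_citations
        and (min_year is None or ref.year >= min_year)
        and (journal is None or ref.journal == journal)
    ]
    return sorted(hits, key=lambda ref: ref.citations, reverse=True)


def accept_model_picks(picked_ids, database):
    """Keep only picks that exist in the database; unknown ids are dropped."""
    return [database[ref_id] for ref_id in picked_ids if ref_id in database]


recent = candidate_references(DATABASE, min_year=2020)
accepted = accept_model_picks(["ref-a", "made-up-id"], DATABASE)
```

Because the final list is built by lookup in the database, an identifier the model made up simply cannot appear in the output.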

In 2016, Insilico first described the concept of using generative AI for the design of novel molecules in a peer-reviewed journal, which laid the foundation for the commercially available Pharma.AI platform. Since then, Insilico has continued to integrate technical breakthroughs into the Pharma.AI platform, which is currently a generative AI-powered solution spanning biology, chemistry and clinical development. Powered by Pharma.AI, Insilico has nominated 18 preclinical candidates in its comprehensive portfolio of over 30 assets since 2021 and has received IND approval for 9 molecules.

 

About Insilico Medicine

Insilico Medicine, a global clinical stage biotechnology company powered by generative AI, is connecting biology, chemistry and clinical trials analysis using next-generation AI systems. The company has developed AI platforms that utilize deep generative models, reinforcement learning, transformers and other modern machine learning techniques for novel target discovery and the generation of novel molecular structures with desired properties. Insilico Medicine is developing breakthrough solutions to discover and develop innovative drugs for cancer, fibrosis, immunity, central nervous system diseases, infectious diseases, autoimmune diseases, and aging-related diseases. 

www.insilico.com


Disclaimer: AAAS and EurekAlert! are not responsible for the accuracy of news releases posted to EurekAlert! by contributing institutions or for the use of any information through the EurekAlert system.