AI in Biomedical Research

Seventy-five years ago, Alan Turing posed a simple but powerful question that changed the course of technology: Are machines capable of thought?

Since then, artificial intelligence has advanced at an extraordinary pace, and today, it’s opening the door to the “digital age” of biology.

From using machine learning to visualize the location and interactions of proteins within live cells to training a deep-learning model that can predict the impact of gene perturbations in cell types or genes, the application of AI to massive amounts of scientific data is ushering in a new level of insight into human health and disease.

Basic Science Goes Hand-in-Hand With AI 🤝

“Biologists are going to have very strong simulations enabled by virtual cell models — in a way that’s not possible today,” said Steve Quake, CZI’s head of science, during the opening remarks. His point emphasized how AI will fundamentally change and accelerate the way scientists do research in the coming years.

For example, the virtual cell models CZI is building will be able to predict the response of immune cells to different genetic mutations faster and in more robust combinations than current methods, without the need to collect costly and invasive physical samples from patients.

Marinka Zitnik, assistant professor of biomedical informatics at Harvard Medical School and associate faculty at the Kempner Institute for the Study of Natural and Artificial Intelligence, led a session that further highlighted AI’s role in transforming scientific research in the context of her day-to-day work.

One example is SHEPHERD, a deep learning approach built by Zitnik’s team that can provide individualized diagnoses of rare genetic diseases. Given the limited data on rare diseases, the model is pre-trained on known associations between variants, genes and phenotypes from patient-agnostic data.

Marinka Zitnik, assistant professor of biomedical informatics at Harvard Medical School and associate faculty at the Kempner Institute for the Study of Natural and Artificial Intelligence.

When evaluated across 12 sites throughout the United States, SHEPHERD was able to nominate disease-causing genes for 75% of patients from a cohort affiliated with the Undiagnosed Diseases Network.

A Multimodal Data Oasis 🌊

Over the last decade, scientists, academic research labs and philanthropic organizations like CZI have been collecting, aggregating and curating enormous amounts of detailed, high-resolution biological information about the trillions of cells within the human body. These datasets are sequence- or image-based — two complementary modalities that are fundamental to advancing biomedical research.

Manu Leonetti, director of systems biology at the Chan Zuckerberg Biohub San Francisco (CZ Biohub SF), and James Zou, associate professor of biomedical data science at Stanford University, led discussions about the opportunities in training AI on multimodal datasets.

“Imaging has the power of being able to give us extremely dense multimodal profiles of cells,” said Leonetti. “We can ask questions across scales while following cells in the context of their native environment, whether looking at cells in a dish, or tissues, or even at the scale of an entire organism.”

Manu Leonetti (middle), director of systems biology at CZ Biohub SF, and James Zou (right), associate professor of biomedical data science at Stanford University, in conversation with Ivana Jelic (left), senior program manager for Cell Science at CZI.

Zou also shared examples of how generative AI is transforming biomedicine, including a case study showing how models can help identify and synthesize molecules to guide the development of antibiotics.

A ‘General-Purpose Model’ To Power Basic Science 💪

Today, most of the field’s AI models are designed for specific research areas, whether identifying genetic mutations that can lead to rare diseases or finding new molecules that can overpower antibiotic-resistant pathogens.

But in the future, CZI’s goal is to build and train a “general-purpose model” or virtual cells that can transfer information across datasets and conditions, serve multiple queries concurrently, and unify data from different modalities.

Boris Power, head of applied research at OpenAI (left), and Priscilla Chan, co-founder and co-CEO of the Chan Zuckerberg Initiative.

Theofanis Karaletsos, CZI’s head of AI for science, provided attendees with a closer look at our vision for building a general-purpose model that can serve as a foundational resource for biomedical research.

By bridging the gap between these datasets and advances in AI, “we get to the heart of where we want to be as machine learners,” said Karaletsos. “We want to simulate a generative process such that in some coarse-grain level of causality, even if it doesn’t get things exactly right at a fine level, but at some level, we’ll have useful models that will allow us to ask questions about the data and query them in interesting ways for counterfactuals.”

Ultimately, this approach will pave the way for an open, accessible digital platform for biology, which will house next-generation models and systems trained on expansive multimodal datasets.

Learn more about CZI’s AI strategy for science and our vision to build predictive models of cells and cell systems.