Inside AstraZeneca’s clinical data reuse for multi-modal research

Clinical trial data is one of the most valuable assets in drug development, but too often, it remains underused. Scientists may know about studies in their own team or unit but miss relevant data elsewhere in the organization. At AstraZeneca, the challenge was clear: how to break down silos, enable reuse, and give researchers the ability to ask complex, multi-faceted questions across domains. By rethinking how clinical data is integrated, accessed, and explored, AstraZeneca is turning existing information into fuel for multi-modal research.

The challenge of complex questions

Modern drug discovery rarely hinges on a single dataset. Today’s scientists need to combine information across multiple domains to answer research questions and explore hypotheses.

Consider a translational researcher looking to further explore the relationship between a drug, an adverse event, and a biomarker.

This requires finding a cohort of patients that:

took a drug during their participation in a trial,

experienced a specific adverse event,

possess a specific biomarker,

and have next generation sequencing data available.

Answering a query like this requires weaving together clinical study records, subject-level descriptors, biological samples, testing results, and sequencing outputs. Traditionally, this has meant pulling information manually from different platforms, each managing its own domain of data. The process could take weeks, and even then the results could be incomplete or fragmented.
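To make the shape of such a query concrete, here is a minimal sketch in Python. It treats each domain as its own list of records and builds the cohort as the intersection of the subjects that satisfy each criterion. All identifiers, record layouts, and values (`exposures`, `drug_x`, `S1`, and so on) are invented for illustration and do not reflect AstraZeneca's actual data or systems.

```python
# Hypothetical per-domain records, keyed by subject ID.
# Traditionally, each of these would sit in a separate platform.
exposures = [("S1", "drug_x"), ("S2", "drug_x"), ("S3", "drug_y")]
adverse_events = [("S1", "rash"), ("S2", "nausea"), ("S3", "rash")]
biomarkers = [("S1", "marker_a"), ("S2", "marker_a"), ("S3", "marker_a")]
ngs_available = {"S1", "S3"}  # subjects with sequencing data on file


def subjects_with(records, value):
    """Return the set of subject IDs that have a record matching `value`."""
    return {subject for subject, recorded in records if recorded == value}


def find_cohort(drug, event, biomarker):
    """Intersect the four criteria: drug, adverse event, biomarker, NGS data."""
    return (
        subjects_with(exposures, drug)
        & subjects_with(adverse_events, event)
        & subjects_with(biomarkers, biomarker)
        & ngs_available
    )


cohort = find_cohort("drug_x", "rash", "marker_a")
print(sorted(cohort))
```

The intersection is trivial once every domain is keyed by the same subject identifier. The hard part in practice is getting the data into that linked form, which is the problem the rest of this article describes.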

For AstraZeneca, this fragmentation created real limitations. Scientists risked overlooking valuable studies outside their immediate project teams, making it harder to assemble robust cohorts or identify the right samples for testing a hypothesis. Study design was slowed down, confidence in the available data was reduced, and opportunities to reuse past clinical trial information were left untapped.

Building an integrated knowledge graph

To overcome these challenges, AstraZeneca reimagined how clinical data could be structured and connected. The solution was to bring together information from across domains—clinical study metadata, subject-level descriptors, biological samples, imaging, and sequencing—into a single, navigable framework.

At the heart of this approach is a knowledge graph. By linking data hierarchically from study → subject → sample → observations, AstraZeneca created a model that reflects the real-world relationships between different types of research data. This makes it possible to move seamlessly from a high-level view of studies down to the details of individual samples or datasets.
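As a rough illustration of that hierarchy, the sketch below models study → subject → sample → observations as nested Python dataclasses and walks from a study down to its individual observations. The class names, fields, and sample values are assumptions made for this example. They are not the actual schema, and a real knowledge graph would carry many more entity types and cross-links than a simple tree.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    kind: str    # e.g. a lab result, image, or sequencing output
    value: str


@dataclass
class Sample:
    sample_id: str
    biofluid: str
    observations: list[Observation] = field(default_factory=list)


@dataclass
class Subject:
    subject_id: str
    samples: list[Sample] = field(default_factory=list)


@dataclass
class Study:
    study_id: str
    subjects: list[Subject] = field(default_factory=list)


def walk(study):
    """Yield (study_id, subject_id, sample_id, observation) for each observation."""
    for subject in study.subjects:
        for sample in subject.samples:
            for observation in sample.observations:
                yield study.study_id, subject.subject_id, sample.sample_id, observation


# Hypothetical data: one study, two subjects, one observation in total.
study = Study("STUDY-001", [
    Subject("S1", [Sample("SMP-1", "plasma", [Observation("ngs", "available")])]),
    Subject("S2", [Sample("SMP-2", "serum")]),
])

paths = list(walk(study))
for study_id, subject_id, sample_id, observation in paths:
    print(study_id, subject_id, sample_id, observation.kind)
```

Because every observation is reachable through its sample, subject, and study, any result can be traced back to its origin, and any study can be expanded down to the data it produced.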

An intuitive user interface to navigate complex data

AstraZeneca didn’t stop at building a knowledge graph. They made it accessible to scientists through their Scientific Intelligence platform, powered by DISQOVER. The platform turns a complex data model into a usable, visual experience for researchers.

DISQOVER is ONTOFORCE's flagship product, built on knowledge graph and semantic search technology and designed exclusively for the life sciences industry. DISQOVER integrates and harmonizes diverse data sources and enriches them with pre-ingested biomedical and other publicly available industry data. Its intuitive user interface and interactive data visualizations allow users to efficiently search, explore, and filter the connected data.

Through customized dashboards, scientists at AstraZeneca can start at a high level, exploring studies by therapeutic area, indication, or drug. From there, they can drill down into subject-level information such as demographics, lab results, adverse events, or co-medications. At the sample level, they can see which biofluids were collected, the anatomical source, and whether inventory is available for ordering.

The key advantage is the ability to combine and layer filters across different data views. Scientists can refine their patient cohort criteria step by step, narrowing down subjects, selecting samples, and identifying datasets that align with their research question. Researchers who already know exactly what they want to explore can instead start with DISQOVER’s query builder, creating a single detailed query that covers all their criteria, with no coding skills required. Instead of navigating multiple disconnected systems, they can do it all in one place, with full traceability and compliance baked in.

The real impact of clinical data reuse at AstraZeneca

The true value of AstraZeneca’s approach is in the time it saves and the confidence it creates. Tasks that once stretched over weeks of manual searching and cross-checking can now be completed in minutes. Scientists no longer have to guess whether relevant data exists. Instead, they can see it, explore it, and act on it.

One example illustrates the impact clearly: a researcher initially identified three studies that seemed to fit their criteria. By using Scientific Intelligence, they uncovered additional studies across the organization and ultimately tripled the number of suitable subjects for analysis. With a larger, more representative cohort, the researcher could move forward with far greater confidence that the study would yield meaningful results.

This acceleration also extends to study design. Clinicians and statisticians can now quickly review the distribution of lab tests or adverse events within a cohort, helping them power studies more accurately and reduce risks before trials begin.

Learn more: strategic clinical data reuse

AstraZeneca’s journey shows how reusing clinical data can transform research by turning scattered datasets into a strategic asset for multi-modal science. By connecting studies, subjects, samples, and observations in one framework, scientists gain the power to ask more complex questions, design smarter studies, and accelerate the path to development.

For a deeper dive into this approach, check out the full case study and gain further insight into the tailored dashboards that AstraZeneca researchers are using. Read the full case study >>>

To hear more about the foundations underpinning AstraZeneca’s data reuse strategy, watch this webinar recording with AstraZeneca's Ben Gardner. He delves into the importance of improving the accessibility and usability of clinical trial data. Watch the webinar recording >>>
