Lessons from Boehringer Ingelheim’s journey toward strategic clinical data reuse.
Life sciences organizations conducting clinical trials accumulate vast amounts of data over time, with each clinical trial generating significant quantities of patient data, observations, and findings. Historically, this data often remained siloed, underutilized beyond the initial reporting requirements. Now, forward-thinking companies recognize the untapped potential within their historical clinical trial datasets. By effectively leveraging this legacy data, organizations can significantly enhance their research capabilities, accelerate scientific discovery, and optimize resource utilization.
ONTOFORCE invited Karsten Quast, Data Domain Owner at Boehringer Ingelheim, to speak at our recent webinar focused on reusing clinical data and cohort building. As Boehringer Ingelheim will soon be operational with a platform powered by ONTOFORCE’s DISQOVER to drive clinical data reuse and cohort building, we wanted to learn more about:
During the webinar, Karsten highlighted Boehringer Ingelheim’s ReSource project. The main objective of the project is to harmonize vast volumes of historical clinical trial data dating back to 1998. The ReSource project was thus launched with aims to:
Drawing on the insights Karsten shared about the ReSource project during the webinar, this blog expands on a few lessons learned about reusing clinical data.
The volume of clinical trial data continues to grow exponentially. Technological advancements, along with more comprehensive trial designs, contribute to increasingly complex and voluminous data sets. Data from clinical trials now frequently encompass detailed patient demographics, vital sign data, treatment data and patient-reported outcomes, various efficacy measures, imaging results, omics data, and more.
Today, it’s estimated that Phase II and III protocols involve an average of 263 procedures per patient, supporting approximately 20 distinct endpoints, reflecting the expanding scope of what trials aim to measure and analyze. As a result, Phase III trials now produce an average of 3.6 million data points, a threefold increase compared to late-stage trials conducted just a decade ago.
With so many data points, it’s no surprise that not all data is relevant or valuable when it comes to reuse. To leverage legacy clinical data effectively, consideration should be given to which data is most relevant and valuable for the specific reuse application. For example, administrative or logistical metadata, such as timestamps for trial monitoring visits, is essential during the active trial management phase. However, it typically holds little analytical value for subsequent analyses aimed at therapeutic insights, cohort selection, or predictive modeling.
As Karsten pointed out, it’s best to avoid being overwhelmed by the effort of harmonizing and reconciling all legacy clinical data. Within the context of Boehringer’s ReSource project, this means that only certain data is included for reuse. Specifically, the project focuses on selected Study Data Tabulation Model (SDTM) domains, such as demographics, adverse events, lab results, and concomitant drugs.
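The idea of restricting reuse to a handful of SDTM domains can be illustrated with a minimal Python sketch. It is not part of the ReSource project or of DISQOVER. The two-letter codes are the standard SDTM abbreviations for the domains named above, while the file names, the `REUSE_DOMAINS` set, and the `select_reuse_datasets` helper are assumptions made up for illustration.

```python
# Hypothetical illustration: keep only the SDTM domains chosen for reuse.
# DM, AE, LB and CM are standard SDTM domain codes; the selection and the
# file names below are invented for this sketch.
REUSE_DOMAINS = {"DM", "AE", "LB", "CM"}  # demographics, adverse events, labs, concomitant medications


def select_reuse_datasets(dataset_names):
    """Return the dataset names whose domain code is in REUSE_DOMAINS."""
    selected = []
    for name in dataset_names:
        # Assumes each dataset file is named after its domain, e.g. "dm.xpt".
        domain = name.split(".")[0].upper()
        if domain in REUSE_DOMAINS:
            selected.append(name)
    return selected


study_files = ["dm.xpt", "ae.xpt", "lb.xpt", "cm.xpt", "vs.xpt", "ts.xpt"]
print(select_reuse_datasets(study_files))
```

In this sketch, datasets outside the chosen domains (here vital signs and trial summary) are simply left out before any harmonization effort is spent on them.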
Data ingestion and harmonization can be time- and resource-intensive processes. Focusing only on high-value data helps to ensure that neither is wasted. To further address the challenges of ingesting and harmonizing complex and voluminous legacy clinical data, DISQOVER provides an out-of-the-box SDTM pipeline, enabling users to seamlessly ingest their own SDTM-formatted data. With this feature, researchers can quickly ingest legacy datasets into the DISQOVER platform to accelerate the process of generating actionable insights from previously underutilized clinical trial information.
Reusing legacy clinical trial data unlocks a vast reservoir of untapped knowledge that can drive innovation and efficiency across pharmaceutical research and development. When effectively utilized, this historical data not only streamlines research processes but also enhances the accuracy and depth of scientific inquiries. By minimizing redundant efforts and rapidly enabling precise cohort selections, clinical data reuse empowers researchers to test hypotheses more efficiently, design more robust clinical trials, or uncover hidden insights that inform new therapeutic strategies.
To tap into this reservoir of knowledge, Boehringer’s ReSource project focuses on leveraging the full potential of historical data, supporting evidence generation from multiple angles, including:
During the webinar, Karsten highlighted these use cases, which emphasize the power of clinical data reuse in fostering new pathways to scientific insight. By unlocking previously underutilized historical datasets, researchers can approach scientific questions from multiple directions, significantly enhancing their ability to innovate, refine therapies, and accelerate impactful research outcomes.
For Boehringer, unlocking the full potential of clinical data requires easy access to the data, along with the ability to explore it. As Karsten mentioned, the ReSource project is focused on:
Making existing clinical trial data easily and quickly available for reuse
and
Providing a holistic exploratory view of this data, in compliance with regulatory, legal, and ethical requirements.
DISQOVER’s intuitive user interface (UI), with powerful data visualizations and interactions, is designed to support these goals. No prior knowledge of previous trials or extensive data science experience is needed to extract value from the platform. DISQOVER enables users to navigate all existing clinical data ingested in the platform and to define and refine cohort criteria through dynamic visualizations and filtering options. By presenting data clearly and interactively, the platform encourages and empowers users to explore the legacy data thoroughly, leading to deeper insights and more informed decisions.
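To make the notion of defining and refining cohort criteria concrete, here is a minimal Python sketch. It does not use or represent DISQOVER’s interface or API. The variable names (USUBJID, AGE, SEX, AEDECOD) follow SDTM conventions, while the records and the `build_cohort` helper are hypothetical and invented for illustration.

```python
# Hypothetical illustration of cohort selection over harmonized trial data.
# Variable names follow SDTM conventions; every record below is made up.
demographics = [
    {"USUBJID": "S1-001", "AGE": 54, "SEX": "F"},
    {"USUBJID": "S1-002", "AGE": 67, "SEX": "M"},
    {"USUBJID": "S2-001", "AGE": 71, "SEX": "F"},
]
adverse_events = [
    {"USUBJID": "S1-001", "AEDECOD": "Headache"},
    {"USUBJID": "S2-001", "AEDECOD": "Nausea"},
    {"USUBJID": "S2-001", "AEDECOD": "Headache"},
]


def build_cohort(dm, ae, min_age, max_age, event):
    """Return sorted subject IDs within the age range who reported the event."""
    # Collect the subjects that have at least one matching adverse event.
    with_event = {row["USUBJID"] for row in ae if row["AEDECOD"] == event}
    return sorted(
        row["USUBJID"]
        for row in dm
        if min_age <= row["AGE"] <= max_age and row["USUBJID"] in with_event
    )


# Start with broad criteria, then refine them, as one would by adjusting filters.
print(build_cohort(demographics, adverse_events, 18, 100, "Headache"))
print(build_cohort(demographics, adverse_events, 65, 100, "Headache"))
```

The second call narrows the age range and returns a smaller cohort, which mirrors the iterative filtering described above, only without the visual interface.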
You can learn more about Boehringer Ingelheim’s journey toward strategic clinical data reuse and their ReSource project by watching the webinar recording. Specifically, learn more about how data is harmonized, how Boehringer ensures rigorous compliance and data protection when reusing data, and the dynamic user interface powered by DISQOVER.
With their comprehensive ReSource project, Boehringer Ingelheim exemplifies how pharmaceutical organizations can maximize the immense potential of legacy clinical data, transforming previously siloed datasets into powerful resources driving therapeutic innovation and breakthroughs.
ONTOFORCE enables life science companies to unlock hidden insights from data.
With DISQOVER, built on knowledge graph technology, we support life sciences and pharmaceutical companies with innovative data management and visualization.
Proudly ISO 27001:2022 certified.
© 2025 ONTOFORCE. All rights reserved.