Get more out of GenAI with effective data governance powered by knowledge graphs  ONTOFORCE  2023


Get more out of GenAI with effective data governance powered by knowledge graphs

Generative AI, or GenAI, has undoubtedly caught the public's attention. For life sciences businesses, the question has become how data governance enables them to use GenAI to create operational efficiency and a competitive advantage.

25 March 2024 5 minutes

Today’s life sciences industry is generating massive amounts of data. The formats of these data sources are highly heterogeneous, and often incomprehensible to human users without software intervention. To make the best use of these disparate data sources, companies are moving fast to apply Generative AI (GenAI) to their existing data in support of their operational and innovation processes. However, it is crucial that the correct infrastructure and controls are put in place to ensure the integrity of the data being used for GenAI.

This blog considers the importance of effective data governance for properly leveraging GenAI while still maintaining regulatory compliance, data integrity, and security. Used appropriately and effectively, GenAI can provide major competitive advantages to life sciences organizations, such as enhanced drug safety, better prediction of new candidates, and a streamlined development process.

Why is data governance valuable for GenAI in the life sciences industry?

In the rush to innovate and adopt the latest technologies, an effective data governance policy can sometimes lag behind. Historically, data governance has focused on data protection and security. However, the way that different types of data are stored and accessed is becoming more and more crucial for generating new insights.

Here's how Bhavish Madurai, AI and Data Managing Director, Life Sciences at Deloitte, put it in our recent online panel "Perspectives on AI, LLMs, and semantic technologies in the life sciences industry" (February 2024): “In the life sciences industry, precision and accuracy are paramount. GenAI, in particular, relies on high-quality, well-governed data to produce meaningful insights and outcomes.”

By establishing frameworks, policies, and standards, data governance acts as a foundational pillar for the successful adoption of GenAI technology. Implementing strong data governance helps an organization maintain data quality and integrity, which is critical when training GenAI models. On top of this, data governance helps companies manage their data and avoid data siloing so that people, processes, and technology are seamlessly integrated. Without proper data governance strategies in place, data quality and integrity suffer, in turn degrading the quality of GenAI outputs. Garbage in = garbage out.

In all, effective data governance helps to:

  1. Reduce operating costs and improve operational efficiency
  2. Establish accountability and responsibility for data sourcing and AI model development
  3. Improve quality, reputation, and trust in GenAI model outputs
  4. Support wider business goals through the effective use of GenAI

What are best practices for effective data governance?

GenAI has created an inflexion point with data governance that never existed previously. Systems now need to shift from a sole risk management perspective, centered around protecting the company and the privacy of users, to also creating effective methods to manage the value that can be obtained from data.

When establishing effective data governance that supports GenAI initiatives, there are a few main areas that should be considered:

  1. One size doesn’t fit all: establishing a fit-for-purpose framework that balances the global aspect of available data with the company’s local needs, and is customizable, is essential.
  2. Use tools that embed data governance in workflows: investing in user-centered, self-service analytics platforms that are easy for colleagues to use in their workflow is helpful in embedding data governance in how people work, rather than it being an add-on after tasks and processes are completed.
  3. Cultivate a culture of collaboration: empowering a culture where technology experts and domain experts can collaborate enables technological innovations that are seamlessly integrated with domain-specific expertise. Bringing people together who are solving different but overlapping problems can help to both build community and accelerate the ways in which they work. This requires a major evolution from a data stewardship council mindset focused on control and restraints to creatively considering ways to learn from each other and the data.
  4. Continuous improvement: continuous improvement and enablement are crucial to keep up with the ever-evolving world of GenAI. Data governance policies and frameworks will therefore need to evolve in tandem.
  5. Leverage semantic technology: semantic technology can be used to enrich more traditional data sources, providing more interoperability and more effective data integration. This will help considerably with embedding data governance directly into workflows.
  6. Data provenance: knowing where data comes from is a critical part of ensuring the reliability of any predictions and of troubleshooting when something fails to work as expected. Data governance frameworks should account for proper provenance processes to support GenAI in the long term.
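
As a rough illustration of the provenance point above, the following Python sketch attaches origin information to each data record so that a problematic source can be traced later. It is a minimal sketch under stated assumptions, not a description of any particular platform: the class names, field names, source names, and sample values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Provenance:
    """Where a value came from (all field names are illustrative)."""
    source: str                            # hypothetical source system name
    retrieved_on: date                     # when the value was pulled
    transformations: tuple[str, ...] = ()  # processing steps applied, in order


@dataclass(frozen=True)
class Record:
    entity: str
    attribute: str
    value: str
    provenance: Provenance


def records_from_source(records: list[Record], source: str) -> list[Record]:
    """Return every record that originated from the given source."""
    return [r for r in records if r.provenance.source == source]


# Made-up sample data from two hypothetical source systems.
records = [
    Record("compound-001", "status", "phase 2",
           Provenance("trial_registry", date(2024, 1, 15), ("normalize_status",))),
    Record("compound-001", "target", "EGFR",
           Provenance("internal_lims", date(2024, 2, 1))),
]

# If one source turns out to be unreliable, list everything it contributed.
suspect = records_from_source(records, "internal_lims")
```

Keeping the provenance next to the value, rather than in a separate log, means any output that draws on a record can be traced back to its source without extra lookups.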

How can technology help with implementing these best practices?

Jeremy Forman, Executive Director Data Strategy, Science and Platforms at Pfizer, stated during our recent online panel:

“The world of AI, the world of data, the world of Gen AI is changing constantly, it's going to continue to change constantly. Policies and global frameworks are going to continue to evolve, so the frameworks and the technologies have to evolve in lockstep.”


Taking a step back, a company’s journey towards digital transformation should not lose sight of data and data management transformation. Ensuring data is properly managed and governed is a foundational investment in any digital transformation strategy. As such, a comprehensive and customizable tool that assists with data management while embedding data governance frameworks can provide many benefits that can help to prepare for the adoption of innovative technologies in the future. Depending on the provider, a knowledge graph platform can be such a tool, propelling organizations further with their data governance and GenAI initiatives.  

Knowledge graphs are an effective way to manage a wide range of data

“Once you have a good repository, how do you extract the knowledge from that? So that's where a knowledge graph comes in. So a knowledge graph could then pick up relationships between the semantic connections, entities and concepts, and then use that for data harmonization. This allows the stakeholders to explain the AI model and answer the question ‘how did I come to a particular decision?’”

Bhavish Madurai, AI and Data Managing Director, Life Sciences at Deloitte

Bringing together the diverse data sources generated by life sciences research and drug development is essential for GenAI to generate effective outputs. The existence of multiple types of software with disparate data sources and formats contributes to data siloing. This increases the “integration tax” for companies wishing to use such data, costing both time and money.

Knowledge graphs provide a structured way to define data models, metadata, and access controls that facilitates data integration, discovery, and analysis, which in turn can support data governance frameworks and policies. On top of this, knowledge graphs can significantly enhance GenAI models by providing a structured and interconnected framework of data, which aids in the understanding of complex biological, chemical, and medical relationships.

In this way, knowledge graphs essentially bring humans and machines together in a specific context that allows for better intelligence. In addition, knowledge graphs ensure data is standardized and harmonized, supporting data governance strategies and providing a structured data foundation for GenAI to operate from.
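
To make the idea of harmonization concrete, here is a minimal Python sketch of a knowledge graph stored as subject-predicate-object triples, where labels from different sources are mapped to one canonical identifier before being linked. This is an illustrative toy, not how any specific knowledge graph platform works: the synonym table, identifiers, and source names are assumptions made for the example.

```python
# Hypothetical synonym table: source-specific labels -> one canonical ID.
CANONICAL = {
    "ASA": "drug:aspirin",
    "acetylsalicylic acid": "drug:aspirin",
    "aspirin": "drug:aspirin",
}


def harmonize(label: str) -> str:
    """Map a label to its canonical ID, or keep it unchanged if unknown."""
    return CANONICAL.get(label, label)


# Illustrative triples from two hypothetical sources that name the same
# drug differently: (subject, predicate, object, source).
raw_triples = [
    ("ASA", "inhibits", "protein:COX-1", "source_a"),
    ("acetylsalicylic acid", "treats", "condition:pain", "source_b"),
]

# Build the graph as an adjacency map keyed by the harmonized subject.
graph: dict[str, list[tuple[str, str, str]]] = {}
for subj, pred, obj, src in raw_triples:
    graph.setdefault(harmonize(subj), []).append((pred, harmonize(obj), src))


def neighbors(entity: str) -> list[tuple[str, str, str]]:
    """Return (predicate, object, source) edges leaving an entity."""
    return graph.get(entity, [])


# Both facts are now attached to one node, each tagged with its source.
aspirin_facts = neighbors("drug:aspirin")
```

Because both source labels resolve to the same node, a query about the drug sees facts from both sources together, and each edge still records where it came from.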

Hear expert views on the importance of data governance for GenAI in life sciences companies

Recently, industry experts Jeremy Forman, Executive Director Data Strategy, Science and Platforms at Pfizer; Bhavish Madurai, AI and Data Managing Director, Life Sciences at Deloitte; and Valerie Morel, CEO of ONTOFORCE, sat down for a panel discussion focused on AI, LLMs, and semantic technology in the life sciences industry.

They offer their perspectives on the effective use of GenAI in the industry, provide tangible suggestions to create an effective data governance policy to support GenAI initiatives, and more. Watch the recording to hear directly from these industry experts.