The buzz around generative AI (GenAI) is impossible to ignore. From boardrooms to labs, life science companies are racing to unlock the potential of large language models (LLMs) for research, decision support, and discovery. But in the rush, many are falling for dangerous myths, particularly about the role of data. It’s becoming more apparent by the day: GenAI is not a magic wand that fixes foundational problems. If anything, it amplifies them.
Recently, the Pistoia Alliance hosted a webinar titled “AI-ready data and why FAIR matters in life sciences companies,” featuring experts from EPAM Systems, ONTOFORCE, Roche, and XponentL Data. Our ONTOFORCE expert presented on how platforms and architecture must evolve to meet GenAI needs, and what this means for FAIR data. Based on that discussion, we’re sharing five persistent myths about data, tech systems, and GenAI implementation.
Myth 1: GenAI can make sense of messy data
Reality: If your data is confusing for humans, it’s confusing for AI too, just in different (and riskier) ways.
GenAI has the same problems that people do. Data that is poorly structured, inconsistently labeled, or hidden in silos doesn’t magically become useful just because a language model is involved. In fact, GenAI might appear to “work,” but the results can be misleading, unrepeatable, or simply wrong.
Worse, unlike humans, AI doesn’t always signal when it’s confused. An LLM can confidently produce an output, backed by flawed assumptions or missing context, creating a false sense of trust. The same upstream data issues that have always plagued analytics and discovery don’t disappear; they just resurface in subtler, sometimes more dangerous ways.
Ultimately, treating GenAI as a shortcut around data preparation is wishful thinking. Without clear semantics, structure, and metadata, even the most powerful models are flying blind. This is why FAIR data—data that is Findable, Accessible, Interoperable, and Reusable—remains as critical as ever.
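To see why, here’s a minimal Python sketch with hypothetical data: the same compound recorded under three different labels quietly breaks a simple aggregation, and a small synonym mapping (a stand-in for a real ontology lookup) restores consistency.

```python
# Minimal sketch with hypothetical data: inconsistent labels distort results,
# and a curated mapping to a shared identifier restores consistency.
from collections import Counter

# The same compound recorded three different ways across "silos"
records = [
    {"compound": "acetylsalicylic acid", "response": "positive"},
    {"compound": "Aspirin",              "response": "positive"},
    {"compound": "ASA",                  "response": "negative"},
]

# Naive aggregation treats each spelling as a distinct compound
print(Counter(r["compound"] for r in records))
# -> three "different" compounds, each with n=1

# A curated synonym mapping, standing in for an ontology service
to_canonical = {
    "acetylsalicylic acid": "CHEBI:15365",
    "aspirin": "CHEBI:15365",
    "asa": "CHEBI:15365",
}

# Harmonized aggregation: one compound, three observations
print(Counter(to_canonical[r["compound"].lower()] for r in records))
# -> Counter({'CHEBI:15365': 3})
```

A human analyst might eventually spot the duplication; an LLM ingesting the naive view can just as easily report three separate compounds, with full confidence.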
Myth 2: GenAI makes careful data architecture obsolete
Reality: GenAI puts new demands on your data architecture.
Traditional knowledge platforms operate within clearly defined boundaries: users interact with structured filters, predefined views, and step-by-step workflows. But introduce a generative AI assistant, and those boundaries vanish. Users can now pose open-ended, creative queries—queries the system has likely never seen before.
This shift puts entirely new pressures on the architecture. Back-end systems must process unexpected combinations of concepts, draw from diverse datasets in real time, and support more dynamic, flexible query models. The possibility space opens up, which is great for end users but far more difficult for platforms to handle. The result? Legacy systems often buckle under the weight of GenAI’s flexibility, leading to performance issues, inconsistent outputs, and increased development overhead.
Rather than making architecture obsolete, GenAI demands smarter, more robust foundations: ones built with semantic awareness, modularity, and performance in mind. Investing in scalable, FAIR-aligned architecture isn’t a detour from AI progress; it’s a prerequisite.
Your architecture needs to evolve to:
- Resolve open-ended, never-before-seen queries against shared semantics rather than fixed filters and predefined views
- Draw on diverse datasets in real time instead of relying on precomputed paths
- Scale gracefully as query patterns become more dynamic and unpredictable
The sketch below illustrates the first point.
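Here’s a minimal Python sketch of a semantic-aware query layer. Everything in it is hypothetical (the SYNONYMS index, the toy DATASETS, and the resolve_concept and query functions); the idea is simply that free-text terms get resolved to a canonical concept before the data layer is queried, so open-ended questions still land on consistent records.

```python
# Hypothetical sketch: resolve meaning first, then query.
SYNONYMS = {
    "heart attack": "myocardial infarction",
    "mi": "myocardial infarction",
    "myocardial infarction": "myocardial infarction",
}

# Two toy data sources that store records under the canonical term
DATASETS = {
    "trials": [{"condition": "myocardial infarction", "id": "NCT-001"}],
    "publications": [{"condition": "myocardial infarction", "pmid": "12345"}],
}

def resolve_concept(term):
    """Map a user's free-text term to a canonical concept."""
    return SYNONYMS.get(term.strip().lower())

def query(term):
    """Resolve the term once, then fan out across all sources."""
    concept = resolve_concept(term)
    if concept is None:
        return {"error": f"unknown concept: {term!r}"}
    return {
        source: [rec for rec in records if rec["condition"] == concept]
        for source, records in DATASETS.items()
    }

# An open-ended phrasing still finds records filed under the canonical name
print(query("heart attack"))
```

In a production platform, the synonym index would be an ontology or knowledge graph service and the fan-out would hit indexed stores, but the shape is the same: without that resolution step, “heart attack” and “myocardial infarction” remain two unrelated strings.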
Myth 3: A chat interface automatically improves usability
Reality: Only sometimes, and not for everyone.
There’s a common assumption that switching to a chat interface automatically improves usability. But in practice, this can backfire, especially in scientific and research contexts. Expert users often prefer precise, structured interactions (filters, tables, etc.) over verbose conversational interfaces that may slow down workflows or obscure control.
Effective UX in GenAI doesn’t mean replacing everything with a chatbot. It means offering natural language where it adds value, while preserving trusted tools that support transparency, control, and deeper inspection. The goal isn’t less friction; it’s the right friction, in the right places.
Natural language feels intuitive for casual users, but expert users like researchers, clinicians, and data scientists often prefer more direct, visual tools to explore and manipulate data. Worse, conversational interfaces can encourage over-reliance on answers, even when they’re wrong or unverified.
That’s why user experience design must still account for:
- Different user profiles, from casual users to researchers, clinicians, and data scientists
- Precise, structured tools (filters, tables, visual exploration) alongside natural language
- Safeguards that encourage users to verify answers rather than accept them at face value
Myth 4: GenAI outputs can be trusted at face value
Reality: Trust with caution, and employ knowledge graphs to make that trust verifiable.
One of the most underappreciated risks with GenAI is overtrust. LLMs are incredibly good at producing fluent, authoritative-sounding answers. This can lead users, especially non-experts, to assume the output is correct. But without understanding how that answer was generated or what data it was based on, that trust can be dangerously misplaced.
Even more concerning is the illusion of reliability. Users may place unwarranted confidence in AI-generated outputs, bypassing critical validation steps simply because the response looks polished or comes with cited sources.
This is where knowledge graphs come in. When you infuse LLMs with structured, curated knowledge, like ontologies and graph-based relationships, you can reduce hallucinations, clarify provenance, and enable more explainable outputs. In fact, recent research into the topic showed that combining LLMs with knowledge graphs can yield up to 3x more accurate query results. These semantic frameworks don’t just make the data smarter; they anchor GenAI in validated, transparent sources, giving users a reason to trust (and verify) what they see.
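As an illustration of the pattern (sometimes called graph-augmented or retrieval-grounded prompting), here’s a minimal Python sketch. The triples, the retrieve_facts helper, and the prompt wording are hypothetical stand-ins for a real knowledge graph and retrieval layer, not any particular product’s implementation.

```python
# Hypothetical sketch: ground an LLM's answer in curated graph facts
# that carry provenance, instead of letting it answer from memory alone.
GRAPH = [
    # (subject, predicate, object, source) -- illustrative triples
    ("drug:X", "inhibits", "target:EGFR", "curated source A"),
    ("target:EGFR", "associated_with", "disease:NSCLC", "curated source B"),
]

def retrieve_facts(entity, hops=2):
    """Collect triples reachable from the entity within `hops` steps."""
    frontier, facts = {entity}, []
    for _ in range(hops):
        matched = [t for t in GRAPH
                   if (t[0] in frontier or t[2] in frontier) and t not in facts]
        facts.extend(matched)
        frontier |= {t[0] for t in matched} | {t[2] for t in matched}
    return facts

def grounded_prompt(question, entity):
    """Build a prompt that restricts the model to retrieved, sourced facts."""
    facts = retrieve_facts(entity)
    context = "\n".join(f"- {s} {p} {o} (source: {src})" for s, p, o, src in facts)
    return (
        "Answer using ONLY the facts below. Cite the source for each claim, "
        "and say 'unknown' if the facts are insufficient.\n\n"
        f"Facts:\n{context}\n\nQuestion: {question}"
    )

# The resulting prompt is then sent to the LLM of your choice
print(grounded_prompt("Which disease might drug:X be relevant to, and why?", "drug:X"))
```

Because every statement in the context carries provenance, the answer can be traced back and verified, which is exactly the “reason to trust (and verify)” that raw generation lacks.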
Ultimately, GenAI should be a tool, not a black box. True trust comes from pairing it with structured knowledge, FAIR data, and human-in-the-loop oversight.
Myth 5: GenAI is just a smarter search bar
Reality: GenAI doesn’t just search better; it changes how users interact with your entire data ecosystem.
GenAI introduces a new paradigm for how people engage with data. Unlike traditional platforms that guide users through structured steps, GenAI invites open-ended, often unpredictable queries that cross domains, formats, and contexts. This sounds empowering (and often is), but it breaks assumptions baked into most existing systems. For instance:
- Queries no longer arrive through predefined filters or step-by-step workflows
- A single question may need to join datasets the system was never designed to combine
- Query load and patterns become far harder to predict and optimize for
GenAI isn’t just an interface upgrade; it’s a catalyst for reimagining how data systems work. If you treat it like a smarter search bar, you’ll hit limitations fast. To unlock its true value, you need systems designed for flexibility, semantic awareness, and dynamic user expectations.
You can learn more about this topic by watching ONTOFORCE’s portion of the webinar in the video below. For additional resources on the use of AI in the life sciences industry, the adoption of the FAIR data principles, and more, explore the Pistoia Alliance’s projects, communities, and trainings.