I recently spoke at the iRODS User Group Meeting 2018 (June 5-7, 2018, Durham, NC, USA) on the FAIR principles and how our research community is using the semantic platform DISQOVER in our DataHub infrastructure. Here’s the story from that session, explaining how we link on-premise clinical data with other sources to gain more, better and faster insights.
One of the key types of information in DISQOVER is human genetic variations and their links to genes, diseases, etc. We recently integrated two new data sources: 1000 Genomes and gnomAD. These data sources include human variation and genotype data derived from a large set of human individuals.
Integrating an organization’s internal data is challenging. It requires strategic vision by the organization’s leadership and an open mindset on different levels. In our experience, establishing such a solution has a greater chance of success if stakeholders quickly and easily notice the benefits. Small pragmatic initiatives with few data sources that solve a limited number of use cases can demonstrate how things can improve for the whole organization.
One and a half years ago we started to work with Amgen to help them improve the way they search and retrieve research data. They were looking for a solution that could aggregate and interlink their internal research data, enrich it with public data and provide an appealing, user-friendly interface for end-users.
FAIR is a fairly recent concept that stands for ‘Findable, Accessible, Interoperable and Reusable’. On the face of it, these principles don’t seem so remarkable. But what sets them apart from other (earlier) open data models is that the emphasis has shifted from the human researcher to machines.
In the last few years, the number of public data sources integrated into DISQOVER has grown steadily and we crossed the triple-digit barrier in 2016. Nevertheless, the number of data sources on our waiting list for integration is growing just as fast.
One of our missions is to help patients, directly and indirectly, by aggregating and linking both private and public data. One direct way to help is to raise awareness about health and disease by providing proper information about prevention, diagnosis and treatment, amongst other things.
In my previous blog post, I tried to explain that the use of different disease classifications or encodings in data sources like the US and EU clinical trial registries doesn’t hamper the integration and linking of this kind of data.
Disease classifications are also used to precisely define diseases in other contexts like epidemiology, pharmacovigilance, toxicology, pharmacology, genetics, etc. This data is scattered across a plethora of data sources, maintained by different governmental and other non-profit organizations like research consortia and institutes or individual research groups. If they are keen on providing meaningful and useful data, data providers try to avoid using disease terms that aren’t defined precisely in an ontology.
Anyone who’s ever been involved in data integration will confirm: integration projects are usually cumbersome and very time- and resource-intensive. Those facts alone take the wind out of the sails of many a company. Most, in fact, don’t even begin data integration projects.
But it doesn’t have to be like that. Integrating your first datasets into our semantic platform DISQOVER takes just 3 days. There’s no need for long trajectories! Within 3 days, your first internal datasets are searchable and you’re even trained to add additional datasets yourself.
Here’s how it works.
Human beings, and to a greater extent scientists, try to order items for a purpose. In biomedical sciences, probably no other subject is more diversely ordered or classified than human conditions or diseases – hereinafter referred to as ‘diseases’. None of these classifications claims to be the one and only truth or is able to serve all purposes. Instead, many classifications co-exist and are widely accepted – or enforced – by the various stakeholders in life sciences: lab scientists, clinicians, pharmaceutical and biotech firms, regulatory bodies, governments, etc. With different classifications come different definitions of terms, even if the terms are highly similar or have exactly the same meaning.
But what about searching, comparing and analyzing similar data where different disease classifications are applied? Or what about compiling data about diseases that is produced and maintained for a specific purpose?