Release Notes DISQOVER 5.10

19 September 2019 | version 5.10

This document contains the release notes for DISQOVER version 5.10 and instructions for upgrading from version 5.00.x to this new version. Please read them before updating to or installing this release.

New features:

  • The Analytics Dashboard is extended with features of the Search Dashboard. It integrates the functionality of the classical Search Dashboard and the Analytics Dashboard into a single environment, avoiding the need to swap from one to another. Using a single dashboard, you can now refine your search using filters as well as see graphical representations. There is a transitional setting that allows you to continue to use the Search Dashboard as it was in version 5.00.

  • Data exports run as an asynchronous process to enable extensive data downloads and can be saved in the .xlsx Excel file format too.

  • New visualizations are added: Clustered bar chart, Instance pivot table, Stacked time periods, and Stacked cards view.

  • An editable text search widget is available in the Analytics Dashboard. It allows the user to view and modify the current search string at every step of a linked data search.

  • A new widget allows visualizing aggregated values such as minimum, maximum, total and average on numerical data.

  • Dashboard template buttons allow the user to easily open additional dashboards for specialized views on the current data.

  • The Data Ingestion Engine is capable of exporting integrated data directly in RDF format.

  • The DISQOVER platform is now available for installation in the cloud or on-premise as a dockerized system. In addition, an Amazon Machine Image (AMI) is available to deploy a dockerized DISQOVER swiftly on Amazon Web Services (AWS).

 

Improvements:

  • Link-out buttons can be configured to send a POST request to an external web application such as an analytics or visualization service. For example, a set of property values or the location of raw data files can be packaged and sent for downstream processing.

  • Improved data importing in the Data Ingestion Engine lets the user scan and inspect imported files automatically and efficiently, and suggests usable predicates, classes and relations in RDF data.

  • In case more than 4,000 instance results are available, the ‘Please refine filtering under 4000 results to get links’ restriction is lifted and one can now follow the links of the first 4,000 instance results. A clear indication marks that the number of linked instances has been restricted.

  • The layout of a dashboard can be fixed to keep the placement of the dashboard components stable when the screen or browser size or resolution changes.

  • In addition, the dashboard can be locked to prevent the user from altering it.

  • The administrator panel is expanded with a section for managing essential server settings.

  • Property values with a size larger than 32,000 can now be stored.
  • The performance and resource needs of the Data Ingestion Engine have been improved:

    • federation synchronization results are cached, resulting in significant speed improvements after the first execution;

    • up to 30% faster execution of complete pipelines compared to version 5.00;

    • up to 50% less peak memory consumed to run the data publishing step compared to version 5.00;

    • data lookups of intermediate pipeline data run faster and provide more details;

    • pipeline verifications run more efficiently.

  • Counts in facets retrieved by federation are more accurate when federation runs in fast mode due to increased subsampling.

Notes:

  • Data downloads only work for data that are exclusively local or remote (federated), and not for mixed instances.

  • If an instance belongs to multiple canonical types (e.g., a molecule that is also an active substance), the properties specific to each data type are no longer shown in a single detailed view. Instead, separate dashboard templates and detailed views can be created per canonical type.

  • The functionality of the ‘Search strategy’ view is ported to a ‘Search history’ widget and button. Currently, a forked search cannot be represented. Instead, the different dashboards are chronologically listed in the left pane on the Analytics Dashboard.

 

Known issues:

  • In federation mode, when data source highlighting is switched on, the indication of whether a source is local or remote is not shown.

 

2. Key features

2.1 Extended Analytics Dashboard

The new Analytics Dashboard replaces the existing Search Dashboard and Visual Analytics Dashboard. It combines searching, filtering and navigating with analytics features (see Figure 1). Dashboards can be customized via the frontend for specific user roles or use cases and saved as system-wide dashboard templates by data scientist users. Likewise, individual users can personalize their dashboards and create additional personal templates. This replaces the usage of configuration files to configure the layout of Search Dashboards.

DISQOVER can be deployed in regular mode with only the new Analytics Dashboard available or in compatibility mode where the classical Search Dashboard is the default dashboard with a button to open the Analytics Dashboard.

Figure 1: Analytics dashboard for the canonical type ‘Clinical study’ on the public version of DISQOVER.

2.2 New visualizations

Four new visualizations are available as of this version. The ‘Clustered bar chart’ shows the number of instances per value of a categorical data facet as bars and uses a second categorical data facet to split the counts into clusters (see Figure 2).

Figure 2: An example of a Clustered bar chart showing the number of clinical studies per study type and clustered per year.
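
The counting behind such a chart can be sketched in a few lines of Python. The records and the facet names `study_type` and `year` are made-up sample data, not DISQOVER output, and `clustered_counts` is a hypothetical helper; the sketch only illustrates how the counts for one categorical facet are split by a second one.

```python
from collections import Counter, defaultdict

# Made-up sample instances; each carries two categorical facets.
studies = [
    {"study_type": "Interventional", "year": 2017},
    {"study_type": "Interventional", "year": 2018},
    {"study_type": "Observational", "year": 2018},
    {"study_type": "Interventional", "year": 2018},
]


def clustered_counts(instances, primary, secondary):
    """Count instances per primary facet value, split per secondary facet value."""
    counts = defaultdict(Counter)
    for instance in instances:
        counts[instance[primary]][instance[secondary]] += 1
    # Convert to plain dicts so the result is easy to print or serialize.
    return {key: dict(cluster) for key, cluster in counts.items()}


for study_type, per_year in clustered_counts(studies, "study_type", "year").items():
    print(study_type, per_year)
```

Each outer key corresponds to one group of bars, and each inner key to one bar within that group.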

The ‘Instance pivot table’ can be used to show the distribution of facet values per individual search result instance (see Figure 3).

Figure 3: An Instance pivot table showing the countries where clinical studies are conducted.

A third new visualization is intended to be used for mapping time intervals of instances on a time axis where additional facets can be used for stacked grouping and color coding. ‘Stacked time periods’ are ideal for representing information of canonical types that contain different time points that can be mapped as a duration, such as Clinical Study, Publication, and Project (see Figure 4).

Figure 4: An example of a Stacked time period of a selection of clinical studies.

Finally, the ‘Stacked cards view’ allows combining the visualization of different facets for a set of result instances where the results are represented as cards in a multi-column layout (see Figure 5).

Figure 5: An example of a Stacked cards view of a selection of Active Substances split into columns per drug carrier and color-coded by their known targets.

 

2.3 Data retrieval and downstream connectivity

The export functionality is extended with native Excel (.xlsx) as an additional download format. Exports now also run in the background as an asynchronous process, and their status is shown in a notification center. This makes large-scale downloads possible: extensive downloads continue to run even after the user has logged out.

Figure 6: The notification center in the upper right corner of the application indicating the status of a data download request.

In addition to the improved data retrieval capabilities for end users, link-outs to downstream applications can now be configured in more detail. A selection of data properties can be packaged as a pre-configured POST request that is triggered by clicking a button on the dashboard. If desired, the connectivity with another application can be tuned further by placing a lightweight intermediary web service in between, for example one built with Node.js.
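
As a rough illustration of such an intermediary web service, the Python sketch below accepts a POST request and reshapes its payload before it would be handed to a downstream application. The payload layout (a JSON object with `instance` and `properties` keys), the `reshape_payload` and `serve` helpers, and the port number are all assumptions made for this example; the actual request body depends on how the link-out button is configured.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def reshape_payload(raw_body: bytes) -> dict:
    """Convert the (assumed) link-out payload into a flat structure."""
    payload = json.loads(raw_body.decode("utf-8"))
    # Assumed layout: {"instance": "...", "properties": {"name": ["value", ...]}}
    properties = payload.get("properties", {})
    return {
        "source_instance": payload.get("instance"),
        # Flatten multi-valued properties into comma-separated strings.
        "fields": {name: ", ".join(values) for name, values in properties.items()},
    }


class LinkOutHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            result = reshape_payload(self.rfile.read(length))
            status = 200
        except (ValueError, AttributeError, TypeError):
            # Malformed JSON or a payload that does not match the assumed layout.
            result = {"error": "unexpected payload"}
            status = 400
        body = json.dumps(result).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Run the service locally; the port is a hypothetical choice."""
    HTTPServer(("127.0.0.1", port), LinkOutHandler).serve_forever()
```

Calling `serve()` starts the service on the local machine. A real deployment would forward the reshaped result to the downstream application rather than echoing it back to the caller.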

 

2.4 Data Ingestion Engine Improvements

The Data Ingestion Engine was first released in DISQOVER 5.00 and has since received a number of improvements that boost performance, both when modelling and debugging a new data ingestion pipeline and when executing an existing one. Integrating rich data that is already semantified is much more straightforward thanks to efficient scanning and inspection of RDF-formatted data. The results of this inspection are then used to suggest usable predicates, classes and relations. In general, the engine executes processes more quickly and with lower peak memory consumption than before.

Since a data ingestion pipeline can directly output integrated data in RDF format, the engine can be used as a standalone integration tool independent of the DISQOVER user interface.

 

2.5 Easier installation, upgrading and migration

The software components of a DISQOVER setup are packaged as Docker containers to allow deployment with fewer restrictions on operating-system-related prerequisites. This eases installation in an existing software and network ecosystem where system and security monitoring are active. It also simplifies software upgrades, resource management and server migration, especially since server configuration settings and user-specific data are stored centrally and are thus easy to back up and migrate.

 

3. Upgrade instructions

Upgrading to version 5.10 is possible by deploying Docker containers with an automatic installation script or by running an AMI with the prepackaged Docker containers. More details are available in the installation manual or from support at support@ontoforce.com.
