Embedded Data Quality Control

Controlling data quality is an integral part of information security. DISQOVER is equipped with advanced features that support data QC and data processing audits.

Data QC during Data Ingestion

The data ingestion engine has built-in tools for configuring process-oriented data quality control, with checks at distinct stages of the pipeline:

  • Incoming QC: verify the shape of the original data as provided by the data sources.
  • In-Process QC: verify the shape of intermediate data created during data ingestion.
  • Outgoing QC: verify the shape of the outgoing data produced by the data ingestion pipeline, and its readiness for production serving.
  • Final QC: verify the shape of the data served by DISQOVER, using the product’s API.


Tolerance-based data QC

Real-world data is often imperfect and incomplete, and you may not be able to fix your data sources. DISQOVER therefore lets you define tolerance-based data quality checks with separate warning and error thresholds. For example, you can require a field to be filled in for at least 99.9% of the records. This way, you can ingest imperfect data without imposing a rigid schema, while still retaining control over the overall quality.
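
The idea of a tolerance-based check can be illustrated with a minimal Python sketch. This is not DISQOVER's configuration format or API; the class name FillRateCheck, its parameters, and the status strings are hypothetical and chosen only for illustration. The sketch computes the share of records in which a field is filled in and compares it against a warning threshold and an error threshold.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Mapping, Tuple


@dataclass(frozen=True)
class FillRateCheck:
    """Hypothetical tolerance-based check (not a DISQOVER API)."""

    field: str          # name of the field that should be filled in
    warn_below: float   # fill rates below this value raise a warning
    error_below: float  # fill rates below this value raise an error

    def evaluate(self, records: Iterable[Mapping[str, Any]]) -> Tuple[str, float]:
        """Return (status, fill_rate), where status is 'ok', 'warning' or 'error'."""
        rows = list(records)
        if not rows:
            # Assumption: an empty data set is treated as a failed check.
            return "error", 0.0
        # A field counts as filled when it is present and neither None nor "".
        filled = sum(1 for row in rows if row.get(self.field) not in (None, ""))
        rate = filled / len(rows)
        if rate < self.error_below:
            return "error", rate
        if rate < self.warn_below:
            return "warning", rate
        return "ok", rate


def make_records(total: int, missing: int) -> list:
    """Build sample records where the first `missing` ones lack a 'name' value."""
    return [{"name": None if i < missing else f"item-{i}"} for i in range(total)]


check = FillRateCheck(field="name", warn_below=0.999, error_below=0.99)
status, rate = check.evaluate(make_records(total=1000, missing=2))
print(status, rate)
```

With these assumed thresholds, two empty values in 1000 records fall just under the 99.9% warning level, so the data is still ingested but flagged; only a fill rate below 99% would be reported as an error.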

Traceable Data Ingestion process

The data ingestion engine traces dependencies throughout the visual pipeline, keeping track of which data sources are used as input to produce each individual data field in the DISQOVER data. This traceability provides a high degree of transparency, allowing security experts and auditors to assess the information flow and verify the origin of each piece of information served by DISQOVER.
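
Field-level dependency tracing of this kind can be sketched in a few lines of Python. The sketch below is a simplified illustration, not DISQOVER's internal implementation; the LineageTracker class, its method names, and the source and field names are all hypothetical. Each derived field inherits the union of the sources of its input fields, so the origin of any output field can be looked up afterwards.

```python
from typing import Dict, FrozenSet, Iterable, Set


class LineageTracker:
    """Hypothetical field-level lineage tracker (not a DISQOVER API)."""

    def __init__(self) -> None:
        # Maps each known field to the set of data sources it depends on.
        self._origins: Dict[str, Set[str]] = {}

    def register_source_field(self, field: str, source: str) -> None:
        """Record a field that is read directly from a data source."""
        self._origins[field] = {source}

    def derive(self, output_field: str, input_fields: Iterable[str]) -> None:
        """Record a pipeline step that computes output_field from input_fields."""
        sources: Set[str] = set()
        for name in input_fields:
            # A KeyError here means the step uses a field with unknown origin.
            sources |= self._origins[name]
        self._origins[output_field] = sources

    def sources_of(self, field: str) -> FrozenSet[str]:
        """Return every data source that contributes to the given field."""
        return frozenset(self._origins[field])


tracker = LineageTracker()
tracker.register_source_field("raw_name", "source_a")
tracker.register_source_field("raw_code", "source_b")
tracker.derive("label", ["raw_name", "raw_code"])  # merge step
tracker.derive("display_label", ["label"])         # formatting step
print(sorted(tracker.sources_of("display_label")))
```

An auditor asking where "display_label" comes from would get both "source_a" and "source_b", even though the field is two steps removed from the original inputs.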

