What about data quality?

Data quality isn’t a yes/no property of a dataset. It’s contextual, purpose-driven, and it changes over time. The key question is not “is the data correct?” but “is this data good enough for this decision, right now, at this cost?”

Executive summary

Data quality, by itself, has never blocked us from delivering measurable business improvements. We work with the data you have, model uncertainty, and add human-in-the-loop where it matters.

How we handle imperfect data

  • We design models to tolerate uncertainty and use human review where needed (see the sketch after this list).
  • Knowing that data quality is low in some aspects is itself valuable information — we can encode that information in the models.
  • Organizations often overestimate the barrier of “bad data.” Progress is possible while quality improves in parallel.
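To make the first point concrete, here is a minimal sketch of uncertainty-aware routing: predictions below a confidence threshold are queued for human review instead of being acted on automatically. The threshold, labels, and function names are assumptions for the example, not a prescribed implementation.

```python
REVIEW_THRESHOLD = 0.8  # illustrative; in practice tuned to the cost of errors


def route_prediction(label, confidence):
    """Act automatically on confident predictions; queue the rest for review.

    Returns ("auto", label) or ("human_review", label). The threshold
    encodes how much uncertainty this particular decision can tolerate.
    """
    if confidence >= REVIEW_THRESHOLD:
        return ("auto", label)
    return ("human_review", label)


# Example: only the confident prediction is acted on automatically.
print(route_prediction("admit", 0.93))  # ('auto', 'admit')
print(route_prediction("admit", 0.55))  # ('human_review', 'admit')
```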

Key takeaway: Don’t wait for perfect data. Start delivering value, manage quality explicitly, and improve iteratively.

Quality attributes of digital representations

Data quality has many dimensions:

| Attribute | What it means | Why it matters |
| --- | --- | --- |
| Reliability | Do repeated measurements under similar conditions give similar results? | Without this, trends are unstable. |
| Validity | Does the data actually measure what’s needed for the decision? | Wrong signals → wrong decisions. |
| Recency | How fresh is the data, and how often is it updated? | Stale data can misrepresent reality in time-sensitive use cases. |
| Completeness & coverage | What’s included, what’s missing? | Gaps can bias results, and you can lose traceability. |
| Consistency | Are there contradictions across sources or over time? | Inconsistencies undermine trust. |
| Accuracy & precision | Accuracy (bias): how far from the true value on average. Precision: how much repeated values vary. | Determines how close to “truth” your model runs. |
| Resolution | Are there enough measurement points to capture the dynamics of the variable? | Low resolution hides peaks and fluctuations. |
| Lineage & provenance | Where did the data come from, and how was it transformed? | Needed for trust, audits, and debugging. |
| Interpretability | Are meaning, units, and context clear? | Prevents misinterpretation and misuse. |

No dataset is strong on all attributes. What matters is knowing where it’s strong and where it’s weak.
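Several of these attributes can be checked mechanically on every data refresh. Below is a minimal sketch, assuming records arrive as timestamped rows; the field names (timestamp, value) and the chosen indicators are illustrative assumptions, not a standard.

```python
from datetime import datetime, timezone


def profile_quality(rows, now=None):
    """Compute a few mechanical quality indicators for timestamped records.

    rows: list of dicts with optional "timestamp" (tz-aware datetime)
    and "value" keys. Field names and indicators are illustrative.
    """
    now = now or datetime.now(timezone.utc)
    total = len(rows)
    with_value = [r for r in rows if r.get("value") is not None]
    stamps = [r["timestamp"] for r in rows if r.get("timestamp") is not None]
    return {
        # Completeness: share of rows carrying a usable value.
        "completeness": len(with_value) / total if total else 0.0,
        # Recency: hours since the newest record arrived.
        "recency_hours": (now - max(stamps)).total_seconds() / 3600
        if stamps else float("inf"),
        # Coverage: distinct calendar days represented in the data.
        "covered_days": len({t.date() for t in stamps}),
    }
```

A profile like this doesn’t say whether the data is “good”; it says where it is strong and where it is weak, which is exactly what the decision at hand needs to know.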

If quality is insufficient, act iteratively

  • Narrow or rephrase the decision/question.
  • Add lightweight validations at ingestion and transformation steps (see the sketch after this list).
  • Provide feedback loops to data producers.
  • Insert human verification where impact is highest.
  • Enrich or cross-check with external sources.
  • Track simple improvement metrics and raise thresholds over time.
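A “lightweight” validation doesn’t require a framework. Here is a minimal sketch of the ingestion check mentioned above, assuming hospital-style records; the field names (patient_count, clinic_id) and bounds are invented for illustration.

```python
def validate_record(record):
    """Return a list of issue strings for one ingested record.

    Field names and bounds are illustrative assumptions; in practice
    they come from the decision the data has to support.
    """
    issues = []
    if record.get("patient_count") is None:
        issues.append("missing patient_count")
    elif not 0 <= record["patient_count"] <= 10_000:
        issues.append(f"patient_count out of range: {record['patient_count']}")
    if not record.get("clinic_id"):
        issues.append("missing clinic_id")
    return issues


def ingest(records):
    """Split records into accepted rows and flagged rows with their issues."""
    accepted, flagged = [], []
    for r in records:
        issues = validate_record(r)
        if issues:
            # Keep flagged rows rather than dropping them: knowing where
            # quality is low is itself information, and the counts feed
            # the improvement metrics mentioned in the list above.
            flagged.append((r, issues))
        else:
            accepted.append(r)
    return accepted, flagged
```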

When “perfect” is the enemy

Example: Patient admissions at a hospital.

Imagine you’re looking at patient admissions at a hospital. The data might be incomplete (not all clinics report properly), and the resolution might be low (only daily totals, not hourly). At the same time, the data could be quite reliable (the counts that do exist are consistent) and recent (updated every day).

Even if it isn’t perfect, this combination can still be useful. Daily totals are enough to see whether demand for hospital care is rising or falling over weeks and months. You might not have the detail to plan staffing hour by hour, but you can still make a solid decision about whether to expand capacity overall or allocate more resources to certain departments.

At the same time, this kind of data says nothing about peak load. If you only see daily totals, you can’t tell whether most patients arrive in the morning, during the night, or all at once after a local event. Using such data to schedule shifts or manage emergency readiness would be risky.
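A few lines of arithmetic show why (numbers invented for illustration): two very different hourly arrival patterns produce the same daily total, so the total is enough for trend questions but blind to peak load.

```python
# Two hypothetical arrival patterns over a 24-hour day (invented numbers).
spread_out = [3] * 24                  # steady trickle all day
burst = [0] * 18 + [18] * 4 + [0] * 2  # everyone arrives in the evening

# Daily totals are identical, so a trends dashboard treats the days the same...
assert sum(spread_out) == sum(burst) == 72

# ...but peak hourly load differs sixfold, which is what shift planning needs.
print(max(spread_out), max(burst))  # 3 18
```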

Key takeaway: Imperfect data can still support solid decisions. Just don’t stretch it beyond what it can safely answer.

Final thought

Data quality is rarely perfect, and it doesn’t need to be. See it as a set of attributes to balance. The goal isn’t to chase perfection but to make decisions with eyes open — using the data you have, while steadily improving it.