How to Tell If Your Organization Is Operating on Bad Data
Most organizations have no idea their data is bad. The reports run. The dashboards populate. The numbers look authoritative. The audit passes. The funder reports get accepted. The strategic plans get built on financial projections that look reasonable. Everything operating from the data appears to be working, which is exactly why the data quality issues stay invisible. Bad data doesn't announce itself. It produces outputs that look correct, get acted on as if they were correct, and generate consequences that may not surface for months or years. The recognition that the data was wrong usually comes after a triggering event, and by then, the organization has been making decisions on bad data for so long that the cumulative consequence is significant and difficult to unwind.
The diagnostic question isn't whether your organization has bad data. Most organizations of meaningful size do, in specific and predictable patterns. The diagnostic question is whether your leadership team can recognize the signs of bad data before they're forced to recognize them. Here are the indicators that consistently surface bad data conditions in nonprofit and public-sector organizations.
The first indicator is when the same question, asked of different parts of the organization, produces different answers. If you ask the finance team about the program's financial performance, you get one number. If you ask the program team, you get a different number. If you ask the development team about constituent counts, the numbers don't match the program team's numbers. If you ask the operations team about volume metrics, the numbers don't reconcile to what finance is reporting. The inconsistencies aren't usually large in any single instance. They're systematic across the organization. Every part of the organization has its own numbers that don't quite reconcile with the others' numbers. Leadership has learned to live with the inconsistencies and to make decisions despite them. The inconsistencies are telling you that the data infrastructure isn't producing a single source of truth, that different parts of the organization are operating on different versions of the data, and that the version any specific decision rests on may or may not be the right one for that decision.
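As a concrete illustration of this first check, the sketch below compares unique constituent IDs across two hypothetical system exports, one from a fundraising CRM and one from a program database. The file names and the constituent_id column are assumptions for illustration only; in practice the comparison depends on how each system defines a constituent and which export fields can actually be matched.

```python
import csv

def unique_ids(path, id_column):
    """Collect the set of unique IDs from a system export, skipping blank values."""
    with open(path, newline="") as f:
        return {row[id_column].strip() for row in csv.DictReader(f) if row[id_column].strip()}

# Hypothetical exports: one from the fundraising CRM, one from the program database.
crm_ids = unique_ids("crm_constituents.csv", "constituent_id")
program_ids = unique_ids("program_participants.csv", "constituent_id")

print(f"CRM constituents: {len(crm_ids)}")
print(f"Program participants: {len(program_ids)}")
print(f"In the CRM but not in program data: {len(crm_ids - program_ids)}")
print(f"In program data but not in the CRM: {len(program_ids - crm_ids)}")
```

Even a crude comparison like this turns the divergence into a specific, countable gap rather than a vague sense that the numbers never quite line up.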
The second indicator is when significant decisions require custom analytical work to produce the intelligence the decision needs. If leadership can't make decisions from standard reports, if every meaningful question generates a request for special analysis, if the analytical work takes days or weeks to assemble, the data infrastructure isn't supporting the questions being asked. Sound data infrastructure produces decision-ready intelligence on demand for the questions decision-makers actually ask. When routine strategic questions require assembly work, that work is covering a gap the infrastructure should close. The data may be technically present in the systems, but it's not in usable form for the work the organization is doing. The custom analysis is a symptom of structural data inadequacy.
The third indicator is when the team responsible for producing reports spends substantial effort cleaning, reconciling, and adjusting data before reports can be produced. If the monthly close, the quarterly review, the board package, or the annual report involves significant data preparation work beyond running standard processes, the data is bad. Sound data conditions allow standard processes to produce standard reports without significant intervening work. When the team is regularly cleaning vendor master data before AP reports run, reconciling general ledger balances before financial statements close, adjusting program data before performance reports go out, or correcting employee records before payroll reports get filed, the cleaning work is a continuous response to data quality issues. The work isn't a one-time data cleanup. It's the ongoing tax the organization pays for operating with data quality issues that haven't been structurally resolved.
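One way to make this cleaning tax visible, rather than absorbing it silently every cycle, is to run the same checks the team performs by hand as a repeatable script. The sketch below is a minimal example against a hypothetical vendor master export; the file name and column names (vendor_name, vendor_id, tax_id) are assumptions, and real vendor-master checks would cover far more conditions, such as address normalization, duplicate tax IDs, and inactive records.

```python
import csv
from collections import Counter

def vendor_quality_report(path):
    """Flag vendor-master issues that otherwise get cleaned by hand before AP reports run."""
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))

    # Possible duplicates: vendor names that collapse to the same normalized form.
    name_counts = Counter(row["vendor_name"].strip().lower() for row in rows)
    duplicate_names = [name for name, count in name_counts.items() if count > 1]

    # Records missing a tax ID, which tend to surface only when a filing is due.
    missing_tax_id = [row["vendor_id"] for row in rows if not row.get("tax_id", "").strip()]

    print(f"Vendor records: {len(rows)}")
    print(f"Possible duplicate vendor names: {len(duplicate_names)}")
    print(f"Records missing a tax ID: {len(missing_tax_id)}")

vendor_quality_report("vendor_master.csv")
```

The point isn't the script itself; it's that the cleaning effort becomes a measured, recurring count instead of invisible work buried in the close.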
The fourth indicator is when historical data analysis produces results that contradict known operational reality. If the analytical work produces conclusions that don't match what people who actually know the operations would expect, the data is suspect. The data might be saying that program A grew during a period when everyone in the organization knows program A was struggling. The data might show financial performance patterns that contradict what the people running the operations experienced. The data might suggest constituent or beneficiary trends that don't match the operational reality. When the data and the operational knowledge diverge, the divergence is usually pointing at data quality issues. The data has been assembled or classified in ways that produce technically correct outputs that don't reflect what was actually happening. The recognition often comes when leaders push back on analysis that contradicts their direct experience, and the analytical team can't fully reconcile what the data is saying with what leadership knows.
The fifth indicator is when funder or regulator requests for specific data require manual assembly the standard reporting can't produce. If a federal program officer asks for a specific cost breakdown and the response takes weeks, if a funder requests outcome data in a specific cut and the team has to build it manually, if a regulator asks for a transaction analysis and the systems can't produce it cleanly, the data infrastructure is inadequate to what the external environment is requiring. The manual assembly produces what's needed, eventually. But the need for it tells you that the data is structured to satisfy historical internal needs, not current external requirements. The gap between what the systems produce automatically and what stakeholders are requiring will continue to widen as funder and regulator data expectations evolve, and the manual assembly burden will continue to grow until structural intervention happens.
The sixth indicator is when employees who depend on the data have built informal documentation explaining how the data actually works. If there are spreadsheets, notes, training documents, or institutional knowledge that explains the gotchas, the exceptions, the historical reasons for current data patterns, or the workarounds for known data issues, the data is bad in a specific way. The informal documentation is a knowledge layer that compensates for data infrastructure that doesn't document itself. New employees learn the formal systems and then have to learn the informal layer to actually do their work. The informal layer represents accumulated organizational knowledge about data limitations, exceptions, and patterns that the formal systems don't surface. The existence of the informal layer is evidence that the formal systems aren't producing what the work requires, and the informal layer is heroically filling the gap.
The seventh indicator is when the leadership team's confidence in specific categories of data varies based on who produced the analysis. If certain numbers are trusted because a specific person ran them, if the credibility of analytical work depends on the analyst rather than on the underlying data, if leadership has learned which sources to trust and which to question, the data infrastructure isn't producing a consistent quality standard. Sound data infrastructure produces analysis whose credibility comes from the data itself, not from the analyst's reputation. When the analyst's identity is the credibility marker, the data isn't standing on its own. Leadership has learned to navigate this through trusted relationships, but that navigation compensates for data infrastructure that should produce credibility on its own.
The cumulative pattern across these indicators is consistent. If your organization is showing multiple indicators, the data is bad in ways that are affecting decisions, even when the decisions appear to be working. The appearance of working comes from the cumulative compensation that finance teams, program teams, analytical staff, and leadership have built up to operate around the data limitations. The compensation produces operational outputs. It also masks the underlying condition, which is that significant decisions are being made on data that wouldn't survive structured examination.
The cost of operating on bad data shows up in specific ways, and most of it is invisible. Decisions that wouldn't have been made if the data had been right. Strategic moves that didn't account for what the data should have surfaced. Investments that produced returns below what was projected because the projections were built on data that didn't reflect actual conditions. Programs continued or discontinued for reasons that weren't actually true at the data level. Compliance positions taken that didn't account for risks the data couldn't surface. The cost compounds across decisions, year after year, and the organization never sees it cleanly because it is distributed across so many specific decisions that no single one of them triggers the recognition.
The diagnostic that exposes this clearly is to examine, with operational specificity, what would have to be true about your data for your most significant decisions over the past two years to have been made on solid ground. Most leadership teams, doing this examination honestly, can identify multiple decisions where the data underneath them was substantially weaker than the decision frame suggested. The recognition is the precondition for addressing the structural condition that's producing the bad data.
The organizations that operate on good data have done specific structural work. They've established single sources of truth for key data domains. They've documented and addressed the data quality issues that compromise decision-making. They've invested in master data management, data governance, and the infrastructure that produces consistent data quality across the organization. They've eliminated the conditions that force shadow spreadsheets, manual assembly, and informal compensation. The work is unglamorous, expensive, and slow. It's also what produces decision intelligence the organization can actually trust, and the decisions made on trustworthy data are qualitatively better than the decisions made on data the organization has learned to navigate around.
If your organization is showing multiple indicators of bad data, the bad data is operating right now in decisions the organization is making. The compensation patterns that have been masking it are real, sustained, and significant, and they're consuming capacity that should be producing strategic value. The structural intervention to address the underlying conditions is significant work. It's also the only intervention that actually changes what the organization is operating on, and continuing to operate on bad data while compensating around it produces a cumulative cost the organization keeps absorbing without recognition.
This is what we identify and fix in the Strategic Assessment.