This first series of articles describes foundational steps that enable agile data warehouse development. The articles published so far describe how to:

The next focus in setting yourself up for a best-in-class agile data warehouse environment is developing a robust data quality solution.

According to TDWI, the cost of bad data is more than $600 billion annually in the U.S. There are many negative consequences of low data quality, including:

All too often, companies invest in a data warehouse, but a proactive data quality solution is an afterthought. Developing a well-planned and scalable data quality capability as part of your foundational work can go a long way toward improving the quality of your data. If done well, it will also improve business stakeholders’ confidence in your data.

First of all, let’s define data quality. Way back in 1996, when I was first developing data quality processes, it was simply defined as “fitness for use,” which is still an appropriate high-level definition. For data to be “fit for use,” an organization needs to define which aspects matter most to it. Below is a quote from a former co-worker who has focused on all aspects of data quality throughout her career.

“Data and information quality thinkers have adopted the word dimension to identify those aspects of data that can be measured and through which its quality can be quantified. While different experts have proposed different sets of data quality dimensions … almost all include some version of accuracy and validity, completeness, consistency, and currency or timeliness among them.”

-- Sebastian-Coleman, Laura. Measuring Data Quality for Ongoing Improvement: A Data Quality Assessment Framework

Rather than trying to focus on every dimension, start by focusing on the basics of completeness and timeliness, then move on to validity and consistency.
These four dimensions can truly enhance the quality of enterprise data as well as stakeholders’ confidence in the data they consume.

Completeness is first and foremost. Stakeholders need to know that what’s in the source is accounted for in the target. You can ensure completeness in a variety of ways. For example, a record-balancing capability can record a count at the end of one flow and at the beginning of the next to confirm that all records are accounted for. The ultimate goal is to validate that every record and its corresponding information from a source is handled appropriately during processing. This source-to-target validation should be monitored and reported to the organization’s data consumers.

Timeliness should be a component of service-level agreements (SLAs) that identify criteria such as acceptable levels of data latency, frequency of data updates, and data availability. Timeliness can then be measured against these defined SLAs and shared as part of the data quality metrics.

Validity is a key data quality measure that indicates the “correctness” of the actual data content; for example, confirming that all the characters in a telephone number field are digits, not alphabetic characters. This is the concept that most data consumers think about when they envision data quality. Validity can be assessed through data profiling, data cleansing, and inline data quality checks that compare incoming values to expected values or to a stated range of acceptable values. Alerts can be set, depending on the validity checks used. The results of the validity checks should be measured and shared as part of the data quality metrics.

Consistency is crucial to continued consumer confidence.
Once data quality metrics for completeness, timeliness, and validity are being monitored and reported to business stakeholders, consistency can be measured by assessing changes in these patterns over time. These results can be added to the data quality metrics reporting that is shared with business stakeholders.

Complete transparency of data quality metrics and reporting to your organization’s data consumers will lead to greater confidence in the quality of the underlying data.

Stakeholder confidence will continue to increase if you proactively identify issues through active data quality monitoring before the data consumers find them. This is one of the greatest achievements of a robust data quality program.

The next article will cover the next step in building the foundational approach to agile data warehouse development: giving the development team the ability to self-manage their agile development approach, incorporating continuous improvement.
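
As a closing illustration, the record-balancing and telephone-number validity checks described earlier might be sketched as follows. This is only a minimal sketch in Python under stated assumptions, not a definitive implementation: the names `BalanceResult`, `balance_records`, `is_valid_phone`, and `validity_rate`, the dictionary record layout, and the sample rows are all hypothetical and do not come from any particular data quality tool.

```python
from dataclasses import dataclass


@dataclass
class BalanceResult:
    """Outcome of a source-to-target record count comparison."""
    source_count: int
    target_count: int

    @property
    def is_balanced(self) -> bool:
        # Completeness holds when every source record is accounted for.
        return self.source_count == self.target_count

    @property
    def missing(self) -> int:
        # Positive when records were dropped between source and target.
        return self.source_count - self.target_count


def balance_records(source_rows, target_rows) -> BalanceResult:
    """Compare the count at the end of one flow with the start of the next."""
    return BalanceResult(len(source_rows), len(target_rows))


def is_valid_phone(value: str) -> bool:
    """Hypothetical validity rule: non-empty and ASCII digits only.

    Real rules may allow separators or country codes; this is an assumption.
    """
    return value.isascii() and value.isdigit()


def validity_rate(rows, field, check) -> float:
    """Share of rows whose `field` passes `check` (1.0 for an empty input)."""
    if not rows:
        return 1.0
    passed = sum(1 for row in rows if check(row[field]))
    return passed / len(rows)


if __name__ == "__main__":
    # Hypothetical sample data: one row is lost in flight, one phone is invalid.
    source = [
        {"id": 1, "phone": "5551234567"},
        {"id": 2, "phone": "555-CALL-NOW"},
        {"id": 3, "phone": "5559876543"},
    ]
    target = source[:2]

    result = balance_records(source, target)
    print("balanced:", result.is_balanced, "missing:", result.missing)
    print("phone validity rate:", validity_rate(source, "phone", is_valid_phone))
```

Storing results like these for every load would give the time series needed to measure consistency, since a sudden change in the balance or validity rate is itself a signal worth reporting to data consumers.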