In an accounting or ERP system, most of the data must be right, period. If invoices, accounts receivable, inventories, work-in-process numbers, or ECAD files are "best guesses," someone will lose their job and the CIO will be on the hot seat.
In this column, I've had many quotable quotes about the need for data quality in SFA and CRM systems. And here's another one: perfectionism doesn't pay. It's so important, it's trademarked.
CRM Data is Different
In a CRM system, there's a range of allowable (and even expected) data quality that depends on the specific data element. Some things do have to be perfect, such as unique keys, internal security information, order quotes, order history, and anything that's subject to an audit (such as PCI, HIPAA, FERPA, or other compliance standards). But other things can be just an approximation or can be missing altogether. Is it essential that the customer's customer service calling history from last year be perfectly represented in your CRM system? Not likely. So how do you decide where to make your data quality investment?
How to Do Data Triage
The first step is to have an analyst pull out the CRM data dictionary (or, more likely, create one) and separate the data elements into three categories:
(1) the ones that must be there and must be correct to prevent corruption in external systems or misrepresentation of the business,
(2) the ones that should be correct for the CRM system to work at all, and
(3) the ones that people have asked for to make marketing, sales, and customer support work better.
The fun part comes in the next step. Do a quick data quality analysis on each data element in the three categories. Score the data quality by answering questions such as:
• Does this data element have an undisputed owner? Is it updated by a team member as a natural step in a key business process? Or can nearly anyone update it at any time?
• Does this data element have internal validation to prevent noisy input? Does it have an audit trail to support troubleshooting?
• What percentage of the CRM records has this data element missing, clearly incorrect, or duplicate?
Based on the resulting score, the analyst may want to reassign the triage category for some of the data elements.
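To make the scoring step concrete, here is a minimal sketch of how an analyst might turn the questions above into a number. The field names, the weights, and the review threshold are all illustrative assumptions, not a standard; tune them to your own business.

```python
from dataclasses import dataclass


@dataclass
class DataElement:
    """One entry in the CRM data dictionary (hypothetical structure)."""
    name: str
    category: int              # triage category 1, 2, or 3
    has_owner: bool            # undisputed owner?
    updated_in_process: bool   # updated as a natural step in a key process?
    has_validation: bool       # internal validation against noisy input?
    has_audit_trail: bool      # audit trail for troubleshooting?
    bad_fraction: float        # share of records missing, incorrect, or duplicate


def quality_score(e: DataElement) -> float:
    """Return a 0-100 score. The weights are assumptions for illustration."""
    score = 0.0
    score += 15 if e.has_owner else 0
    score += 15 if e.updated_in_process else 0
    score += 10 if e.has_validation else 0
    score += 10 if e.has_audit_trail else 0
    score += 50 * (1.0 - e.bad_fraction)  # half the score is the record-level quality
    return score


def needs_review(e: DataElement, threshold: float = 50.0) -> bool:
    """Flag elements whose score suggests the triage category should be revisited."""
    return quality_score(e) < threshold


order_quote = DataElement("order_quote", 1, True, True, True, True, 0.0)
purchase_intent = DataElement("purchase_intent", 3, False, False, False, False, 0.5)

for element in (order_quote, purchase_intent):
    print(element.name, quality_score(element), needs_review(element))
```

The point of the score is not precision. It is to give the analyst a consistent basis for moving an element from one triage category to another.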
Most Expensive Data is Data You Don't Really Need
Every data element in a CRM system is in there because somebody asked for it. But requests are just ideas and good intentions: reality is different. The data analysis will discover data elements that are missing or wrong 40, 60, or even 90 percent of the time. Scrutinize these, as they are unlikely to have much business value. (Watch out for the exceptional data element that only applies to your top 50 customers, but which is there for all 50 of them.) Here are some kinds of data that really would be good to have, but are rarely there:
• Customer purchase intentions
• Competitive information
• Win/loss analysis
• Customer loyalty surveys
In most cases, you can't afford to spend much on data quality here. It's too difficult to collect some of the data in the first place, and there are too many ways for the meaning of the data to be misinterpreted or misrepresented over time.
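The missing-data analysis described above can be sketched in a few lines. This is a rough illustration that assumes CRM records have been exported as a list of dictionaries with a hypothetical "id" key; the 40 percent threshold and the field names are assumptions, and the sketch only measures missing values, not wrong ones.

```python
def missing_rate(records, field):
    """Fraction of records where the field is absent or empty."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if r.get(field) in (None, ""))
    return missing / len(records)


def flag_sparse_fields(records, fields, threshold=0.4, top_ids=frozenset()):
    """Return (field, missing_rate) pairs at or above the threshold.

    Fields that are fully populated for every top customer are skipped,
    since they may matter a great deal for that small group.
    """
    top_records = [r for r in records if r.get("id") in top_ids]
    flagged = []
    for field in fields:
        rate = missing_rate(records, field)
        if rate < threshold:
            continue
        if top_records and missing_rate(top_records, field) == 0.0:
            continue  # the "top customers" exception
        flagged.append((field, rate))
    return flagged


records = [
    {"id": 1, "phone": "555-0100", "exec_sponsor": "J. Doe"},
    {"id": 2, "phone": "555-0101", "purchase_intent": "Q3 renewal"},
    {"id": 3, "phone": "555-0102"},
    {"id": 4, "phone": "555-0103"},
    {"id": 5, "phone": "555-0104"},
]

flagged = flag_sparse_fields(
    records,
    fields=["phone", "purchase_intent", "exec_sponsor"],
    top_ids={1},
)
print(flagged)
```

Anything this flags is a candidate for scrutiny, not automatic deletion. The analyst still has to ask whether the element earns its keep.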
If the department requesting this data protests when you recommend expunging a data element from the system, borrow a tool from the CFO's bag of tricks: tell them you're happy to make this data perfect as long as all the costs for that work come out of their departmental budget. The ensuing discussion will quickly separate the wheat from the chaff.
Long Tail, or Wrong Tail?
Once you've identified the data elements where you're going to invest in data quality, it's important to understand the business situations in which the data element is critical, and when it isn't. Phone numbers, for example, have to be right. But is the phone number of "free trial customers" as important as the phone number of a purchasing agent in your biggest customer? Have analysts identify the "tails" of data, along such lines as:
• Customer size, industry, location, or profile
• Order size, frequency, or recency
• Product line
• Interaction type (e.g., web visit vs in-person meeting)
Have the analyst create a set of business rules or filter criteria that characterize when it's worth chasing a data element's quality way down the statistical tail, and when it isn't. Make sure these distinctions are documented in your CRM's data dictionary so everyone understands the basis for your data quality investment decisions.
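One way to write such business rules down is as small, named filter criteria that can be reviewed alongside the data dictionary. The sketch below is only an illustration for the phone number example: the Account fields, the revenue cutoff, and the one-year recency window are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Account:
    """Hypothetical slice of a CRM account record."""
    name: str
    annual_revenue: float
    is_free_trial: bool
    days_since_last_order: Optional[int]  # None if the account never ordered


Rule = Callable[[Account], bool]

# Rules for when a phone number is worth chasing down the tail.
# Every threshold here is an assumption chosen for illustration.
PHONE_RULES: List[Rule] = [
    lambda a: not a.is_free_trial,
    lambda a: a.annual_revenue >= 100_000,
    lambda a: a.days_since_last_order is not None and a.days_since_last_order <= 365,
]


def worth_chasing(account: Account, rules: List[Rule]) -> bool:
    """True only when the account passes every rule for this data element."""
    return all(rule(account) for rule in rules)


big_customer = Account("Acme Corp", 2_500_000, False, 30)
trial_user = Account("Trial User", 0, True, None)

print(worth_chasing(big_customer, PHONE_RULES))
print(worth_chasing(trial_user, PHONE_RULES))
```

Keeping one rule list per data element makes it easy to document, in plain sight, why the investment stops where it does.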
Six Sigma is Six Figures
OK, now for the hard part. In CRM systems, it gets exponentially more expensive to improve data quality. If it costs $X to get solid data quality on 68 percent of your records, it'll probably cost $2X to get the data quality right on the next 27 percent (the difference between one and two sigma), and $4X to get the quality right on the next 4 percent.
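To see how quickly the cost per record climbs, here is a back-of-the-envelope sketch. It assumes the cost doubles with each additional sigma, as in the rule of thumb above, and computes the size of each band from the normal distribution rather than hard-coding it. The base cost of $10,000 is an arbitrary stand-in for $X.

```python
import math


def coverage_within(k_sigma: float) -> float:
    """Fraction of a normal distribution within +/- k standard deviations."""
    return math.erf(k_sigma / math.sqrt(2))


def tier_costs(base_cost: float, tiers: int = 3):
    """Return rows of (sigma, band_fraction, tier_cost, cost_per_percentage_point).

    Assumes the cost of each tier is double the previous one.
    """
    rows = []
    previous = 0.0
    for k in range(1, tiers + 1):
        coverage = coverage_within(k)
        band = coverage - previous          # additional records covered by this tier
        cost = base_cost * 2 ** (k - 1)     # X, 2X, 4X, ...
        rows.append((k, band, cost, cost / (band * 100)))
        previous = coverage
    return rows


rows = tier_costs(10_000)
for sigma, band, cost, per_point in rows:
    print(f"{sigma} sigma: +{band:.1%} of records, ${cost:,.0f}, "
          f"${per_point:,.0f} per percentage point")
```

The total cost only goes up sevenfold across three tiers, but the cost per percentage point of records rises far faster, which is the real argument against chasing perfection.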
Fortunately for your budget, the business value of data also isn't absolute. For lots of business purposes, data within the last 3 years should be kept up to snuff, but the business value of data may decline rapidly after that. I can't think of a reason why data more than seven years old needs to be very accurate—even the IRS doesn't require records beyond that horizon.
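The age cutoffs above can be captured as a simple rule. This is a minimal sketch: the three-year and seven-year boundaries come from the paragraph above, while the tier labels and the use of a "last activity" date are assumptions for illustration.

```python
from datetime import date


def quality_tier(last_activity: date, today: date) -> str:
    """Classify a record by age into a (hypothetical) data quality tier."""
    age_years = (today - last_activity).days / 365.25
    if age_years <= 3:
        return "maintain"      # keep up to snuff
    if age_years <= 7:
        return "best_effort"   # fix only when it is cheap to do so
    return "archive"           # accuracy is no longer worth paying for


today = date(2024, 1, 1)
for last_activity in (date(2022, 1, 1), date(2019, 1, 1), date(2010, 1, 1)):
    print(last_activity, quality_tier(last_activity, today))
```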
Using these techniques, you can run the numbers to make good tradeoffs—investing big in data quality only when the payoff is there.
David Taber is the author of the new Prentice Hall book, "Salesforce.com Secrets of Success" and is the CEO of SalesLogistix, a certified Salesforce.com consultancy focused on business process improvement through use of CRM systems. SalesLogistix clients are in North America, Europe, Israel, and India, and David has over 25 years experience in high tech, including 10 years at the VP level or above.