There isn't a sales force in the world that says it has enough leads. And you won't find many marketing VPs who want to do fewer campaigns. So there's a never-ending stream of new leads, prospect interactions, and conversations to be stored in the CRM system. At companies in consumer markets, open source software, and other categories, it's not unusual to find a million leads or more. But that's just the beginning: if you're using the latest marketing automation system, every e-mail, web download, and prospect response is recorded in the CRM system. And if you have a large call center, every call and e-mail exchange should be recorded as well.

This can mean millions of records, with thousands of new ones every day. No other system in most enterprises needs to deal with this kind of data flow, particularly with the number of simultaneous users that a CRM may have. If you're growing the scope of your current system, or consolidating several division-level CRM systems, here are things to consider as the data scales upward.

Platform Performance

Nearly any real CRM system will have reasonable performance for most use cases even with data sets of this size. But there are some features that will inevitably start to slow down: reports, dashboards, or any functionality that involves scanning all records. While some of the problem can be solved with the usual workarounds (de-normalizing data, creating analytic digests during off-hours, pre-joining views, etc.), anything that involves an ad-hoc select or upsert can't fundamentally be helped without throwing hardware or coders at the problem.

User Interface Performance

Under most conditions, the responsiveness of the CRM user interface will be fine with large data sets because users are typically working with only one record at a time.
But if your CRM system runs inside a browser, watch out for UI features that load large lists for the user. Some systems may try to load thousands of names into a scrolling list, with appalling performance consequences (particularly on Firefox).

Third Party Apps

While the core CRM platform will probably handle really big data sets, we've seen cases where supposedly enterprise-scalable third-party apps couldn't even load the data, let alone process it properly. The combinatorial explosion that occurs within some applications' algorithms can lead to corner cases that are tough to anticipate. As always, run pilots with the complete data set before you put third-party apps into production.

Duplicates and Corruption

Both of these data quality issues can happen in any size company, but they are magnified with every new system that feeds into your CRM. Since the really big data sets go hand in hand with external integrations, de-duping and data cleansing are an absolute must with large-scale CRM. Of course you'll be using tools on a regular basis to keep these issues at bay, but the tools will need to have several parametric controls and thresholds for their matching and cleaning algorithms. Your administrative staff will need to develop specific sequences of cleaning and de-duping passes, and maintain logs of the settings for each pass. Over time, they'll discover patterns in data quality problems that will provide clues about the underlying causes and possible fixes in the CRM system.

Backup and Archive Policy

Most CRM systems can be configured to run an online backup, and most SaaS CRM systems do this for you. But recovering data from a purely incremental backup is in no way straightforward, and can be quite the project. So having a periodic baseline backup of your own is really a necessity for a CRM system of any size.

The problem, of course, is how much data you can pull out of the system during the backup window.
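
As a back-of-the-envelope sketch of that constraint, the arithmetic is just window length times sustained export rate. The function names, the 5 MB/s export rate, and the partition sizes below are illustrative assumptions, not measurements from any particular CRM; substitute figures from your own export tests.

```python
from typing import Dict, List


def window_capacity_gb(window_hours: float, throughput_mb_per_s: float) -> float:
    """Data volume (decimal GB) exportable in one window at a sustained rate."""
    return window_hours * 3600 * throughput_mb_per_s / 1000


def plan_windows(partitions_gb: Dict[str, float], capacity_gb: float) -> List[List[str]]:
    """Assign partitions, in the given priority order, to successive backup windows.

    A partition larger than one window's capacity still gets a window to
    itself; in practice it would need to be split further.
    """
    windows: List[List[str]] = []
    current: List[str] = []
    used = 0.0
    for name, size in partitions_gb.items():
        # Start a new window when the next partition would overflow this one.
        if current and used + size > capacity_gb:
            windows.append(current)
            current, used = [], 0.0
        current.append(name)
        used += size
    if current:
        windows.append(current)
    return windows


# Hypothetical numbers: a 4-hour window at an assumed 5 MB/s sustained export.
capacity = window_capacity_gb(4, 5)  # 72.0 GB
plan = plan_windows(
    {"records": 40, "email_threads": 50, "attachments": 60},  # made-up sizes in GB
    capacity,
)
```

With these made-up sizes, each partition ends up in its own window. The greedy, priority-ordered assignment is only one possible policy; it is used here because it keeps the most important data in the earliest window.
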
For almost any company, the CRM system can be quiesced for several hours each weekend. So you should be able to pull dozens of gigabytes out of the system, more than sufficient for all the parametric records. But you won't have enough time to pull over all the attached documents and e-mail threads. Consequently, you'll need to develop a backup data partitioning policy that fits with the way your business works.

In addition, you'll need to develop a detailed policy about when and how you move data into archival storage. I have yet to find a CRM system that knows what to do with nearline or offline storage, and your archival policy will need to take analytics tools into account as well. As a matter of policy, I can't think of a reason to keep CRM data (even summaries) online for more than 7 years. But I can think of lots of good reasons to keep at least the last 2 years' worth of data transparently available, no matter what your industry. Although these may seem conceptually simple issues, your business requirements can involve a surprising amount of complexity to enforce application-level data integrity, so put some smart business analyst on this problem for a while before you ask for recommendations.

David Taber is the author of the new Prentice Hall book, "Salesforce.com Secrets of Success" and is the CEO of SalesLogistix, a certified Salesforce.com consultancy focused on business process improvement through use of CRM systems. SalesLogistix clients are in North America, Europe, Israel, and India, and David has over 25 years' experience in high tech, including 10 years at the VP level or above.