Thor Olavsrud
Senior Writer

How Capital One delivers data governance at scale

Jun 09, 2023
6 mins
Data Governance, Data Management

With hundreds of petabytes of data in operation, the bank has adopted a hybrid model and a ‘sloped governance’ framework to ensure its lines of business get the data they need in real time.

[Image: A female engineer observes a system at work while colleagues and monitors are visible in the background. Credit: Gorodenkoff / Shutterstock]

The ever-increasing emphasis on data and analytics has organizations paying closer attention to their data governance strategies. A recent Gartner survey found that 63% of data and analytics leaders say their organizations are increasing investment in data governance.

The reason? Data governance is no longer viewed only as a vehicle for compliance but as a driving force for ensuring that the right, high-quality data is accessible to end users when and how they need it, a key factor in becoming a data-driven organization.

“In the long run, your costs are going to be lower, and your speed is going to be much faster,” says Naga Gurram, senior director of software engineering at Capital One.

But that all depends on a data governance strategy tuned for the digital era. After all, real-time data will only get you so far if slow, complicated data governance processes gum up the works.

“How can I enable my business user to get to the data they need, in real-time, at scale? If you missed that opportunity of providing the right product at the right time, we are not doing our job and we are losing the opportunity to better serve our customer,” Gurram says.

Going hybrid for data governance

Data governance is a system for defining who within an organization has authority and control over data assets and how those data assets may be used. It encompasses the people, processes, and technologies required to manage and protect data assets. Most companies already have some form of data governance that applies to individual applications, business units, or functions. The evolving data governance practice, though, is about establishing systematic, formal control over enterprise processes and responsibilities.

“It’s hard to even realize how much it’s going to help you,” Gurram says, adding that Capital One has evolved its data governance practice from its previous centralized model to better address a rapidly shifting data landscape.

“In our legacy world, we had a finite set of infrastructure, a finite set of data, and a finite set of users,” Gurram says. “And we used to manage our data governance centrally.”

Under that system, business units would come to the central team for all their data governance needs. The team would make sure all data ran through governance policies and that the business units were meeting all those policies.

“But there has been this whole explosion of data,” Gurram says. “We used to talk about terabytes of data, hundreds of terabytes of data. Now we are talking about hundreds of petabytes of data. Data is coming from everywhere.”

To cope with this explosion, Capital One has established a hybrid data governance practice, with a central enterprise data governance team and federated data governance teams embedded in its lines of business. The central team focuses on building the data governance platforms and self-service tools that the lines of business use. It’s also tasked with maintaining the company’s data governance vision and championing a cultural shift in which data is treated not merely as data but as a product.

“All of our policies, all our platforms, all our tools are managed by a central team and built by a central team, but the execution of the data governance comes from federated teams,” Gurram says. “We give the right tools and the platform to our business partners, our lines of business, and they make sure they are getting the data, they are publishing the data based on these policies using a self-service tool.”
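The split Gurram describes, with policies defined centrally and applied by federated teams through a self-service tool, can be sketched in a few lines of Python. This is an illustrative sketch only, not Capital One's implementation: the `Policy` and `Dataset` types, the `publish` function, and the two example policies are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Policy:
    """A centrally defined rule that every published data set must satisfy."""
    name: str
    required_fields: tuple


@dataclass
class Dataset:
    """A data set that a line-of-business team wants to publish."""
    name: str
    owner_team: str
    metadata: dict = field(default_factory=dict)


# Hypothetical policies owned by the central governance team.
CENTRAL_POLICIES = [
    Policy("ownership", ("owner", "steward")),
    Policy("classification", ("sensitivity",)),
]


def publish(dataset, policies=CENTRAL_POLICIES):
    """Self-service check run by a federated team.

    Returns a list of policy violations; an empty list means the
    data set meets every central policy and can be published.
    """
    violations = []
    for policy in policies:
        missing = [f for f in policy.required_fields if f not in dataset.metadata]
        if missing:
            violations.append(f"{policy.name}: missing {', '.join(missing)}")
    return violations
```

In this sketch the central team changes only `CENTRAL_POLICIES`, while each line of business calls `publish` on its own data sets without routing the request through the central team.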

‘Sloped governance’ for the digital era

With the right tools and platforms in place, Capital One’s federated data governance teams can focus on providing services and policies tailored to the use cases and data specific to their lines of business.

The tailoring doesn’t stop there. Capital One takes what it calls a “sloped governance” approach, in which the level of governance and the controls around access and security vary with the data, Gurram says.

“You should give the flexibility to your partner teams so that they can apply the policies that they need to on behalf of the lines of business,” he says. “You should not have one set of rules for everyone. It’s not like one set of rules are applicable for each and every data set.”
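One way to picture sloped governance is as a mapping from a data set's sensitivity to the controls it must carry, with room for each line of business to add its own. The sketch below assumes three sensitivity tiers and a handful of control names; these are hypothetical choices for illustration, since the article does not say how Capital One classifies its data or which controls it applies.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """Hypothetical sensitivity tiers; higher values mean stricter governance."""
    PUBLIC = 1
    INTERNAL = 2
    RESTRICTED = 3


# Hypothetical controls per tier: each stricter tier adds to the one below it.
CONTROLS_BY_TIER = {
    Sensitivity.PUBLIC: {"catalog_entry"},
    Sensitivity.INTERNAL: {"catalog_entry", "access_review"},
    Sensitivity.RESTRICTED: {"catalog_entry", "access_review", "encryption", "audit_log"},
}


def required_controls(sensitivity, extras=frozenset()):
    """Controls for a tier, plus any extras a line of business chooses to add."""
    return CONTROLS_BY_TIER[sensitivity] | set(extras)


def missing_controls(sensitivity, applied, extras=frozenset()):
    """Return, in sorted order, the required controls not yet applied."""
    return sorted(required_controls(sensitivity, extras) - set(applied))
```

A public data set here needs only a catalog entry, while a restricted one needs four controls, so the governance burden rises with the sensitivity of the data rather than applying one rule set to everything.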

Trying to force a single policy on all of your organization’s data is a common route to one of the more dreaded terms in data governance: “overhead.”

Data governance is often seen as a cost center — and thus as overhead. But Gurram stresses that when properly planned and implemented, the benefits far outstrip the costs.

“What I recommend for anyone going through this journey is don’t look at it as overhead,” he says. “Don’t look at it as a patchwork quilt. Don’t look at it as a project. Look at it holistically and focus on the outcome when you’re selling this to your business.”

The benefits of data governance include improved compliance with data regulations, yes, but also:

  • Better, more comprehensive decision support stemming from consistent, uniform data across the organization
  • Clear rules for changing processes and data that help the business and IT become more agile and scalable
  • Reduced costs in other areas of data management through the provision of central control mechanisms
  • Increased efficiency through the ability to reuse processes and data

If you’re working on implementing a data governance framework for your organization, Gurram says the best place to start is with a holistic approach focused on the goals you’re trying to achieve. Don’t try to fit your policies to the data that you have — that will lead to a patchwork quilt of fragmented policies.

“Don’t bring in lots of data and then try to figure it out and write some rules,” he says. “Build a vision and publish your data based on those policies. That’s much easier than doing this patchwork quilt.”

Gurram advises asking yourself: Do we have the right data platforms so we can implement the best data governance? Do we have the right tools to make it easier for our users? Do we have the right talent in place so we can build this seamlessly? Does everyone understand their roles and responsibilities?

“If you think about these questions and then come up with the strategy, it’s easy to implement,” Gurram says. “You’re going to build toward the end goal. If you don’t focus on the outcome, it will be very difficult to implement.”