KeyBank calculates the cost of analytics in the cloud

With its on-premises capacity filling up, KeyBank is pivoting its analytics efforts to the cloud — a shift that will mean big changes to how its users work, given the variable costs of queries and egress.

With its on-premises analytics infrastructure hitting capacity, Cleveland, Ohio-based KeyBank has turned to the cloud. The large regional bank believes the move will provide clear performance benefits and likely cost savings, but it will also require rethinking how the company trains and manages its users.

The bank processes about 4 billion records each night. The data is loaded into a Hadoop data lake and then pushed down to more than 40 downstream systems, including 10 to 12 data marts running on Teradata. "It's a conventional on-prem architecture that would be current today," says Mike Onders, chief data officer, divisional CIO, and head of enterprise architecture at KeyBank. "We have over a petabyte of data in the Hadoop data lake environment and over 30 terabytes in the Teradata environment."

The system, which serves 400 SAS and Teradata users and 4,000 Tableau users, works well, but a little more than a year ago KeyBank’s Teradata appliances started reaching capacity.

"The engineered hardware itself still does what it was supposed to do: high-performance analytics," Onders says. "But in an on-prem architecture, you govern capacity. You're holding capacity steady and so performance will vary based on the loads on the box." For KeyBank this meant performance and queuing issues when running month-end and quarter-end jobs.

Moreover, Onders’ team projected that KeyBank would need to refresh its Teradata environment in 2021 — an inevitability KeyBank wanted to avoid. That's when Onders and his team decided to explore whether moving the bank’s analytics to the cloud would be a better choice.
