Don’t Let Data Bottlenecks Cripple Your AI Solutions

The digital world is generating data at unimaginable rates. But AI workloads can overwhelm the capacities of legacy infrastructures.


A number of factors have aligned in recent years to move artificial intelligence (AI) out of research labs and into high-demand commercial products and services. Decades of developing and fine-tuning AI algorithms and models have helped drive this transition, as has the rapid rise in computing power and cloud-based services.

And today, AI has a critical role to play in handling digital data—and lots of it. Whether used for training AI systems to perform their designated roles or for feeding those systems once in production, massive volumes of data are AI’s required fuel.

“Without data, getting the AI engine started is impossible,” the McKinsey Global Institute stated succinctly in its report, “Artificial Intelligence: The Next Digital Frontier.” 

“With the help of AI, our company can improve the customer experience, augment employee performance, automate work processes, and develop intelligent agents to help with a lot of repetitive business processes,” according to Evolution – 2018 Global Report: The Artificial Intelligence Imperative, which presents the findings from a survey of 2,300 global business and IT leaders by MIT Technology Review Insights, in association with Pure Storage.

On the face of it, this data requirement might not appear particularly problematic. After all, the digitization of everything from business operations to social interactions has generated data volumes unimaginable a decade or two ago. In recent years, the emergence of the Internet of Things (IoT) has turbocharged what was already a breakneck data growth rate.

In 2013, research firm IDC predicted that the total volume of digital data created worldwide would reach 4.4 zettabytes by 2020. Three years later, after factoring in the emerging IoT phenomenon, IDC predicted that the generation of digital data would hit a mind-boggling 180 zettabytes in 2025. (A zettabyte is equal to 1 trillion gigabytes.)

So the problem, obviously, isn’t a lack of digital data. Rather, it’s that AI systems demand data that is clean, accurate, and up-to-date, and they often require huge amounts of it near instantaneously. Many AI-based systems and techniques – from machine learning to natural language understanding to image and video analysis – use massively parallel operations, each of which requires its own high-volume data feed.

Legacy data storage devices and infrastructures simply aren’t up to the task of supplying the amount of data – much of it unstructured – that AI operations can consume. If the data infrastructure feeding those processes can’t keep up, data bottlenecks will occur. AI-generated insights, warnings, and recommendations will then be slow to arrive.
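To make the bottleneck concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes a simple model in which every parallel worker needs a fixed data rate and the storage system delivers a fixed aggregate throughput. The function names and all figures (64 workers, 0.5 GB/s per worker, 4 GB/s and 32 GB/s of storage throughput) are hypothetical and chosen only for illustration; they are not measurements of any real system.

```python
def feed_utilization(workers: int, demand_gb_s_per_worker: float,
                     storage_gb_s: float) -> float:
    """Fraction of the requested data rate that storage can deliver.

    Simplified model (an assumption): demand scales linearly with the
    number of workers, and storage throughput is a single fixed number.
    """
    total_demand = workers * demand_gb_s_per_worker
    if total_demand <= 0:
        return 1.0  # nothing is requested, so nothing is starved
    return min(1.0, storage_gb_s / total_demand)


def max_fed_workers(demand_gb_s_per_worker: float, storage_gb_s: float) -> int:
    """Largest number of workers the storage can keep fully supplied."""
    return int(storage_gb_s // demand_gb_s_per_worker)


if __name__ == "__main__":
    # Hypothetical workload: 64 parallel workers at 0.5 GB/s each (32 GB/s total).
    for storage in (4.0, 32.0):
        share = feed_utilization(64, 0.5, storage)
        print(f"storage {storage} GB/s -> workers receive {share:.1%} of demand")
    print("workers fully fed at 4 GB/s:", max_fed_workers(0.5, 4.0))
```

Under these assumed numbers, the slower storage tier delivers only one-eighth of what the workers request, so most of the compute sits idle waiting for data; matching storage throughput to aggregate demand removes the stall.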

Like AI itself, the data infrastructure supporting it must be highly parallel and extremely high speed. To address the requirements of machine learning and other AI and big-data analytics workloads, Pure Storage purpose-built its FlashBlade™ platform with massive parallelism at its core. The result: FlashBlade can provide up to 17 gigabytes of data per second and support tens of thousands of clients and tens of billions of objects and files.

AI is already proving its ability to deliver significant business benefits, including increased automation, reduced costs, critical insights, and competitive differentiation. These and other benefits can be constrained or even lost, however, if an organization tries to deploy cutting-edge AI solutions on an aging and inflexible data infrastructure.

For a closer look at Pure Storage’s FlashBlade platform, visit https://www.purestorage.com/products/flashblade.html.

Copyright © 2018 IDG Communications, Inc.