Big Data has grown from one of the most hyped terms of recent years into a core part of how many businesses operate. Data analytics helps determine everything from which delivery route is best, to which shade of red to use on packaging, to which employees to hire and promote.
But as Big Data becomes an ever bigger part of how companies are managed, it presents a growing challenge to IT operations responsible for storing, processing, and delivering that data where and when it’s needed. One answer to that challenge is programmable infrastructure, such as software-defined networking (SDN). Here’s why:
Big Data needs robust infrastructure.
Most of the infrastructure discussion around Big Data thus far has centered on the storage issues it can raise. Though large amounts of storage seem to become more affordable by the second, the growing storage needs of Big Data have outpaced those dropping prices.
But the ability to move data is nearly as important, and as challenging, as the ability to store it. Data hosted in a cloud configuration, or across several clouds in an Intercloud, needs a fast and robust infrastructure so that it can flow freely as needed.
We need to process—not store.
Many organizations have approached Big Data with the assumption that if they retain as much data as they can gather, it will prove useful in unexpected ways. That’s turned out to be perfectly true. Target never guessed when it created registries for expectant mothers that a few years later data analysts could use that information to identify pregnant women in their first trimesters.
But if “let’s keep it all!” has proved to be an effective approach to Big Data, it’s increasingly impossible to sustain. Enterprise IT departments are facing the fact that they need to put algorithms in place to determine which data to keep and which to discard, and do it in real time. That will require more powerful processing and faster infrastructure, and increasingly those elements will be as important as storage.
Big Data needs to work in real time.
While there are certainly situations in which thoughtful analysis after the fact is a valuable use of Big Data, data scientists are increasingly looking to put data to real-time use: for instance, by analyzing incoming signals to quickly locate a breakdown on a factory floor, or to identify customer-service problems as they happen. Data doesn’t just need to be big; it also needs to be fast.
This is where programmable infrastructure can make a significant difference, because it allows IT organizations to easily give priority to essential data, automatically tune the network to avoid bottlenecks, and change rules on the fly as data and processing needs require. And when organizations are seeking to address every single customer complaint about a salad or other food item, anything that slows that data down is a real problem.
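The idea of giving priority to essential data can be sketched as a toy flow scheduler. The flow classes and priority numbers here are invented for illustration; in a real SDN deployment, a controller would push comparable priorities down to switches as flow rules rather than scheduling in application code.

```python
import heapq

# Hypothetical flow classes: lower number = dispatched first.
PRIORITIES = {"customer-feedback": 0, "analytics": 1, "batch-backup": 2}

class FlowScheduler:
    """Toy priority scheduler standing in for SDN queue assignment."""
    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker preserves arrival order within a class

    def enqueue(self, flow_class: str, packet: str) -> None:
        prio = PRIORITIES.get(flow_class, 3)  # unknown flows go last
        heapq.heappush(self._heap, (prio, self._seq, packet))
        self._seq += 1

    def dispatch(self) -> str:
        """Pop the highest-priority packet waiting in the queue."""
        return heapq.heappop(self._heap)[2]

sched = FlowScheduler()
sched.enqueue("batch-backup", "nightly-dump")
sched.enqueue("customer-feedback", "salad-complaint")
print(sched.dispatch())  # salad-complaint jumps ahead of the backup
```

Because the priority table is just data, changing the rules on the fly amounts to updating it, which is the programmability the article is describing.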
IoE is upping the ante.
With the arrival of the Internet of Everything, IT departments clinging to the old let’s-save-everything model will be forced to rethink those strategies. Interconnected devices, machine-to-machine (M2M) communications, and other elements of IoE will force the last holdouts to begin processing data in real time. In a world where people, processes, mobile devices, and machines are all connected, fast infrastructure based on centralized programming will be an absolute necessity.
The reverse is also true. If Big Data depends on a robust and fast-moving network to be used to its full potential, it’s equally true that programmable infrastructure relies on Big Data to deliver its maximum benefits. Here’s why:
Optimum network performance requires data analysis in real time.
Getting the most benefit from a programmable infrastructure implementation requires data analytics throughout the entire process. Sophisticated analysis of how data is traveling through the network—in real time—helps software-defined networks tune themselves for optimum efficiency on the fly.
Analysis after the fact will help us build healthier networks.
In addition to helping programmable networks fulfill their promise of greater efficiency by analyzing traffic flow in real time, analysis of collected data on network traffic flow patterns is essential for further improving performance. IT organizations can use this information to write rules that optimize the network even more, squeezing the maximum possible benefit out of programmable infrastructure deployment.
…Until eventually the infrastructure can figure it out for itself.
The future promise of programmable infrastructure is that it will soon be able to program itself. Human operators will only need to state their intent for the network’s performance, identifying which data and processes should jump to the head of the line and which can take a bit longer, and the autonomic network will make the necessary adjustments to deliver what the organization needs.
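One way to picture intent-based operation is a compiler from a declared intent to concrete flow rules. Everything below is a hypothetical sketch: the intent format, the bandwidth headroom figure, and the queue names are assumptions, chosen only to show the operator stating *what* matters while the network derives *how*.

```python
# Hypothetical intent-to-rules translation: the operator declares which
# flow classes are urgent; the controller combines that with observed
# traffic shares to produce per-class rules.
def compile_intent(intent: dict, observed_load: dict) -> list[dict]:
    """Turn a high-level intent plus observed load into flow rules,
    granting urgent classes extra bandwidth headroom."""
    rules = []
    for flow_class, share in observed_load.items():
        urgent = flow_class in intent["urgent"]
        rules.append({
            "match": flow_class,
            "queue": "fast" if urgent else "best-effort",
            # urgent classes get headroom above their observed share
            "min_bandwidth_pct": min(100, share + 20) if urgent else share,
        })
    return rules

intent = {"urgent": {"fraud-alerts"}}
observed = {"fraud-alerts": 5, "log-shipping": 40}
for rule in compile_intent(intent, observed):
    print(rule["match"], rule["queue"])
```

Note that the `observed_load` input is exactly the Big Data feedback the next paragraph describes: without measured traffic, the network has nothing to compile the intent against.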
It’s a future that’s only possible as the network uses Big Data to discover how traffic flows, and a perfect example of programmable infrastructure and Big Data benefiting each other.