As the big data conference season winds down for 2016, I believe one of the key trends prevalent across the board is that the Apache Hadoop platform has made great strides in maturing within the enterprise.
Early on, Hadoop was adopted principally by organizations with “bleeding edge” use cases, such as sentiment analysis, or those that needed predictive/prescriptive analytics at massive scale. Today’s landscape looks notably different: the average Hadoop cluster is more likely to be augmenting an enterprise data warehouse than to be locked away in a lab.
This year at Dell EMC World 2016, October 18–20 in Austin, and at the Dell EMC Customer Solution Centers, we’ll be talking to our customers about some of the key technology improvements that Hadoop has seen in the recent past. These enhancements help make Hadoop even more ready for primetime in line-of-business applications.
Hadoop Platform Matures
Let’s look at a few of the ways in which the Hadoop platform is maturing within the enterprise.
The infrastructure that makes up the average Hadoop cluster is deviating from the classic “lots of cheap disks” designs we saw over Hadoop’s first decade. As the workloads Hadoop carries become more central to delivering services, speed and throughput have become areas of investment. Across the board, flash memory prices have become so attractive that we’ve seen flash implemented first as host-local scratch space and then move into tiered and even primary storage roles. Customers are also taking advantage of the physical density flash affords in order to hit node capacity requirements in an ever-shrinking footprint.
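HDFS itself supports this kind of tiering through heterogeneous storage policies (available since Hadoop 2.6). As a minimal sketch, assuming a cluster whose DataNode volumes are already tagged as flash or spinning disk (the paths and dataset names below are illustrative):

```shell
# In hdfs-site.xml, each DataNode volume is tagged with its media type, e.g.:
#   <property>
#     <name>dfs.datanode.data.dir</name>
#     <value>[SSD]/mnt/flash/dn,[DISK]/mnt/hdd1/dn,[DISK]/mnt/hdd2/dn</value>
#   </property>

# List the storage policies the cluster supports (HOT, WARM, ALL_SSD, ONE_SSD, ...)
hdfs storagepolicies -listPolicies

# Pin a hot dataset's replicas to flash; keep one replica of warm data on SSD
hdfs storagepolicies -setStoragePolicy -path /data/hot  -policy ALL_SSD
hdfs storagepolicies -setStoragePolicy -path /data/warm -policy ONE_SSD

# Verify the policy applied to a path
hdfs storagepolicies -getStoragePolicy -path /data/hot
```

Because policies apply per directory, the same cluster can serve latency-sensitive tables from flash while bulk archive data stays on inexpensive spinning disks.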
The network that links the nodes together is also going through massive changes. Once a couple of bonded network interfaces optimized around cost per port, data node connectivity is now commonly teamed 25/40/100GbE links that provide enough throughput to cut data movement times as well as disk and node rebuild times during failures.
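On Linux hosts, this kind of teaming is typically done with the kernel bonding driver. A sketch of aggregating two interfaces into an LACP (802.3ad) bond, assuming illustrative interface names and addressing and a switch configured for LACP:

```shell
# Create an LACP bond with link monitoring every 100 ms
ip link add bond0 type bond mode 802.3ad miimon 100

# Enslave the two data-node interfaces (names are assumptions; check `ip link`)
ip link set ens1f0 down
ip link set ens1f0 master bond0
ip link set ens1f1 down
ip link set ens1f1 master bond0

# Bring the bond up and assign the data-node address (illustrative)
ip link set bond0 up
ip addr add 10.0.0.11/24 dev bond0

# Inspect negotiated LACP state and per-slave status
cat /proc/net/bonding/bond0
```

In practice this would be made persistent through the distribution’s network configuration (ifcfg files, netplan, or NetworkManager) rather than run by hand, but the bonding mode and monitoring parameters are the same.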
Taking A Simplified Approach
However, it’s not just the customers chasing the fastest or coolest car on the block who are responsible for Hadoop’s maturity in the enterprise. Many customers are taking a vastly simplified approach to adoption, repurposing their staff to support data and workloads on the front end rather than worrying about the speeds/feeds or slots/watts of infrastructure. Given limited personnel resources, it’s better to keep the folks who know your business and your data working at that tier and get the underlying infrastructure off their plates. Widespread adoption of pre-engineered and validated systems is key to getting this done.
Since customers vary in budget, complexity and scale, many offerings cover the spectrum of needs, from a little bit of help (through a reference architecture) to a lot of help (through a comprehensive managed solution). Data analytics maturity comes only when your team can spend its hours at the data layer, so reducing the time and risk of getting there is of paramount importance.
At Dell EMC’s Customer Solution Centers, we engage with customers to help demonstrate the value of the Dell Technologies portfolio. Whether through a simple briefing, a more complex whiteboarding session, or even a fully supported proof of concept, our team of solution experts builds our customers’ confidence in Dell EMC as the right partner on their path to becoming a mature data-driven company.
Kris Applegate is a cloud and big data solution architect for Dell EMC Customer Solution Centers.
Copyright © 2016 Dell Inc. or its subsidiaries. All Rights Reserved. Dell, EMC, and other trademarks are trademarks of Dell Inc. or its subsidiaries. Other trademarks may be trademarks of their respective owners.