Data analytics isn’t a destination. It’s a journey.
By Attila Finta
At Dell EMC, we advise our customers that data analytics isn’t a destination. It’s a journey that proceeds through a series of phases that build on each other. Or, to use another metaphor, we liken the adoption of data analytics in an enterprise to an evolutionary process, in which an organization begins with the launch of a limited environment that scales up and matures in a natural manner. Over time, what started out as a few narrowly targeted use cases gives way to the widespread use of analytics across an enterprise that grows increasingly data-driven.
That was certainly the experience that we had in the internal IT organization at the heritage Dell company, and now at Dell EMC. Early on, we put ourselves on the path to becoming a data-driven enterprise when we designed and deployed our first Apache™ Hadoop® cluster. Our first use case was basic. We eased the load on our enterprise data warehouse by archiving data to Hadoop.
From there, the word spread to our business partners. In the months that followed, various business analysts and technologists came to us to develop new solutions in the Hadoop cluster, some of which spawned funded IT projects. Our data analytics environment has since evolved and scaled up to support dozens of use cases and hundreds of users.
Let’s look at a few of these business-driven use cases.
Beginning in 2013, to cost-effectively store and analyze growing volumes of structured and semi-structured data, the Dell IT team integrated the Cloudera distribution of Hadoop into the Enterprise Business Intelligence ecosystem, guided by a Dell (today Dell EMC) reference architecture based on technologies from Dell, Intel and Cloudera. We began with a successful initial use case, optimizing ETL offloads, and then turned our focus to greatly accelerating database queries while reducing data storage costs. This effort has paid off handsomely, with data storage costs falling from $25,000 – $50,000 per terabyte to just $1,000 – $2,000 per terabyte, a reduction of more than 90 percent. Watch the video.
The marketing team walked a similar path. They worked with Dell IT to develop a marketing analytics solution that uses predictive modeling to help our sales team leverage customer data to offer the right products and the best experiences to customers. To cost-effectively store and analyze the data, we built a data lake for analytics hosted on Intel-based Dell servers and a Cloudera distribution of Hadoop. This highly scalable platform helped reduce query response times from three weeks to just two hours, and cut operating expenses by $4 million in the first year alone. Read the case study.
As our environment evolved, we expanded access to our data lake by adding a self-service data preparation tool built on Hadoop. This gave business users and data scientists a workbench with which they could bring new data into Hadoop on their own, integrate and massage it, and then run analytics on it. People began beating a path to our door, and we soon had scores of users loading, analyzing and accessing their datasets.
With the announcement of the merger between Dell and EMC, our human resources teams joined us in our data analytics journey. They used the analytics workbench to consolidate HR data from the two companies in the data lake, and then began the process of rationalizing job titles, evaluating compensation models and looking for ways to optimize the combined Dell EMC workforce to increase productivity.
If you take a step back and look at the bigger picture, you can see that business users at Dell EMC are becoming ever more sophisticated and creative in their use of data analytics. They no longer want to simply generate reports. They want to apply advanced analytics to gain deeper insights: for example, using techniques like nPath analysis to connect the dots in clickstream and system log data, and interactive Sankey visualizations to surface patterns that would otherwise be impossible to see.
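To make the idea of connecting the dots in clickstream data a little more concrete, here is a minimal sketch in plain Python. It is not nPath itself and not the tooling our teams use. It simply counts how often visitors move from one page to the next, which is the kind of source-to-target tally a Sankey diagram draws as flows. The function name `count_transitions`, the page names and the sample sessions are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def count_transitions(sessions):
    """Count page-to-page transitions across ordered clickstream sessions."""
    transitions = Counter()
    for pages in sessions:
        # Pair each page with the page visited immediately after it.
        for source, target in zip(pages, pages[1:]):
            transitions[(source, target)] += 1
    return transitions


# Hypothetical sample data: each list is one visitor's pages in visit order.
sessions = [
    ["home", "laptops", "cart", "checkout"],
    ["home", "laptops", "support"],
    ["home", "servers", "cart"],
]

transitions = count_transitions(sessions)

# List the flows from most to least common.
for (source, target), count in transitions.most_common():
    print(f"{source} -> {target}: {count}")
```

Each `(source, target)` pair and its count could then be handed to a charting library as the links of a Sankey diagram. Real clickstream data would first need to be grouped into sessions and ordered by timestamp, which this sketch assumes has already been done.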
Our business users and web analytics teams are now expanding their use of data science tools with machine learning algorithms that leverage a web shopper’s “digital fingerprint” (such as IP address, browser and operating system) to better discern who the customer is and what the customer is likely to be interested in, based on data enrichment, the customer’s past visits and clickstream analysis. The system then personalizes the customer experience with tailored web pages that are generated in real time. This personalization is intended to improve the customer experience on the front end while also optimizing campaign spend on the back end and enabling more direct account-based marketing engagement.
This is all part of becoming a data-driven enterprise. It is a shift that doesn’t happen overnight with the implementation of a data analytics platform. It happens over time, as an organization progresses on its data analytics journey.
For a closer look at ways your organization can leverage your data to transform your business, visit
Attila Finta is the IT Director for Architecture, Enterprise Business Intelligence & Analytics at Dell Technologies.