At this year’s Gartner Business Intelligence (BI) and Analytics Summit, three themes resonated:
- Self-service analytics is white hot and growing, while demand for traditional dashboard BI is in decline.
- BI on Big Data (i.e., Hadoop-based and outside of the data warehouse) is a dynamic new class of problem that requires a new class of solution.
- Today’s buyers are increasingly coming from the business side of the house and not from corporate IT, which is moving the center of gravity from the hub to the spokes.
So what do these three trends mean for corporate IT? For achieving a single version of the truth? For enterprise data lakes and for cost control?
In the self-service paradigm, “power users” trump portal users. Tools are analytic-centric rather than reporting-centric. Business discovery supersedes information delivery. Semantic layer-free data exploration and rapid prototyping are where the action is. According to Gartner, revenue growth for BI tools is slight and IT budgets are flat. Looking closer, however, traditional BI tools are approaching no-growth while growth for data visualization and business discovery tools is in the double digits. This bimodal scenario reflects the exploding number of data-savvy knowledge workers and the desire to eliminate the IT bottleneck.
Self-service analytic tools allow power users to quickly explore, blend, and visualize data from disparate sources. The results produce new business insights and validate the business data requirements that support application development and data management. Consequently, business-owned and operated data islands are forming that include data from enterprise data warehouses and from big external sources such as web logs, industry hubs, social media, and sensors.
At Gartner’s annual conference, “Big Data Discovery” was trumpeted as something new; however, I wonder whether we’re talking about something new or simply about analytics on Hadoop as opposed to a relational database management system (RDBMS). It seems to me that, while there are certainly use cases dealing with billions of observations, the “Big Data” moniker more often points to the storage system than to the data volume, velocity, and variety. In any case, there’s a hot sub-market of tools for this category, and clearly they’re focused on the business buyer as much as, or even more than, the IT buyer.
This brings me back to the question, what does this mean for corporate IT? For a single version of the truth? For enterprise data lakes? For cost control?
All things being equal, self-service analytics on proliferated data islands leads to multiple versions of the truth and duplicated spending. Hmm, haven’t we spent the last two decades working in the other direction with enterprise data warehouses? Is the pendulum swinging in the direction of analytics empowerment and reduced time-to-answer and away from cost control and data quality management?
This is the traditional struggle between centralization and decentralization, brought to our analytics niche by the radical cost reductions of open source software and commodity-priced compute infrastructure. How can IT help its customers have their cake and eat it too?
Gartner thought leader and featured speaker Frank Buytendijk suggested that we look to the business model that cracked the code on optimizing the centralization versus decentralization trade-off: franchising. In that model, the role of corporate is to drive universal standardization to bring down costs, accelerate time-to-market, and improve quality. Franchise owners run their own businesses based on standardized processes and corporate supply chain economies of scale. Think McDonald’s. In our domain, the McDonald’s mentality implies standardization of tools and enterprise licensing to drive down costs, tool-specific skilling to create larger pools of skilled workers who can be shared across projects, and centralized provisioning of compute infrastructure to save time and money. At a more nuanced layer, standardization extends to reusable algorithms, data quality management methods, shared data lineage repositories, and data-as-a-service data provisioning.
But let’s not forget corporate-owned and operated enterprise data lakes to support business users’ self-service BI. Hadoop’s schema-less write capabilities enable quick, cheap, and very large scale data lakes that empower the business’s self-service-inclined data geeks. The cost of these data lakes is low and the planning overhead is light. If corporate IT doesn’t get there first, business units will, over and over again. So what are we waiting for? Let’s go!