Along with social, mobile and cloud, analytics and associated data technologies have earned a place as one of the core disruptors of the digital age. 2016 saw big data technologies increasingly leveraged to power business intelligence. Here’s what 2017 holds in store for the data and analytics space.
John Schroeder, executive chairman and founder of MapR Technologies, predicts the following six trends will dominate data and analytics in 2017:
- Artificial intelligence (AI) is back in vogue.
In the 1960s, Ray Solomonoff laid the foundations of a mathematical theory of AI, introducing universal Bayesian methods for inductive inference and prediction. In 1980 the First National Conference of the American Association for Artificial Intelligence (AAAI) was held at Stanford and marked the application of theories in software. AI is now back in mainstream discussions and is the umbrella buzzword for machine intelligence, machine learning, neural networks and cognitive computing, Schroeder says. Why is AI a rejuvenated trend? Schroeder points to the three Vs often used to define big data: Velocity, Variety and Volume.
Platforms that can process the three Vs with modern and traditional processing models that scale horizontally provide 10-20X cost efficiency over traditional platforms, he says. Google has documented how simple algorithms executed frequently against large datasets yield better results than other approaches using smaller sets. Schroeder says we’ll see the highest value from applying AI to high-volume repetitive tasks, where consistency is worth more than intuitive human oversight that comes at the price of human error and cost.
- Big data for governance or competitive advantage. In 2017, the governance vs. data value tug of war will be front and center, Schroeder says. Enterprises have a wealth of information about their customers and partners. Leading organizations will manage their data between regulated and non-regulated use cases. Data for regulated use cases requires governance, data quality and lineage so that a regulatory body can report on and track data through all transformations back to its originating source. Schroeder says this is mandatory and necessary but limiting for non-regulatory use cases like customer 360 or offer serving, where higher cardinality, real-time processing and a mix of structured and unstructured data yield more effective results.
- Companies focus on business-driven applications to keep data lakes from becoming swamps. In 2017 organizations will shift from the “build it and they will come” data lake approach to a business-driven data approach, Schroeder says. Today’s world requires analytics and operational capabilities to address customers, process claims and interface with devices in real time at an individual level. For example, any ecommerce site must provide individualized recommendations and price checks in real time. Healthcare organizations must process valid claims and block fraudulent claims by combining analytics with operational systems. Media companies are now personalizing content served through set-top boxes. Auto manufacturers and ride-sharing companies are interoperating at scale with cars and their drivers. Delivering these use cases requires an agile platform that can provide both analytical and operational processing to increase value from additional use cases that span from back-office analytics to front-office operations. In 2017, Schroeder says, organizations will push aggressively beyond an “asking questions” approach and architect to drive initial and long-term business value.
- Data agility separates winners and losers. Software development has become agile, with DevOps providing continuous delivery, Schroeder says. In 2017, processing and analytic models will evolve to provide a similar level of agility as organizations realize that data agility, the ability to understand data in context and take business action, is the source of competitive advantage, rather than simply having a large data lake. The emergence of agile processing models will enable the same instance of data to support batch analytics, interactive analytics, global messaging, database and file-based models, he says. More agile analytic models are also enabled when a single instance of data can support a broader set of tools. The end result is an agile development and application platform that supports the broadest range of processing and analytic models.
- Blockchain transforms select financial service applications. In 2017, there will be select, transformational use cases in financial services that emerge with broad implications for the way data is stored and transactions are processed, Schroeder says. Blockchain provides a global distributed ledger that changes the way data is stored and transactions are processed. The blockchain runs on computers distributed worldwide, where the chains can be viewed by anyone. Transactions are stored in blocks, and each block refers to the preceding block. Blocks are timestamped, storing the data in a form that cannot be altered. Hacking the blockchain is theoretically impossible, since the whole world has a view of the entire blockchain. Blockchain provides obvious efficiency for consumers. For example, customers won’t have to wait for that SWIFT transaction or worry about the impact of a central datacenter leak. For enterprises, blockchain presents cost savings and an opportunity for competitive advantage, Schroeder says.
- Machine learning maximizes microservices impact. This year we will see activity increase for the integration of machine learning and microservices, Schroeder says. Previously, microservices deployments have been focused on lightweight services, and those that do incorporate machine learning have typically been limited to “fast data” integrations that were applied to narrow bands of streaming data. In 2017, we’ll see development shift to stateful applications that leverage big data, and the incorporation of machine learning approaches that use large amounts of historical data to better understand the context of newly arriving streaming data.
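To make the blockchain item above concrete: the idea that each block refers to the preceding block and carries a timestamp can be sketched as a simple hash chain. This is a minimal illustration only, not how any production blockchain is implemented. It leaves out distribution, consensus and signatures, and the names `block_hash`, `add_block` and `is_valid`, as well as the block fields, are hypothetical choices made for this example.

```python
import hashlib
import json
import time


def block_hash(block):
    """Hash the block's contents, including the previous block's hash."""
    payload = json.dumps(
        {k: block[k] for k in ("index", "timestamp", "transactions", "prev_hash")},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def add_block(chain, transactions, timestamp=None):
    """Append a timestamped block that refers to the preceding block."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    block = {
        "index": len(chain),
        "timestamp": time.time() if timestamp is None else timestamp,
        "transactions": transactions,
        "prev_hash": prev_hash,
    }
    block["hash"] = block_hash(block)
    chain.append(block)
    return block


def is_valid(chain):
    """Check every stored hash and every link to the preceding block."""
    for i, block in enumerate(chain):
        if block["hash"] != block_hash(block):
            return False
        expected_prev = chain[i - 1]["hash"] if i > 0 else "0" * 64
        if block["prev_hash"] != expected_prev:
            return False
    return True


chain = []
add_block(chain, [{"from": "alice", "to": "bob", "amount": 10}])
add_block(chain, [{"from": "bob", "to": "carol", "amount": 4}])
print(is_valid(chain))  # True

chain[0]["transactions"][0]["amount"] = 1000  # tamper with history
print(is_valid(chain))  # False
```

Because every block's hash covers the previous block's hash, changing an old transaction invalidates that block and breaks the link from every block after it, which is why anyone holding a copy of the chain can detect the change.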
Hadoop distribution vendor Hortonworks predicts:
- Intelligent networks lead to the rise of data clouds. As connections continue to evolve thanks to the Internet of Anything (IoAT) and machine-to-machine connectivity, silos of data will be replaced by clouds of data, Hortonworks says.
- Real-time machine learning and analytics at the edge. Smart devices will collaborate and analyze what one another is saying, Hortonworks says. Real-time machine-learning algorithms within modern distributed data applications will come into play: algorithms that are able to adjudicate ‘peer-to-peer’ decisions in real time.
- More preemptive analytics: from post-event to real-time and pre-event analysis and action. We will begin to see a move from post-event and real-time to preemptive analytics that can drive transactions instead of just modifying or optimizing them, Hortonworks says. This will have a transformative impact on the ability of a data-centric business to identify new revenue streams, save costs and improve its customer intimacy.
- Ubiquity of connected modern data applications. For enterprises to succeed with data, apps and data need to be connected via a platform or framework, Hortonworks says. This is the foundation for the modern data application in 2017. Modern data applications are highly portable, containerized and connected. They will quickly replace vertically integrated monolithic software.
- Data will be everyone’s product. Data will become a product with value to buy, sell or lose, Hortonworks says. There will be new ways, new business models and new companies looking at how to monetize that asset.
DataStax, which develops and supports a commercial version of the open-source Apache Cassandra NoSQL database, predicts:
- The emergence of the data engineer. The term “data scientist” will become less relevant and will be replaced by “data engineer,” DataStax says. Data scientists focus on applying data science and analytic results to critical business issues. Data engineers, on the other hand, design, build and manage big data infrastructure. They focus on the architecture and keeping systems performing.
- Security: Growth of IoT leads to blurred lines. IoT’s growth has largely gone unchecked, DataStax says. With a lack of standards and an explosion of data, it isn’t entirely clear who is responsible for securing what. Most at risk are ISPs, which is why we’ll see these providers take a leading role in the security conversation in the year ahead, DataStax says.
- Hybrid wins, thanks to certain enterprise-ready cloud applications. It is becoming clear that many large organizations that have built their databases on legacy platforms would rather pull out their teeth than switch, DataStax says. Hybrid data architectures that encompass legacy databases, yet allow organizations to take advantage of cloud applications, will be a major focus for these organizations.
- Cutting ties thanks to serverless architectures. DataStax believes the move to serverless architectures (applications that depend on third-party applications or services in the cloud to manage server-side logic and state, or that run in stateless compute containers that are event-triggered) will become more widespread in the coming years. The adoption of serverless architectures will have a widespread impact on how applications are deployed and managed.
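As a rough illustration of the "stateless, event-triggered" part of that description, the sketch below shows a function that keeps nothing between invocations and derives its entire response from the incoming event. The event shape (a dict with a JSON `body`) and the name `handle_event` are assumptions made for this example and do not correspond to any particular cloud provider's API.

```python
import json


def handle_event(event):
    """Stateless handler: everything it needs arrives in the event.

    The event layout here is hypothetical, chosen only for illustration.
    """
    order = json.loads(event["body"])
    total = sum(item["price"] * item["quantity"] for item in order["items"])
    return {
        "status": 200,
        "body": json.dumps({"order_id": order["order_id"], "total": total}),
    }


# Simulate the platform invoking the function when an event arrives.
event = {
    "body": json.dumps(
        {
            "order_id": "A-1",
            "items": [
                {"price": 5, "quantity": 2},
                {"price": 3, "quantity": 1},
            ],
        }
    )
}
response = handle_event(event)
print(response["status"], json.loads(response["body"])["total"])  # 200 13
```

Because the handler holds no state of its own, the platform is free to start, stop or replicate the container that runs it, and any state that must persist lives in an external service.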