Are your digital transformation initiatives putting your organization into data debt? Data analytics startup Dremio, still in stealth mode, this week released a free Big Data Debt Calculator to help you estimate the unplanned costs that stem from the use of non-relational data management technologies.
The data debt problem has grown as organizations undergo digital transformation initiatives, embracing non-relational technologies that help them become more agile and deliver always-on availability to their users. Software-as-a-service (SaaS) applications, cloud database offerings, NoSQL and related technologies are providing a new way to create or deliver next-generation applications. But while they reduce time-to-market, the critical business data these systems create is largely incompatible with traditional analytical workflows.
These new applications are helping organizations address strategic business initiatives, but they also create the need for new tools.
“All data has a lifecycle, and for almost four decades, that lifecycle has been based on the relational model,” said Tomer Shiran, Dremio CEO and co-founder. “The data, the tools for creating and analyzing the data, and the skills for mastering the data have all shared and benefited from a common approach.”
“As the big data era emerged, new technologies were created to support modern data structures and to deliver to the always-connected user,” he added. “These newer systems have an important role in building modern applications, however they produce data that is fundamentally incompatible with existing analytical infrastructures including data warehouses, ETL, BI and data science systems like R and Python. As a result, many organizations are collecting significant data debt.”
Dremio’s new Big Data Debt Calculator is intended to help organizations get their arms around this unplanned debt. Dremio says it offers recommendations for minimizing debt, along with strategies for paying it down and keeping it within acceptable bounds.
The calculator takes into account four inputs:
- The amount of source data, in terabytes, stored in non-relational systems including Hadoop, NoSQL, Amazon S3 and third-party applications
- The number of source systems (not servers; a 50-node Hadoop cluster counts as one system)
- The number of data analysts who use the data
- The number of data scientists who use the data
The calculator also takes two other factors into account:
- Liability costs. Because big data often involves tools and protocols that are less mature than traditional approaches, Dremio says these systems pose a greater liability cost that must be considered. The costs include potential losses resulting from unsecured or ungoverned data moving through pipelines to make it compatible with the tools used by analysts and data scientists.
- Opportunity costs. Dremio notes that moving application data into analytical environments can take a significant amount of time, and reducing this time can have very high costs. The opportunity costs include value that goes unrealized because of prolonged time to insight as data moves through pipelines to reach the tools used by analysts and data scientists.
The calculator uses these inputs to estimate your technology costs, people costs and total big data debt.
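Dremio has not published the formula behind the calculator, so the following Python sketch is only an illustration of how the four inputs and the two cost factors might be combined. Every name in it (`DebtInputs`, `AssumedRates`, `estimate_debt`), the linear per-unit rates, and the choice to add liability and opportunity costs to the total are assumptions made for this example, not Dremio's method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DebtInputs:
    """The four inputs the article lists for the calculator."""
    source_data_tb: float   # terabytes stored in non-relational systems
    source_systems: int     # systems, not servers (one Hadoop cluster = 1)
    data_analysts: int      # analysts who use the data
    data_scientists: int    # data scientists who use the data


@dataclass(frozen=True)
class AssumedRates:
    """Hypothetical per-unit costs; none of these come from Dremio."""
    cost_per_tb: float
    cost_per_system: float
    cost_per_analyst: float
    cost_per_scientist: float
    liability_cost: float = 0.0    # assumed flat amount
    opportunity_cost: float = 0.0  # assumed flat amount


def estimate_debt(inputs: DebtInputs, rates: AssumedRates) -> dict:
    """Return technology cost, people cost and total debt.

    Assumption: each cost scales linearly with its input, and the
    liability and opportunity amounts are simply added to the total.
    """
    technology = (inputs.source_data_tb * rates.cost_per_tb
                  + inputs.source_systems * rates.cost_per_system)
    people = (inputs.data_analysts * rates.cost_per_analyst
              + inputs.data_scientists * rates.cost_per_scientist)
    total = technology + people + rates.liability_cost + rates.opportunity_cost
    return {
        "technology_cost": technology,
        "people_cost": people,
        "total_debt": total,
    }


if __name__ == "__main__":
    # Placeholder figures chosen only to exercise the function.
    example = estimate_debt(
        DebtInputs(source_data_tb=100.0, source_systems=3,
                   data_analysts=10, data_scientists=2),
        AssumedRates(cost_per_tb=10.0, cost_per_system=1000.0,
                     cost_per_analyst=500.0, cost_per_scientist=800.0),
    )
    print(example)
```

Keeping the rates in a separate object makes it plain that the result depends entirely on the figures the reader supplies. The real calculator may weight or combine its inputs quite differently.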