by Kumar Srivastava

Disaster recovery in the age of data and AI

Nov 10, 2017
Artificial Intelligence, Business Continuity, Disaster Recovery

To ensure that data is not lost, and outages are recovered from swiftly and efficiently, enterprises need to invest in high levels of redundancy in their infrastructure.

As data becomes the only real competitive advantage, feeding greater operational efficiency, closer customer intimacy and a constantly improving customer experience, enterprises must shift their disaster recovery efforts. Focusing only on the availability and reliability of services is no longer enough: the data assets themselves must be recoverable and ready to be reintegrated into the data-powered scenarios that back the business.

Thinking beyond raw data DR

Modern enterprises require data in many shapes and forms across the board to power planning, ideation, experimentation and the design and development of new products and services. These business-critical scenarios often depend on data that has been transformed and processed to meet their requirements. As the “distance” between raw data and the transformed data that drives products and services grows with increasingly complex transformation techniques, disaster recovery needs to account not just for the time to bring a copy of the lost data back online but also for the time it takes to retransform that data.
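The point about retransformation time can be made concrete with a little arithmetic. The sketch below is illustrative only: the class name, field names and hour figures are hypothetical, and it assumes that data sets recover in parallel, so the slowest one sets the overall recovery time.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class DerivedDataset:
    """A transformed data set that a product or service depends on (hypothetical)."""
    name: str
    raw_restore_hours: float      # time to bring the raw copy back online
    retransform_hours: float      # time to re-run the transformations on the raw copy
    derived_restore_hours: float  # time to restore a backup of the transformed data

    def hours_until_usable(self, derived_backed_up: bool) -> float:
        # With a backup of the transformed data, only that backup must be restored.
        if derived_backed_up:
            return self.derived_restore_hours
        # Otherwise the raw data must be restored and then transformed again.
        return self.raw_restore_hours + self.retransform_hours


def recovery_hours(datasets: Iterable[DerivedDataset], derived_backed_up: bool) -> float:
    # Assumption: data sets recover in parallel, so the slowest one dominates.
    return max(d.hours_until_usable(derived_backed_up) for d in datasets)


if __name__ == "__main__":
    # Made-up figures, for illustration only.
    datasets = [
        DerivedDataset("search_index", 2.0, 6.0, 1.0),
        DerivedDataset("ml_features", 2.0, 10.0, 1.5),
    ]
    print(recovery_hours(datasets, derived_backed_up=False))  # raw-only plan
    print(recovery_hours(datasets, derived_backed_up=True))   # transformed data included
```

With these made-up figures, a plan that protects only raw data takes 12 hours to restore service, while a plan that also protects the transformed data takes 1.5 hours.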

AI- and machine learning-trained models

AI techniques such as machine learning, NLP and anomaly detection produce “models” that can be used to drive predictions, classification and categorization. These models are already prevalent and will be used increasingly in new products and services. They require disaster recovery of their own to ensure that AI-powered products and services continue to provide high levels of service.

AI and machine learning training data

A key part of the AI process is feature engineering and extraction from training data. This activity identifies the aspects of the data that can be used to build accurate AI models and then prepares the data for training those models. This intermediate state of the data is key to being able to quickly retrain AI models from raw data.

Search optimized data

Several digital products and services, both end-user and employee facing, are driven by search technologies that enable fast discovery and use of data. Building search-based scenarios requires search indexes to be created and maintained. The time to build search indexes varies, and for large data sets it can be non-trivial. Keeping these search-based scenarios running through a disaster requires that the indexes be stored as part of the disaster recovery plan.


Analytics

Analytics are another key transformed version of raw data, critical to driving functions such as marketing, support and operations and to enabling users of products and services. Analytics that aggregate and statistically analyze data also carry a non-trivial computation cost at scale, so they belong in a comprehensive disaster recovery strategy.

Cloud vs. on-premises disaster recovery

A key question in building a comprehensive disaster recovery plan is whether to implement disaster recovery in the cloud or on premises. Common concerns about data privacy and security remain relevant, and become more critical, when disaster recovery includes storage of higher-value data (such as AI models) that contains enterprise IP and business secrets.

The economies of scale and the dedicated security and operations of large cloud providers can nonetheless offer high levels of security and privacy for enterprises. Enterprises concerned with security and privacy should still look for disaster recovery solutions that provide encryption services so that the data is always secured. Encryption is a good starting point even for data stored on premises, and it is especially important for data stored in the cloud.

Hybrid cloud and API-first strategy

A good approach to cost-effective disaster recovery is the hybrid cloud. In this approach, data is stored both on premises and in the cloud, offering a higher level of data redundancy and diversification of risk. Cloud-based disaster recovery can be more cost effective, secure and hassle free. Public cloud providers such as AWS, GCP and Azure all provide disaster recovery options. More vendors are also providing easier onboarding; NetApp, for example, now provides a path to disaster recovery on AWS and Azure for on-premises data centers at large organizations.
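As a rough illustration of storing the same asset in two places, the sketch below copies an artifact (for example, a trained model file) to several target locations and checks each copy against the original's SHA-256 checksum. It is a minimal sketch, not a vendor's method: the function names are hypothetical, and plain local directories stand in for the on-premises and cloud stores, which in practice would sit behind a storage provider's own client.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(artifact: Path, targets: List[Path]) -> Dict[Path, bool]:
    """Copy the artifact into every target directory and verify each copy.

    Assumption: each target is a local directory standing in for a storage
    location (on premises or cloud). Returns a map of copy path -> verified.
    """
    expected = sha256_of(artifact)
    results: Dict[Path, bool] = {}
    for target_dir in targets:
        target_dir.mkdir(parents=True, exist_ok=True)
        copy_path = target_dir / artifact.name
        shutil.copy2(artifact, copy_path)
        # A copy counts as good only if its checksum matches the original.
        results[copy_path] = sha256_of(copy_path) == expected
    return results
```

Verifying each copy at write time is a design choice: a backup that cannot be shown to match the original gives no assurance at recovery time.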

However, to minimize the time it takes to reintegrate data into applications, it is important that applications be designed API-first, i.e., such that during a disaster recovery scenario the service addresses and data addresses can be updated without any change to the API layer. This leads to minimal service disruption and minimal changes to applications when recovering from outages.
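One way to picture this indirection is a small registry that maps logical names to physical addresses, with the API layer resolving addresses at call time. This is a minimal sketch under stated assumptions: `EndpointRegistry`, `OrdersApi`, the logical name `orders-data` and the example addresses are all hypothetical and not part of any particular product.

```python
from typing import Dict


class EndpointRegistry:
    """Maps logical names to the physical addresses currently in use (hypothetical)."""

    def __init__(self, primary: Dict[str, str], standby: Dict[str, str]) -> None:
        self._primary = dict(primary)
        self._standby = dict(standby)
        self._failed_over = False

    def resolve(self, logical_name: str) -> str:
        table = self._standby if self._failed_over else self._primary
        return table[logical_name]

    def fail_over(self) -> None:
        # During recovery, only the registry changes; callers are untouched.
        self._failed_over = True

    def fail_back(self) -> None:
        self._failed_over = False


class OrdersApi:
    """API layer that knows only logical names, never physical addresses."""

    def __init__(self, registry: EndpointRegistry) -> None:
        self._registry = registry

    def order_url(self, order_id: str) -> str:
        # The address is resolved on every call, so a failover takes effect
        # without redeploying or changing this class.
        base = self._registry.resolve("orders-data")
        return f"{base}/orders/{order_id}"


if __name__ == "__main__":
    registry = EndpointRegistry(
        primary={"orders-data": "https://onprem.example.internal"},
        standby={"orders-data": "https://dr.example.cloud"},
    )
    api = OrdersApi(registry)
    print(api.order_url("42"))  # served from the primary address
    registry.fail_over()
    print(api.order_url("42"))  # same API call, now served from the standby
```

Because applications call `OrdersApi` rather than a fixed address, switching to the recovered copy of the data is a registry update rather than an application change.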

Disaster recovery through redundancy

In a data-driven world, data infrastructure plays a critical role in driving business value, business continuity and user satisfaction. To ensure that data is not lost and that outages are recovered from swiftly and efficiently, enterprises need to invest in compute and data infrastructure that is highly resilient to outages through multiple levels of redundancy. Ensuring that both data and compute are highly available and resilient is key to disaster recovery that is swift, efficient and transparent.