When AI/ML ROI Is Trapped in the System, Address “Flow”

BrandPost By Bart Driscoll
Jan 28, 2020
Analytics | Big Data | Hadoop

Optimize user experience to accelerate innovation

Credit: Dell Technologies

I read an interesting market statistic recently from VentureBeat AI: “87% of data science projects never make it into production”. Gartner later validated this, adding that “80% of AI projects will remain alchemy [projects], run by wizards”. I appreciate that as an industry we are still very early in our maturity with respect to data science/AI/ML lifecycle management, that AI/ML engineering practices are nascent, and that the tooling ecosystem is still emerging and expanding. But I have to wonder how much longer employers will tolerate investing in technology, tools, and people with little to no return. I also wonder how much longer data scientists will tolerate building models that never get used. There is clearly an issue with the way we develop, publish, support, and maintain AI/ML solutions, and that issue is impeding ROI. To accelerate ROI in AI/ML initiatives, I suggest we look at Flow.

Flow is the experience we have while following a process. For this article, let’s look specifically at the documented and undocumented processes that govern and manage how concepts are translated into AI/ML models, and how those models are ultimately transformed into application code, where they create value for the enterprise.

Personas and the Data Analytics Value Cycle

To address Flow, we need to first identify and understand the key personas (or stakeholders) as well as their roles and functions in the development process. In most organizations, there are at least four key personas:

  • Business Product Owner: Identifies the use cases and develops initial hypothesis and acceptance criteria.
  • Data Engineer: Surfaces trusted data consistently and reliably so that data scientists and application developers can build the solution to test the hypothesis in the market.
  • Data Scientist: Designs, builds, and packages inferences and models that enable hypothesis market testing.
  • Application Developer: Embeds and deploys the analytics solution via an application so that end users/customers can interact with the model, the system can generate and collect data for future training, and the initial hypothesis can be validated.

As evidenced by the varied role descriptions, each of these personas is critical to completing one successful iteration of the analytics value cycle. As Rob Small describes in his blog “Accelerating the Analytics Value Cycle to Drive Tangible Business Outcomes”, the analytics value cycle consists of three major technical components: data services, model development, and integrated applications.

Expanding on the scope of the analytics value cycle, I would add a fourth, non-technical component: backlog development and prioritization. This is the process of identifying market opportunities, defining hypotheses, and then prioritizing the hypotheses and the work to be done.

Throughout the data analytics value cycle, the Data Scientist is instrumental, since their role helps:

  • determine the scope and risk of the hypothesis during backlog development
  • define the data requirements for engineers building data services
  • lead the model development step
  • provide knowledge transfer, support, and enablement to development teams that are consuming their model

To unlock ROI in the analytics value cycle, we need to evaluate and analyze the data scientist’s experience as they traverse the process. We need to identify opportunities to optimize or automate processes, to remove or reduce redundancies and wait queues, and to invest in tools and technologies that accelerate cycle time. In short, to deliver ROI in AI/ML initiatives, we must actively and intentionally discover and fill the seams across process and automation that add delay, effort, and waste. In other words, we must fix Flow.

An Example of Flow

The following example may be useful to further explain the concept of Flow and the need for it:

Day 1

Jane has been tasked with building an inference model to identify arrhythmias using EKG data from heart patients participating in a national study. She needs patient records from multiple affiliated hospitals, so she has completed the requisite forms to have a snapshot of the data created and published to her lab. This will allow her to experiment without impacting production systems.

Day 2

Waiting on request to be fulfilled.

Day 3

Jane’s ticket is returned because it is missing the L4 approval in the system, and the service desk closes the ticket. Jane needs to reopen it. It isn’t clear where to add approvals, so Jane includes a comment asking how to link the approval to the request.

Day 4

Bill, the data engineer, called Jane and walked her through the automated approval process and the requisite fields needed to send the request to her manager and project sponsor. A few hours later, Jane had her approvals, and Bill started processing the ticket.

Day 5

Bill put the ticket on hold until the compliance and security approvals and procedures were added to it. Jane got an email regarding the hold and a new ticket in the compliance tracking tool.

Day 6

Waiting on request to be fulfilled.

Day 7

Still waiting on request to be fulfilled.

Day 8

Jane gets the approval code from compliance and a link to the policy and procedural requirements. She adds that data to the ticket and resubmits it to Data Engineering.

Day 9

Waiting on request to be fulfilled.

Day 10

Jane gets multiple emails as her ticket is passed around: to the Backup team to add the snapshot request to the schedule, to the Data Engineering team to write the needed queries and apply the needed encryption to the data, and to the Infrastructure team to provision a target cluster for her data.

Day 11

Jane is excited, expecting to have her tickets completed. Instead, the Infrastructure team puts the ticket on hold because Jane hasn’t submitted a ticket for her sandbox environment: there is no target landing zone, and the Infrastructure team doesn’t have the specifications needed to build the cluster. Jane submits her ticket, copying the request from her last project. It is likely over-spec’d, but it is faster than starting from scratch.

Day 12

Jane gets her environments. She updates the Data Engineering ticket with the environment details.

Day 13

Her snapshots are loaded! Excited to get working, Jane opens the environment to find that none of her tools are there and the connectors to the data haven’t been configured yet. She logs another ticket to get license keys and install binaries for the tools. Within the hour, Jane is feverishly installing and configuring tools and database connections. At the end of the day, she is finally ready to start building a model.


Epilogue

To date, Jane has spent well over 100 hours completing tickets, getting approvals, waiting for data, installing software, and configuring the environment. She is no closer to defining the needed model; she is starting to feel pressure from her business sponsor and product owner; and she is frustrated with the level of service and the speed of getting the tools and data she needs to do her job. Jane is not feeling valued. She does not feel like she is contributing. In short, Jane is not experiencing good Flow. She is not able to focus her efforts on value-added activities (i.e., model development), and she is certainly not experiencing any feeling of accomplishment at work.

Know anyone who has had a similar experience?


Can you imagine what happens to Jane if she accidentally requests the wrong data or submits incorrect configurations for the exploratory lab clusters? 

In the example above, it’s clear that the process is broken. It’s obvious why ROI is trapped in the system. And it’s understandable why Jane is experiencing dissatisfaction at work.

Too often, enterprises look to solve these problems by buying new technology, subscribing to a new cloud service, or automating specific steps in a process, like cluster provisioning. While many of these actions can be components of the solution, none of them creates Flow on its own. And as such, none will unlock ROI.

Creating Flow

To address Flow, we need to first understand the end-to-end process, and then overlay the experience of the personas as they move through that process. By studying that experience, we can best reveal the undocumented steps and the waste in the process. For example, while it may take only 15 minutes of work to provision a compute cluster, the data scientist experiences a 3-day request fulfillment turnaround (1 day to submit, 1 day to process, and 1 day to deliver). Even though the provisioning step has been optimized and automated, the end-user experience is still slow and painful.
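To make that gap concrete, here is a back-of-the-envelope calculation for that single provisioning step (a sketch in Python; the 8-hour working day is my assumption, not a figure from the process):

```python
# Step-level efficiency for the cluster-provisioning example above.
# Assumption (not from the article): an 8-hour working day.
work_minutes = 15                    # hands-on provisioning effort
turnaround_minutes = 3 * 8 * 60      # 3-day fulfillment turnaround

step_efficiency = work_minutes / turnaround_minutes
print(f"Step efficiency: {step_efficiency:.1%}")  # ~1.0%
```

Even a fully automated step can look terrible from the end user’s seat once queue time is counted.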

Central to unlocking Flow in the data analytics value cycle is the use of value stream mapping. Value stream mapping is a tool borrowed from Lean for assessing the process efficiency of a repeatable cycle. Using a value stream map, organizations can gain transparency into their AI/ML value cycles and collect data that reveal how and where to improve Flow. Using Jane’s experience, we can assemble a crude value stream map like the one below.

In this value stream map, we can understand process efficiency by calculating value-added time (hands-on-keyboard or eyes-on-requirements time) and non-value-added time (waiting in queues, loop-backs, handoffs, etc.). Using these two data points, we can then calculate process efficiency:

Process Efficiency = Value-Added Time / (Value-Added Time + Non-Value-Added Time)

In the example below, value-added time is calculated at 4.0 days, whereas non-value-added time is calculated at 6.1 days. Using the formula above, we divide value-added time by total elapsed time (value-added plus non-value-added, or 10.1 days), which yields a process efficiency of roughly 40%. When we convert that ratio into real business impact, we see that it takes Jane and company nearly 2.5 weeks to create 1 week of value.
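As a sketch of how this tabulation might work, the snippet below encodes Jane’s journey as steps tagged value-added or waiting and applies the formula above. The per-step durations are illustrative guesses; only the 4.0-day and 6.1-day totals come from the value stream map.

```python
# A crude value stream map of Jane's standup process, as
# (step, days, value_added) tuples. Per-step durations are
# illustrative; only the 4.0 / 6.1 day totals match the map.
steps = [
    ("Complete data-request forms",         1.0, True),
    ("Wait for ticket processing",          1.0, False),
    ("Loop-back: missing L4 approval",      0.6, False),
    ("Approval walkthrough with Bill",      0.5, True),
    ("Compliance/security hold",            2.0, False),
    ("Add compliance data, resubmit",       0.5, True),
    ("Wait for fulfillment",                1.0, False),
    ("Ticket handoffs across teams",        0.5, False),
    ("Submit sandbox environment ticket",   0.5, True),
    ("Infrastructure hold",                 1.0, False),
    ("Update ticket with environment info", 0.5, True),
    ("Install and configure tools",         1.0, True),
]

value_added = sum(d for _, d, va in steps if va)
waiting     = sum(d for _, d, va in steps if not va)
efficiency  = value_added / (value_added + waiting)
print(f"Value-added: {value_added:.1f} days; waiting: {waiting:.1f} days")
print(f"Process efficiency: {efficiency:.1%}")  # ~39.6%
```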

[Value stream map diagram. Credit: Dell Technologies]

Sample Value Stream Map of Jane’s data and environment standup process.

Value Stream Mapping and Flow

Value stream maps are great tools for improving process transparency because they span organizational barriers and departmental silos and create enterprise-level (macro) context for team members. Departmental processes or task-specific sub-processes that are typically invisible, or black boxes, quickly become visible to all personas and roles along the pathway. This shared transparency enables teams to make informed decisions about where and how to invest in people and process. For example, what if an exploratory lab came in a standard size, with built-in, predefined scaling capabilities? Would you still need the infrastructure workflow outlined above? How could this automation impact process efficiency? A rough sketch of the idea follows.
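To illustrate, here is a hypothetical standard-size lab template in Python. The size names, specs, and the 2-day savings are assumptions for illustration, not an actual offering or a measured figure:

```python
# Hypothetical standard lab sizes: choosing one replaces the
# spec-gathering, ticketing, and hold steps in the workflow above.
LAB_SIZES = {
    "small":  {"vcpus": 8,  "ram_gb": 64,  "storage_tb": 1},
    "medium": {"vcpus": 16, "ram_gb": 128, "storage_tb": 5},
    "large":  {"vcpus": 32, "ram_gb": 256, "storage_tb": 20},
}

def provision_lab(size: str) -> dict:
    """Return the spec a self-service pipeline would apply, with autoscaling on."""
    return {**LAB_SIZES[size], "autoscale": True}

# Rough impact on Jane's value stream: suppose standardization removes
# ~2 days of infrastructure waits and loop-backs (an assumption).
value_added, waiting = 4.0, 6.1 - 2.0
print(provision_lab("medium"))
print(f"New process efficiency: {value_added / (value_added + waiting):.1%}")  # ~49%
```

Note that standardization attacks the waiting, not the working: hands-on time barely changes, but process efficiency jumps by roughly ten points.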


Value stream maps help to highlight the inefficiencies and bottlenecks in the process that impede Flow and trap ROI in the system; they help us define our problem correctly. As Steve Jobs said, “if you define the problem correctly, you almost have the solution.”

To learn how our organization employs Value Stream Mapping, click here. You’ll also find a video, an infographic, and a service brief for download.