Becoming an ML information factory – 6 lessons we can learn from lean manufacturing


According to a recent Forrester Research report, machine learning (ML) is vital to success in today’s market. A whopping 98% of IT leaders believe machine learning operations (MLOps) will give their company a decisive competitive edge. Yet, only 6% of companies feel their MLOps capabilities are mature enough to benefit from the opportunity.

So, what’s going on?

What exactly are ML and MLOps?

To find the answer, let’s start by defining terms. ML is a type of artificial intelligence that enables systems to learn from data without human intervention. Successful businesses are using ML to optimise every aspect of their operations: driving employee productivity, improving customer satisfaction, and increasing revenue.

Here’s the challenge – while the amount of data has increased almost exponentially over the years, the ability to organise and analyse it using ML has lagged significantly. An even bigger challenge involves operationalising ML models into a production setting to make dumb applications a whole lot smarter. The Forrester report found that only 14% of respondents had a repeatable and robust process for operationalising ML models into a production setting.

One approach many organizations are taking is the adoption of machine learning operations (MLOps). MLOps is a set of practices for collaboration and communication between data scientists and operational teams across the complete ML lifecycle. In many ways, MLOps is trying to achieve the same benefits of throughput, efficiency, and quality for ML that DevOps is achieving for agile software development.

Adopting MLOps alone will not solve the problems enterprises face trying to implement ML. It’s an important first step, but more is needed. Organizations that are successful in transforming their ML capabilities have augmented MLOps with key processes, tooling, and continuous improvement practices. Some of these practices may sound familiar, as they come directly from lessons learned in industrial manufacturing.

6 industrialized lessons

For over 50 years, manufacturing companies have implemented Six Sigma and lean manufacturing techniques to solve quality issues. Today, organizations are using some of these same techniques to create value from their data. In essence, they are becoming information factories.

1. Jidoka

It’s difficult to overstate the role automation has played in modern production engineering – transforming product quality, productivity, and throughput. Jidoka is a Japanese term for automation with human intelligence, giving machines and operators the ability to stop work if they detect a problem. The problem can then be corrected immediately, rather than waiting until the end of the production line.

The concept of Jidoka can do the same thing for the analytics production line. Self-service with Jidoka capabilities can provision the infrastructure, tools, and data that each of the personas involved in the ML process needs. This type of automation drives efficiency and helps enforce compliance with standards. The result? No more time wasted waiting for access to a suitable environment or trying to configure a new tool freshly downloaded from the web. Each phase in the ML process can be automatically scheduled, making the entire system predictable and efficient.
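
As a minimal sketch of what Jidoka could look like in an ML pipeline, the Python below lets any stage stop the line as soon as it detects a defect, so bad data never reaches the later stages. The names used here (StopTheLine, check_no_missing, scale_amounts, run_pipeline) and the row layout are hypothetical, chosen only for illustration.

```python
class StopTheLine(Exception):
    """Hypothetical signal: a stage found a defect and halts the pipeline."""


def check_no_missing(rows):
    # Quality gate: refuse to pass rows with missing values downstream.
    for i, row in enumerate(rows):
        if any(value is None for value in row.values()):
            raise StopTheLine(f"row {i} has a missing value")
    return rows


def scale_amounts(rows):
    # Illustrative transformation: convert an amount in cents to units.
    return [{**row, "amount": row["amount"] / 100} for row in rows]


def run_pipeline(rows, stages):
    # Run the stages in order; a StopTheLine raised by any stage
    # prevents every later stage from running.
    for stage in stages:
        rows = stage(rows)
    return rows
```

Calling run_pipeline(rows, [check_no_missing, scale_amounts]) on clean rows returns the scaled rows. If any row has a missing value, the check raises StopTheLine and the scaling stage never runs, which is the point at which an operator would step in and correct the problem.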

2. Tooling

Tooling plays a fundamental role in contemporary production facilities. Used wisely and with the right checks and balances, tooling helps deliver scale. It can reduce the skills required while improving quality, time to value, throughput, and velocity.

Today’s information factory will require a range of tools to suit the role of each persona and to meet the demands of each phase of production. As new, more challenging business problems are tackled, new tooling will be required. This brings us to the next essential element of the information factory: a research and development lab.

3. Research and development (R&D) lab

Until recently, most ML tooling focused almost exclusively on model development, but things are changing. New ML tooling addresses operational processes and the overall model lifecycle management. These new tools can improve the efficacy of ML models as well as support downstream operations, ethics, and model governance.

Using an R&D lab, data scientists can evaluate new tools in a safe and managed environment, document best practices, and evaluate potential benefits. Once blessed for use by the broader team, new tools can be packaged and included in an application catalogue, available in the self-service provisioning process.

4. Kaizen

Kaizen is a Japanese term meaning change for the better or continuous improvement. Seen as more of a philosophy than a work practice, it aims at maximum quality, the elimination of waste, and improved efficiency.

As organizations start to scale out their data science capability and capacity, new needs will emerge. These may include more opportunities to standardize or automate processes.

The integrated nature of the work in the information factory and the teams involved (including DataOps, data science, MLOps, DevOps, operations, and business intelligence) lends itself to Kaizen practices. Each person will have a different perspective on challenges; thus, each should be encouraged to continually evaluate how the information factory process can be improved.

5. Supply chain

Over the years, manufacturers have optimized their supply chain by using a just-in-time (JIT) approach to parts delivery. JIT keeps inventory to a minimum and eliminates time and effort in moving parts to and from stock.

The information factory needs to take care of data in the same fashion. Although most organizations are awash with data in multiple data warehouses, operational data stores, and data lakes, finding and accessing the useful data is often the first challenge. In many cases, the data scientist needs a data engineer to replicate large datasets, as read-write access is required to transform the data and make it suitable for ML model building. This kind of delay is a long way from the ideal JIT position.

Organizations that are winning in the ML race pay attention to the data supply chain with a comprehensive data catalogue and business glossary. They also regularly assess and report on data quality. Most also use read-only snapshots, rather than replicating data. And many are now starting to explore specific ML feature stores, which can greatly accelerate model development by standardising the way data is prepared.
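
Assessing data quality can start small. The sketch below assumes rows arrive as plain Python dictionaries and uses a hypothetical helper named completeness_report to compute the share of non-missing values per column, one simple measure that could feed a regular data quality report.

```python
def completeness_report(rows, columns):
    # Fraction of rows with a non-missing value, per column.
    # Illustrative only: a real report would cover more than completeness.
    total = len(rows)
    report = {}
    for column in columns:
        present = sum(1 for row in rows if row.get(column) is not None)
        report[column] = present / total if total else 0.0
    return report
```

A column that scores well below 1.0 is a candidate for investigation before anyone builds a model on it.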

6. Poka-yoke

Last on my list is poka-yoke, which in Japanese means mistake-proofing. A good example of this is the SIM card in your mobile phone, which manufacturers shape in a certain way to prevent you from inserting it incorrectly.

Poka-yoke helps prevent defects by stopping errors as they occur. This type of mistake-proofing is part of the continuous improvement process described above (Kaizen). Although the idea of poka-yoke may seem a little frivolous, imagine if we embed it into every process we touch. As data scientists conduct more complex tasks using more automated tooling, poka-yoke will become invaluable.
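
In code, poka-yoke often means making a mistake impossible to express rather than catching it later. As an illustrative sketch (the SplitConfig class and its field names are hypothetical), the Python below refuses to construct a train/test split configuration whose fractions do not make sense, so the error surfaces at the point where it is made.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: fields cannot be changed after creation
class SplitConfig:
    """Hypothetical train/test split settings that cannot be invalid."""

    train_fraction: float
    test_fraction: float

    def __post_init__(self):
        # Each fraction must be strictly between 0 and 1.
        for name in ("train_fraction", "test_fraction"):
            value = getattr(self, name)
            if not 0.0 < value < 1.0:
                raise ValueError(f"{name} must be between 0 and 1, got {value}")
        # Together they must account for the whole dataset.
        if not math.isclose(self.train_fraction + self.test_fraction, 1.0):
            raise ValueError("train_fraction and test_fraction must sum to 1")
```

SplitConfig(0.8, 0.2) is accepted, while SplitConfig(0.8, 0.3) raises a ValueError immediately. Like the shaped SIM card, the wrong configuration simply does not fit.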

HPE can help: Providing processes, tools, and talent to succeed with ML

ML and MLOps are critical for business success, yet most organizations are failing to deliver on their ambitions. The first step to solving this challenge is to implement MLOps. Yet, MLOps alone is not enough. By applying these six proven techniques, organizations can create value from their data to become more successful.

For more information, listen to this podcast with analyst Dana Gardner and me: How to Industrialize Data Science to Attain Mastery of Repeatable Intelligence Delivery. You can also download the complete Forrester report, Operationalize Machine Learning, to learn more about how MLOps can help your business succeed. To go further, visit HPE Ezmeral Software, a solution that brings DevOps-like agility to the entire machine learning lifecycle, or take some free MLOps on-demand training.


About Doug Cackett

With more than 25 years of experience in the Information Management and Data Science arena, Doug Cackett has worked with a variety of businesses across Europe, the Middle East, and Africa. His expertise is with businesses that deal with ultra-high-volume data or seek to consolidate complex information delivery capabilities linked to AI/ML solutions that create unprecedented commercial value.

Copyright © 2020 IDG Communications, Inc.