How to build a successful data science training program

Data scientists are in short supply. Some companies are filling the gap by setting up training programs to reskill employees for data science roles.

Technology professionals who know how to help organizations get the most out of their information resources — data scientists, in particular — are in high demand and short supply.

Some enterprises are taking matters into their own hands by creating data science training programs to upskill or cross-train employees to be data scientists.

Data science is still new territory for a lot of companies, and setting up and maintaining such a program can come with challenges. Here are some tips for how to go about reskilling your employees for data science roles successfully.

Create a data science culture

Organizations should embrace the idea that anyone can potentially become a data scientist, and creating a culture that supports that premise is important.

“The critical piece is determining that we need to shift an entire culture toward data rather than having a specific set of people,” says Frank Vanderwall, lead data scientist at Hiebing, a brand development and marketing communications agency.

“A parallel we use would be digital in the ’90s,” Vanderwall says. “At that time, you had a digital team inside of organizations like ours. It was a discipline that a handful of people mastered. Today, our expectation is that every team is digital. That sped up the transition and the fluency throughout the organization rather than just putting the burden on a handful of people.”

Timing the transition for data science will be specific to each organization, Vanderwall says, “and while we aren’t quite at that critical of a juncture yet, that time is coming.”

Part of the cultural shift involves using language that practically anyone can understand. “This seems basic but it is easy to drive past, especially for the coaches who are so deeply immersed in a way of thinking,” Vanderwall says. “We need to start with a shared understanding of language. Data scientists are very comfortable using certain terms, but those can be intimidating to others until we break them down to a common understanding.”

Sometimes it’s a matter of making sure everyone understands the terminology. Other times it means replacing the terminology with more approachable language. “The coaches need to understand the mindset of the people they are coaching every bit as much as the people learning need to understand the new content,” Vanderwall says. “Too often the focus is solely on the student and not the teacher.”

Also important to building a data science culture is the continual passing on of knowledge.

“It is critical our data scientists are empowering others to create value from data on their own, not just doing it for them,” Vanderwall says. “It is also critical that those who are trained are then empowered to train others in a similar way. This approach proliferates knowledge and skills across the organization more quickly and at a higher level of applied acumen. It creates a true data-driven culture.”

Hiebing first built out a marketing science team responsible for understanding the right questions to ask about data and using the best available methods to interpret outputs appropriately.

“The next challenge though has been to extend that to the rest of the organization, because our marketing science team can’t be intimately involved in all the projects we have going on,” Vanderwall says. “We need to train others in the organization so that we can actually apply data-driven insights consistently to all projects.”

The company is building a framework that combines formal external training, such as citizen data science online courses, which tend to be technical in nature, with internal seminars that tend to be more contextual, Vanderwall says. This program is now being rolled out to the organization.

Team up with colleges and universities

Many institutions of higher learning have launched data science programs in recent years, and these resources can be excellent partners in establishing training programs for your organization.

Jabil, a provider of manufacturing solutions, has developed a data science training program in partnership with local universities, says May Yap, senior vice president and CIO. The program strives to upskill business professionals with statistical analysis, computational mathematics, and problem-solving with domain-specific data, she says.

“We feel that, similar to other methodologies such as Lean or Six Sigma, these skills need to be dispersed across all business units and are technical skills that will advance Jabil's goal and vision,” Yap says.

Partnerships with colleges and universities are a key component of the program. But not just any institution makes a suitable partner.

“Choose your university partners wisely, identifying partners that have invested in data science programs and departments already,” Yap says. “Also, choose partners that are open to aligning with the industry in which your company serves.”

Leveraging formal education partners is a good idea for two reasons, Yap says. First, they will not have a bias with regard to technology and will allow the program to evolve over time as analytics applications and tools advance. Second, working with institutions creates a recruiting pool for the organization.

Jabil’s data science program has successfully upskilled about 200 employees globally and has provided the company with predictive insights that deliver savings and greater value to its customers through increased speed of manufacturing, product quality, and innovation.

Among the projects that employees have undertaken are predictive replacement of tooling, reducing scrap and cost of manufacturing, and optimized pricing on mechanical parts.

Focus on continual improvement

The data science training program needs to stress ongoing improvement and development of talent; otherwise, existing data scientists are more likely to leave.

“Be sure to have an intentional process of retaining the best data scientists, who tend to get bored and/or seek alternative challenges elsewhere in order to stay ‘fresh,’” says Anthony Scriffignano, chief data scientist at Dun & Bradstreet, which provides commercial data, analytics, and insights for businesses.

To illustrate the point, Scriffignano relates an experience he had while working at an organization that was going through a massive IT transformation.

“A senior manager raised an issue relating to training a group of workers who traditionally did not need to use any computer skills to do their job,” Scriffignano says. The plan was to prepare and deliver about four hours of training for several thousand people around the world, in multiple languages.

Guaranteeing consistency in the training and delivering it on time in the context of the project timeline was no small task, even though the individual commitment on the part of those who required training was small, Scriffignano says.

As the discussion of cost picked up, the manager who originally raised the need for training began to push back on the need to deliver the training at all. “Exasperated, the person leading the meeting asked why the manager had raised the need for the training and yet was later seeming to back down on the desire to execute,” Scriffignano says.

The manager expressed concern that the workers would now become skilled in a new area, and this increased skill could cause issues with their compensation and contracts. “There was also a frustrated concern about retention: ‘What happens if we train them and then they leave?’ came the objection,” Scriffignano says. “The reply was perfect: ‘What happens if you don’t train them and they stay?’ The point was made loud and clear: Training and continuous improvement are [essential] in the face of technology innovation.”

Leverage real business problems and challenges

Theoretical examples are fine as part of a training program, but students also need to know how to put data science into practice in the “real world.”

Jabil’s program takes a “scenario-based” approach to training, Yap says. “Team up the employees in groups of three or four to work collectively on the problem, as they are educated on the process, techniques, and practices,” she says. “Each problem should have a business sponsor and at least one of the students should have domain expertise within the business area of focus for the business problem.”

Another crucial part of the company’s program is to seed it with technical professionals from the organization who serve as teaching assistants. “The role of this teaching assistant would be very similar to how black belt coaches are seeded within an organization to support Lean or Six Sigma initiatives,” Yap says.

Educate executives

Jabil also provides an Executive Data Science class that runs from half a day to one full day. The class is designed to help executives understand the theory of data science and the common terminology used by data scientists, and to guide them on how to evaluate data science projects so they can effectively advocate for them.

Healthcare company GSK also emphasizes educating executives. “Data science literacy isn’t just for the data scientist,” says Nicky Walker, senior director of the data and analytics accelerator at GSK. “It is just as important to educate leaders and managers on being literate and conversant in data science as it is to build deep expertise in data science disciplines.”

Spreading data science literacy widely builds confidence and trust in the algorithms, so managers can make analytics-based decisions with assurance. “Getting great at telling stories of where the data science delivers value creates the pull for more data science and builds it into the DNA of the company,” Walker says. “This is a critical component of the data science program. We know that our data science experts feel valued and rewarded when the response from their business partners is based on understanding.”

Highlight the fact that data science expertise is iterative

Students and business professionals taking part in data science training should find themselves going back and forth between a number of steps, Yap says. These include problem understanding, data understanding, data preparation, modeling, and results evaluation.

“They should use the process to refine the business problem, generate new ideas, and iterate throughout them to identify insights or other business gaps that may need to be resolved to drive the expected predictive insights,” Yap says.

Some of these gaps will most likely include data variation, data access, data quality, and in some cases missing technology needed to automate analysis or manage large data sets, Yap says. 

“This could be as simple as applying artificial intelligence/machine learning technologies to identify anomalies, or [building] classification models that will be used to improve data and understanding,” she says.

Copyright © 2020 IDG Communications, Inc.
