by Bob Violino

Developing data science skills in-house: Real-world lessons

Oct 02, 2018
Analytics, Data Science, IT Leadership

Organizations are looking inward to fill data science needs, developing the culture, courses and programs necessary to deepen data analytics expertise.

The need for data scientists remains strong among companies in virtually every industry, as they look to launch big data and analytics projects and gain more value and insights from their data resources.

The demand for these professionals continues to exceed the supply by a considerable margin, however, and there are no signs that will change any time soon.

Online jobs site CareerCast cited data scientist as one of the best jobs of 2018, with a projected growth in demand of 19 percent for this year. CareerCast evaluates U.S. Bureau of Labor Statistics data on growth outlook as well as industry hiring trends, trade statistics, university graduate employment data, and the site’s own database of listings to determine what factors are driving hiring needs.

The data science profession has regularly ranked among the top careers in terms of demand, with growth driven by the need among organizations to expertly analyze data and transform it into actionable information.

For many organizations, the struggle to fill these positions is fierce. Hiring experts from outside can certainly be part of the strategy for building a data science team, but given the extreme competition, some organizations are turning to their own ranks to develop the data science talent they need.

Here is a look at how several organizations are encouraging data-driven cultures and developing deeper data analytics expertise in-house.

Fostering a data-driven culture at Carnival

Data science is a strategic priority for the Risk Advisory and Assurance Service Department at Carnival, which operates Carnival Cruise Line. Leadership at the department, which provides internal auditing services, strongly supports professional growth and training in this area, says Daniel Bukowski, the department’s manager.

For example, the department supported Bukowski, who has a background in auditing and accounting but no formal education in IT or technical knowledge, in taking the Udacity Predictive Analytics Nanodegree program. Udacity provides a host of online higher educational programs in areas such as artificial intelligence, data science, programming and development, and autonomous systems.

“Many other auditors in the department see the importance of data analytics and are pursuing leadership-supported training,” including how to use analytics tools from vendors such as Alteryx and Tableau, Bukowski says. The department’s annual retreat in July 2018 included an external training session about data visualization and an in-house training program about audit-related data analytics initiatives, he says.

The audit department’s two data scientists were hired for data scientist roles. However, the department-sponsored training has helped Bukowski and several other auditors become more data-literate and able to apply data analytics concepts to audits and investigations.

“Not all auditors need to become data scientists, but it is critical that they become data-literate,” Bukowski says. He has taken the additional step of enrolling in a master of science data analytics program “because I see how important data will be throughout my career,” he says.

With the additional education in data science/analytics, Bukowski’s role has evolved during the past 12 months from primarily performing individual audits and investigations to providing data analytics support to colleagues performing their own audits and investigations.

Using a variety of analytics tools enables Bukowski to blend and analyze large data sets in ways that his colleagues could not by using only Excel spreadsheets. “This has resulted in findings on multiple audits that are unlikely to have been discovered analyzing smaller data sets in Excel,” he says.

The audit department is initiating consulting projects based on audit findings, to provide additional analytics-driven value to Carnival and its operating companies, Bukowski says.

Engaging engineers in data science roles at SessionM

SessionM, which provides a customer data and engagement platform, is creating a team of dedicated data science engineers (DSEs) to design and write artificial intelligence (AI) software for production. These individuals are knowledgeable about machine learning (ML), statistics, and decision theory, says Amelio Vázquez-Reina, vice president of data science, AI and ML at SessionM.

Their primary responsibility is to automate the generation of insights, forecasts, and recommendations and to build software products that provide the automatic execution of decisions across the company, Vázquez-Reina says.

Beyond the development of formal DSEs, SessionM has several initiatives to help develop data science literacy across the company, Vázquez-Reina says. “We hold regular meetings with other departments where DSEs are asked to explain their data models and solutions to our sales, business analysts, product, and solution architecture teams,” he says.

These meetings have two objectives. One is to educate employees about SessionM’s data science strategy, methods, and best practices. The other is to help everyone at the company understand and evangelize what it calls its AI “value generation chain.” This is a process that includes gathering data from each customer, clearly specifying customer goals, and emphasizing experimentation in software development to maximize outcomes and insights for customers.

In addition, the company offers opportunities for software engineers to contribute to its data science software through its agile development process and data science services for customers. SessionM also hosts meetings and social events centered around AI.

“These meetings begin with a technical presentation of a SessionM DSE describing a problem of interest to the company, and proceed through a mathematical characterization of the problem that is amenable to all [software engineers] in the company, ending with an open-ended discussion around the solution chosen, the implementation, and any trade-offs and alternatives explored along the way,” Vázquez-Reina says.

Developing data science talent at Ogury

Ogury, a company that provides mobile data technology, receives more than a terabyte of data every day, according to Louis-Marie Brierre, the company’s CTO. Getting the resources, skills, and capabilities to deal with such massive amounts of information requires a dedicated and talented data science team. And one of the keys to maintaining such a team is creating an inviting place to work.

“The best way we motivate our team is by giving them room to learn and control their personal growth,” Brierre says. “We empower our data teams to own and manage their projects with full accountability.”

Data scientists at the company work closely with data engineers, as well as development and product teams. “This grants them an understanding of the business, and enables them to understand the roles and impact they have on the company,” Brierre says. “We never limit the computation power of the data they are manipulating and the idea they want to test.”

With the wide diversity of projects and teams, “we like to challenge people and offer them new paths of growth by joining new departments every 12 or 18 months,” Brierre says. “This gives them an opportunity to leave their comfort zone and discover new team partners and projects.”

Ogury also aims to keep its data scientists well trained.

“I’ve witnessed many companies eagerly waiting for the arrival of a new data scientist,” says Christophe Thibault, chief algorithms officer at Ogury. “These companies believe that he or she will come in and save their business and boost all [performance indicators]. Yes, data scientists and analysts are key players in the organization, however they still need to be nurtured and trained like other valued team members.”

In order to build a team of data scientists, companies have to identify great talent, Thibault says. “But it’s also their responsibility to prepare the organization for their arrival and set them up for success,” he says.

Since 2014, Ogury has adopted several practices to lure and keep data scientists. One is to eliminate technical limitations. “All our data scientists have access to a sandbox environment on the cloud, so they can measure and holistically compare their computation power to the data they are manipulating and the idea they want to test,” Thibault says.

Another is to encourage collaboration. Data scientists and data engineers possess different skills and vocabulary, Thibault says. But it’s vital to an organization’s data analytics efforts that they work together. “The free flow of knowledge within an organization is how businesses thrive,” he says.

A third practice is to fit algorithms to the data, not the other way around. “Data scientists work closely with business analysts to understand data,” Thibault says. “Especially at Ogury, where we have unique, granular first-party data, it takes time for data scientists to understand the information. Most importantly, our teams must be sure they use the algorithms that will fit the data perfectly, and not twist the data to fit it into well-known or predetermined algorithms just because it’s the easy way out.”

Aiding advanced degrees at Micron

Ongoing learning and education is a top priority for computer memory technology provider Micron. “Because so much of data science requires in-depth mastery of statistics and machine learning, [the company] supports many people in the pursuit of advanced degrees in this space,” says Micron CIO Trevor Schulze.

In addition, Micron employees find massive open online courses (MOOCs) to be beneficial in shoring up certain skills, Schulze says. The company also enables learning from peers by supporting attendance and presentations at both external and internal conferences.

Micron employs hundreds of people working in data science teams around the world. About half of the data scientists on these teams came from different roles in the company, typically engineering, Schulze says.

“Successful transitions into data science happen when people have strong data fundamentals, inquisitive and exploratory mindsets, and most critically [the] aptitude to master statistics and machine learning methods,” Schulze says. “What these people may lack in formal, advanced educations they tend to make up for in industry and data knowledge.”

The rise of data science has had a big impact on the company. “For more than three decades, robotic and computer automation have served as the critical enabler for developing and producing the next generation’s memory chips,” Schulze says. “However, automation alone can no longer advance the industry forward. True transformation at Micron is now occurring with data science at the core of many manufacturing processes and business decisions.”

Mentoring and making training a top priority at McAfee

Security technology company McAfee has created an Analytic Center of Excellence (ACE) with a framework of value proposition, evangelism (including training and mentoring), models/algorithms, and data management. It’s supported by the CTO as well as key vice presidents, says Celeste Fralick, chief data scientist at the company.

To realize the framework, ACE participants have regularly scheduled “tech talks” about algorithms, and McAfee has created eight Communities of Practice (CoPs), workgroups for areas such as education, human resources, and industry/academia partnerships. It has also formed technical workgroups such as an Analytic Review Board, Adversarial Machine Learning, and an Analytic Portal.

The ACE is global, with more than 150 people from all skill levels in the company. “We are now instituting a mentoring program as well as a short course on data monetization, and probing questions non-data scientists such as managers can ask about algorithms and model development,” Fralick says.

In addition, McAfee sponsors in-depth training, including an introduction to analytics, a course on cleaning and pre-processing data, and an introduction to models and machine learning.

“While we do not teach a ‘tool’ per se, we do teach the general ‘gotchas’ and concepts with the specific subject matter,” Fralick says. “Tools change, [but] the math and the gotchas that can foul an algorithm generally do not.”

These efforts were intentionally started slowly and on a voluntary basis, to gain an initial positive impact in a non-adversarial atmosphere, Fralick says. Demand for classes has skyrocketed, she says.

“We also recommend specific external courses, books, and degree or certificate programs,” Fralick says. “We find that computer scientists are generally not taught the critical elements of data science. And when they are informed, they respond passionately, understanding that data is not just gathered and a quick model applied. Much more goes into developing successful analytics.”

For the trained data scientists, projects are “evolving up the pyramid of knowledge and intelligence, from statistics to machine learning to deep learning to artificial intelligence,” Fralick says. “Human-machine teaming is critical in the journey as machine-driven algorithms augment the human decisions.”

Specific analytic development requirements are being integrated into software product lifecycles. “It is our intent for everyone in the company to have an introductory course in analytics,” Fralick says. “The overall impact of the data science efforts on the organization and our business/data strategy have congealed with higher data volumes, customer expectations, and data fusion — to enable our business to be data- and model-driven.”

Coursework and collaboration at Ibotta

Over the past two years, Ibotta, a developer of mobile shopping apps, has built up an analytics team from within the organization through formal and informal training.

The team has developed six-week courses in SQL, Python and Spark as well as short introductory training sessions on topics such as Tips and Tricks for Effectively Communicating Analytical Results, Pros and Cons of Frequentist vs. Bayesian Statistics, and Building Neural Networks leveraging TensorFlow.

“In addition, we hold bi-weekly brainstorming sessions where team members discuss and ideate on a variety of analytics and data science topics, and how each could be leveraged throughout the company,” says Laura Spencer, vice president of data analytics and science.

The company also heavily focuses on collaborative projects between analysts with different skill sets to encourage sharing of skills, capabilities and constraints.

“For example, we recently conducted a retention deep-dive with experts in marketing analytics, machine learning, and user research to build [recommendations] for the business,” Spencer says. “We also have several initiatives to encourage our employees to continue learning externally and bringing new tools and methodologies back to the team.”

Ibotta hosts and attends various big data and data science meetings near its headquarters. “We also sponsor each data scientist to attend a conference of their choosing each year and in return, the individual provides training of their conference learnings back to the rest of the organization,” Spencer says.

Over the past couple of years Ibotta’s analytics team has grown to about 45 individuals and includes skill sets such as data engineering, statistics, and machine learning.