by Paddy Padmanabhan

Data Scientists: The talent crunch (that isn’t?), FOMO and Spanish silver

May 29, 2015 | 6 mins
Analytics, Big Data, Healthcare Industry

The talent shortage is apparently here and data scientist salaries have gone through the roof. Time for a reversion to the mean?

The advanced analytics space has been going through a severe case of FOMO (Fear of Missing Out) for the past couple of years.

Ever since HBR published an article in 2012 declaring data scientist the sexiest job of the 21st century, and the McKinsey Global Institute (MGI) published a report projecting that by 2018 the United States would face a shortage of up to 190,000 skilled data scientists, everyone has been rushing to collect and hoard data scientists. Smart job applicants started listing “data scientist” as a skill on their resumes (and were rewarded with exciting job offers), regardless of their actual qualifications.

And all this, even before organizations had figured out what problems to solve with these highly qualified individuals, and what other investments needed to be made to make them productive and effective.

Competition for talent is likely to be robust for people with experience in healthcare data and analytics. A recent study mentions that 37% of the respondents indicated a lack of qualified staff as a factor in adoption rates for analytics. Another study highlights some nuances to this talent market:

–Data scientists have a median of only six years of experience, but are highly educated (92 percent have at least a master’s degree, 48 percent have a PhD), overwhelmingly male (89 percent), and a disproportionately large number are foreign-born (36 percent).

–Over one-third are employed on the West Coast (36 percent) and almost half work for firms in the technology and gaming industries (43 percent).

The study indicated median compensation of data scientists can range from $91,000 with one to three years of experience up to $250,000 for managers leading teams of 10 or more.

What is wrong with this picture?

Consider this first:

–There are several key factors impacting efforts to increase adoption rates for analytics, and one of them is the talent shortage.

–The talent pool for data scientists doesn’t scale well, at least in the United States.

–In addition, it appears that younger workers and recent college grads prefer to work in smaller organizations that provide more challenging data analytics problems to solve.

Here is my assessment:

  1. It may well be that the scarce pool of highly talented data scientists is being sucked up by hot Silicon Valley startups funded by deep-pocketed VCs. Since the data scientist pool cannot scale quickly or adequately to meet demand, we can assume that these data scientists are contributing to the development of analytics platforms that will provide scalable options for enterprises and reduce the need for large teams of very expensive data scientists. Indeed, the top-paying job listings at Facebook and LinkedIn are for data scientists, not software engineers.
  2. Many large enterprises are developing creative solutions, such as a “team approach” which tries to fulfill a data scientist’s job requirement with two or three $80,000 individuals instead of one $250,000 rock star.
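
The arithmetic behind the “team approach” can be sketched in a few lines. This is only a rough illustration using the salary figures quoted above; it assumes salary is the only cost and ignores benefits, overhead and the cost of coordinating a larger team. The function and constant names are hypothetical.

```python
# Annual salary figures quoted in the article (USD).
ROCK_STAR_SALARY = 250_000
TEAM_MEMBER_SALARY = 80_000


def team_cost(headcount: int, salary: int = TEAM_MEMBER_SALARY) -> int:
    """Total annual salary for a team of `headcount` people."""
    return headcount * salary


def savings_vs_rock_star(headcount: int) -> int:
    """Salary saved relative to one rock star (negative means the team costs more)."""
    return ROCK_STAR_SALARY - team_cost(headcount)


if __name__ == "__main__":
    for n in (2, 3):
        print(f"{n} people: cost ${team_cost(n):,}, savings ${savings_vs_rock_star(n):,}")
```

On salary alone, a team of two comes in well under the rock star, while a team of three is roughly break-even.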

In my view, some emerging trends are likely to address this problem of talent shortage in the near term:

  1. Outsourcing: There is a growing movement to outsource analytics services to India-based firms that have seized the opportunity to address this talent gap. These firms groom college grads to become data scientists through in-house academies that teach them the basics of statistics, data, and technology. Remember Y2K? We had legions of Indian programmers who stepped in to fill a giant talent shortage on that occasion, and India’s tech sector has stepped up every time since then, be it ERP package implementation skills or contact center management. Why would it be any different with analysts and data scientists?
  2. Automation: We are witnessing the coming age of extreme automation and simplification, and the relentless march of Moore’s Law (popularly paraphrased as computing power doubling roughly every 18 months, with a corresponding drop in prices). We are already in the middle of this transformation in labor-intensive functions such as data center management, where robotic process automation (RPA) tools are quickly replacing human labor.
  3. Alternate education: As with any scarce commodity, the marketplace always steps in to find substitutes. In this case, PhDs in statistics and applied math will soon be supplemented (if not supplanted) by non-technical, self-taught and Coursera-educated individuals who will offer a lower-priced option in the labor market. We’ve seen this happen again and again in the software programming world, so this time will be no different.
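
For a sense of scale on the automation point, the 18-month doubling cited for Moore’s Law compounds quickly. The sketch below simply applies that rule of thumb as stated; it is not a forecast, and the function name and default period are illustrative assumptions.

```python
def growth_factor(years: float, doubling_period_years: float = 1.5) -> float:
    """Multiple by which capacity grows after `years`, assuming it doubles
    once every `doubling_period_years` (1.5 years = 18 months)."""
    return 2 ** (years / doubling_period_years)


if __name__ == "__main__":
    for years in (3, 6):
        print(f"After {years} years: {growth_factor(years):.0f}x")
```

Under that assumption, capacity grows fourfold in three years and sixteenfold in six.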

These forces will cause an inevitable reversion to the mean (to use data scientist-speak). Translation: inflated salaries will drop back to their pre-inflation averages once alternate sources of talent supply are identified and brought on stream.

As I write this, many startups in Silicon Valley and elsewhere are attempting to simplify the entire analytics process by offering a complete solution, including prebuilt machine learning algorithms, in an end-to-end technology stack sitting on a secure and scalable cloud environment.

Big companies are latching on to this as well.

Microsoft Azure’s machine learning platform is a cloud-based predictive analytics engine that comes with detailed tutorials on how to select and deploy models. Amazon’s machine learning as-a-service provides visualization tools and wizards that guide you through the process of creating machine learning models without having to learn complex algorithms and technology.

The final nail in the coffin will be hammered in by enterprises and business leaders themselves, who are going to come to their senses and put a stop to the hiring madness when the returns on these investments don’t add up to the promises.

History teaches us enough lessons. The Spanish conquests of the Americas and the subsequent occupation of South America from the 15th through the 19th centuries were largely about silver, and the hoarding of silver presumably underwrote the many religious and political wars the Spanish kings waged across Europe. The crash of silver’s value contributed in no small measure to the decline of the empire. Data scientists are the silver coin today that funds advanced analytics programs, which are expected to generate competitive advantage for businesses. The question is: how long will that coin hold its value?