by Raman Mehta

Data Scientists are the Perfect Cure for Blue Jean Syndrome

Dec 22, 2014 | 3 mins
Analytics, Big Data, Data Mining

A practitioner CIO's perspective on the emerging role of data scientists and the true potential of big data analytics in solving non-trivial business problems.

I was driving with my daughter who, unlike me, is a great country music aficionado. One very rhythmic song had been stuck in my mind for some time during our rides, but I had absolutely no idea who the singer was or what the lyrics were. I turned to my daughter and asked, “Dear, do you remember the song that had the words ‘Blue Jeans’ in it?” She broke out in hearty laughter and said, “Dad, practically half the country songs have a Blue Jean reference in them. You have to provide me with some actionable, real clue to help me identify the song.”

Isn’t this the current state of our business intelligence efforts? We have all been guilty of stating the obvious. We report factually correct information, but often it is of little or no value in solving a real-world business problem: identifying customers who are likely to churn, making better use of marketing dollars, improving product mixes, or clearing supply chain bottlenecks.

Big data further compounds this problem. The real signals are buried deep inside the noise. Big data is messy and complicated by its very nature, with varying degrees of veracity, ambiguity and conflicting sources of information. Big data needs to be transformed from data to information, to knowledge, to wisdom, to decisions, and finally to repeatable and predictable results for it to be consistently meaningful for the business. It is about time we started distinguishing between quantitative analytics and real-world analytics.

Data Scientist

Enter the data scientist: an ideal profile that is well versed in both software and business skills and has the statistical competence required to analyze and derive value from big data. Data science is a combination of two distinct skill sets. One is the traditional “tech-centric” skill set that helps curate, cleanse, secure and contextualize big data. The other is the “pure analytics” skill set that brings business domain knowledge and a data mining mindset.

We need to combine the two skill sets to find patterns, such as which patients are most likely to be readmitted to the hospital within 30 days of discharge. This requires a tremendous amount of big data analysis. One needs to go through hospital infection rates, patient habits, quality of care at home, a particular hospital’s readmission performance compared to the national average, and many other factors. The team solving this problem needs technical skills in big data, including an understanding of statistical modeling and privacy issues. The team also needs knowledge of the health care domain, with a deep understanding of ICD (International Classification of Diseases) and CMS (Centers for Medicare & Medicaid Services) KPIs.

Data science has many facets, including statistics, visualization, machine learning, big data and data mining. Combined with specific business domain knowledge, these can help solve non-trivial business problems. We need to move beyond simply telling the business which vendors are the top N by margin. We need to tell them which vendors are likely to miss delivery times and disrupt the most profitable product lines, which customers are likely to default on a loan, and which customers are likely to churn to a competitor.
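To make the contrast concrete, here is a minimal Python sketch of the difference between a descriptive "top N by margin" report and a forward-looking disruption ranking. Everything in it is an assumption made for illustration: the vendor names and figures are invented, the `Vendor` fields and function names are hypothetical, and the "risk score" is only a smoothed historical late rate multiplied by the profit that depends on the vendor. A real data science team would replace it with a properly validated predictive model.

```python
from dataclasses import dataclass


@dataclass
class Vendor:
    name: str
    margin: float          # margin on this vendor's goods (hypothetical figure)
    deliveries: int        # number of past deliveries observed
    late: int              # how many of those deliveries were late
    profit_at_risk: float  # profit of product lines that depend on this vendor


def top_n_by_margin(vendors, n):
    """Descriptive report: the N highest-margin vendors."""
    return sorted(vendors, key=lambda v: v.margin, reverse=True)[:n]


def late_rate(vendor):
    """Laplace-smoothed late rate, so a short history is not scored as 0 or 1."""
    return (vendor.late + 1) / (vendor.deliveries + 2)


def top_n_by_disruption_risk(vendors, n):
    """Forward-looking view: expected profit exposed to a late delivery."""
    return sorted(
        vendors,
        key=lambda v: late_rate(v) * v.profit_at_risk,
        reverse=True,
    )[:n]


# Invented sample data, for illustration only.
VENDORS = [
    Vendor("Acme", margin=0.32, deliveries=40, late=2, profit_at_risk=100_000),
    Vendor("Borealis", margin=0.18, deliveries=40, late=14, profit_at_risk=250_000),
    Vendor("Cobalt", margin=0.25, deliveries=10, late=1, profit_at_risk=50_000),
]

if __name__ == "__main__":
    print([v.name for v in top_n_by_margin(VENDORS, 2)])
    print([v.name for v in top_n_by_disruption_risk(VENDORS, 2)])
```

On this toy data the two rankings disagree: the lowest-margin vendor comes out on top of the disruption list because it is late most often and supports the most profit. That is the kind of insight a margin report alone would never surface.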

We need multidisciplinary approaches to managing the analytics life cycle, so that we can test hypotheses and turn them into actionable business insights.

By the way, the song I was referring to is “Whatever She’s Got,” by David Nail. Take a listen — the tune may get in your head but you might also earn the respect of your teenage kids!