What is a data scientist?
Data scientists are analytical data experts who use data science to discover insights from massive amounts of structured and unstructured data to help shape or meet specific business needs and goals. Data scientists are becoming increasingly important in business, as organizations rely more heavily on data analytics to drive decision-making and lean on automation and machine learning as core components of their IT strategies.
Data scientist job description
A data scientist’s main objective is to organize and analyze data, often using software specifically designed for the task. The final results of a data scientist’s analysis must be easy enough for all invested stakeholders to understand — especially those working outside of IT.
A data scientist’s approach to data analysis depends on their industry and the specific needs of the business or department they are working for. Before a data scientist can find meaning in structured or unstructured data, business leaders and department managers must communicate what they’re looking for. As such, a data scientist must have enough business domain expertise to translate company or departmental goals into data-based deliverables such as prediction engines, pattern detection analysis, optimization algorithms, and the like.
For more on data scientist job descriptions from a hiring perspective, see “Data scientist job description: Tips for landing top talent.”
Data scientist vs. data analyst
Data scientists often work with data analysts, but their roles differ considerably. Data scientists are often engaged in long-term research and prediction, while data analysts seek to support business leaders in making tactical decisions through reporting and ad hoc queries aimed at describing the current state of reality for their organizations based on present and historical data.
Thus, the difference between the work of data analysts and that of data scientists often comes down to timescale. A data analyst might help an organization better understand how its customers use its product in the present moment, whereas a data scientist might use insights generated from that data analysis to help design a new product that anticipates future customer needs.
Data scientist salary
Data science is a fast-growing field, with the BLS predicting job growth of 22% from 2020 to 2030. Data scientist is also proving to be a satisfying long-term career path, with Glassdoor’s 50 Best Jobs in America ranking data scientist as the third-best job in the US.
According to Robert Half’s 2021 Technology and IT Salary Guide, salaries for data scientists, which vary with experience, break down by percentile as follows:
- 25th percentile: $109,000
- 50th percentile: $129,000
- 75th percentile: $156,500
- 95th percentile: $185,750
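For readers unfamiliar with percentile figures, the following is a minimal sketch of how such breakpoints can be derived from a list of salaries. It uses the nearest-rank method, which is an assumption; the salary guide does not state its method. The salary values and the function name `percentile_nearest_rank` are invented for illustration and do not come from Robert Half.

```python
import math

def percentile_nearest_rank(values, p):
    """Return the p-th percentile of values using the nearest-rank method."""
    ordered = sorted(values)
    # Rank is the smallest position covering at least p percent of the data.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical salaries, for illustration only.
salaries = [90_000, 100_000, 110_000, 120_000,
            130_000, 140_000, 150_000, 160_000]

for p in (25, 50, 75, 95):
    print(f"{p}th percentile: ${percentile_nearest_rank(salaries, p):,}")
```

A salary at the 75th percentile is at or above roughly three quarters of the salaries in the sample, which is why higher percentiles tend to correspond to more experienced candidates.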
Data scientist responsibilities
A data scientist’s chief responsibility is data analysis, which begins with data collection and ends with business decisions based on analytic results.
The data that data scientists analyze comes from many sources and may be structured, unstructured, or semi-structured. The more high-quality data available to data scientists, the more parameters they can include in a given model, and the more data they will have on hand for training their models.
Structured data is organized, typically by categories that make it easy for computers to sort, read, and organize automatically. This includes data collected by services, products, and electronic devices, but rarely data collected from human input. Website traffic data, sales figures, bank accounts, or GPS coordinates collected by your smartphone — these are structured forms of data.
Unstructured data, the fastest-growing form of data, is more likely to come from human input: customer reviews, emails, videos, social media posts, etc. This data is more difficult to sort through and less efficient to manage with technology, thus requiring a bigger investment to maintain and analyze. Businesses typically rely on keywords to make sense of unstructured data, using searchable terms to pull out the relevant items.
Semi-structured data falls between the two. It doesn’t conform to a data model but does have associated metadata that can be used to group it. Examples include emails, binary executables, zipped files, websites, etc.
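The three categories above can be sketched in a few lines of Python. This is an illustrative example only: the sample records, field names such as `meta` and `type`, and the helper functions are all hypothetical, and real pipelines would typically use dedicated tools rather than plain dictionaries and regular expressions.

```python
import json
import re

# Structured: fixed fields, easy to sort and aggregate (hypothetical rows).
sales = [
    {"region": "west", "units": 120},
    {"region": "east", "units": 95},
    {"region": "west", "units": 80},
]

def units_by_region(rows):
    """Sum units per region; possible because every row has the same fields."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0) + row["units"]
    return totals

# Semi-structured: no fixed data model, but metadata allows grouping.
messages = [
    json.loads('{"meta": {"type": "email"}, "body": "Shipping was slow"}'),
    json.loads('{"meta": {"type": "web"}, "body": "<p>Great price</p>"}'),
    json.loads('{"meta": {"type": "email"}, "body": "Refund please"}'),
]

def group_by_type(items):
    """Group message bodies by the type recorded in their metadata."""
    groups = {}
    for item in items:
        groups.setdefault(item["meta"]["type"], []).append(item["body"])
    return groups

# Unstructured: free text, searched here with a simple keyword match.
reviews = [
    "The battery died after two days.",
    "Love the screen, battery life is fine.",
    "Delivery was late.",
]

def find_by_keyword(texts, keyword):
    """Return the texts that contain keyword as a whole word, ignoring case."""
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    return [text for text in texts if pattern.search(text)]

print(units_by_region(sales))
print(group_by_type(messages))
print(find_by_keyword(reviews, "battery"))
```

The structured rows can be aggregated directly, the semi-structured messages can only be grouped through their metadata, and the unstructured reviews have to be searched by keyword, which mirrors the increasing effort described above.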
Typically, businesses employ data scientists to handle unstructured data and semi-structured data, whereas other IT personnel manage and maintain structured data. Yes, data scientists do deal with lots of structured data, but businesses increasingly seek to leverage unstructured data in service of revenue goals, making approaches to unstructured data key to the data scientist role.
For further insight into the working lives of data scientists, see “What does a data scientist do? 7 of these in-demand professionals offer their insights.”
Data scientist requirements
Each industry has its own data profile for data scientists to analyze. Here are some common forms of analysis data scientists are likely to perform in a variety of industries, according to the BLS.
Business: Data analysis of business data can inform decisions around efficiency, inventory, production errors, customer loyalty, and more.
E-commerce: Now that websites collect more than purchase data, data scientists help e-commerce businesses improve customer service, find trends, and develop services or products.
Finance: Data on accounts, credit and debit transactions, and similar financial data are vital to a functioning business. But for data scientists in the finance industry, security and compliance, including fraud detection, are also major concerns.
Government: Big data helps governments form decisions, support constituents, and monitor overall satisfaction. As in the finance sector, security and compliance are paramount concerns for data scientists.
Science: Thanks to recent IT advances, scientists today can better collect, share, and analyze data from experiments. Data scientists can help with this process.
Social networking: Social networking data can inform targeted advertising, improve customer satisfaction, establish trends in location data, and enhance features and services.
Healthcare: Electronic medical records require a dedication to big data, security, and compliance. Here, data scientists can help improve health services and uncover trends that might go unnoticed otherwise.
Data scientist skills
According to William Chen, Data Science Manager at Quora, the top five skills for data scientists include a mix of hard and soft skills:
- Programming: The “most fundamental of a data scientist’s skill set,” programming improves your statistics skills, helps you “analyze large datasets,” and gives you the ability to create your own tools, Chen says.
- Quantitative analysis: Quantitative analysis improves your ability to run experimental analysis and scale your data strategy, and helps you implement machine learning.
- Product intuition: Understanding products will help you perform quantitative analysis and better predict system behavior, establish metrics, and improve debugging skills.
- Communication: Possibly the most important soft skill across every industry, strong communication will help you “leverage all of the previous skills listed,” says Chen.
- Teamwork: Much like communication, teamwork is vital to a successful data science career. It requires being selfless, embracing feedback, and sharing knowledge with your team, says Chen.
Ronald Van Loon, CEO of Intelligent World, adds business acumen to the list. Van Loon says strong business acumen is the best way to channel the technical skills of a data scientist. It is necessary to discern the problems and potential challenges that need to be solved for an organization to grow.
For a deeper look at what it takes to excel as a data scientist, see “Essential skills and traits of elite data scientists.”
Data scientist education and training
There are plenty of ways to become a data scientist, but the most traditional route is to obtain a bachelor’s degree. Most data scientists hold a master’s degree or higher, according to BLS data, but not every data scientist does, and there are other ways to develop data science skills. Before jumping into a higher-education program, you’ll want to know what industry you’ll be working in to figure out the most important skills, tools, and software.
Because data science requires some business domain expertise, the role varies by industry, and if you’re working in a highly technical industry, you might need further training. For example, if you’re working in healthcare, government, or science, you’ll need a different skillset than if you work in marketing, business, or education.
If you want to develop certain skillsets to meet specific industry needs, there are online classes, boot camps, and professional development courses that can help hone your skills. For those considering grad school, there are a number of high-quality data science master’s programs, including the following:
- Master of Science in Statistics: Data Science at Stanford University
- Master of Information and Data Science: Berkeley School of Information
- Master of Computational Data Science: Carnegie Mellon University
- Master of Science in Data Science: Harvard University John A. Paulson School of Engineering and Applied Sciences
- Master of Science in Data Science: University of Washington
- Master of Science in Data Science: Johns Hopkins University Whiting School of Engineering
- MSc in Analytics: University of Chicago Graham School
Data science certifications
In addition to boot camps and professional development courses, there are plenty of valuable big data certifications and data science certifications that can boost your resume and your salary.
- Certified Analytics Professional (CAP)
- Cloudera Data Platform Generalist Certification
- Data Science Council of America (DASCA) Senior Data Scientist (SDS)
- Data Science Council of America (DASCA) Principal Data Scientist (PDS)
- IBM Data Science Professional Certificate
- Microsoft Certified: Azure AI Fundamentals
- Microsoft Certified: Azure Data Scientist Associate
- Open Certified Data Scientist (Open CDS)
- SAS Certified AI and Machine Learning Professional
- SAS Certified Advanced Analytics Professional using SAS 9
- SAS Certified Data Scientist
- TensorFlow Developer Certificate
Other data science jobs
Data scientist is just one job title in the expanding field of data science, and not every company that makes use of data science is hiring for data scientists per se. Here are some of the most popular job titles related to data science and the average salary for each position, according to data from PayScale:
- Analytics manager – $100,099
- Business intelligence analyst – $70,868
- Data analyst – $62,723
- Data architect – $122,882
- Data engineer – $93,145
- Research analyst – $57,615
- Research scientist – $82,957
- Statistician – $77,545