What is a data scientist?
Data scientists are responsible for discovering insights from massive amounts of structured and unstructured data to help shape or meet specific business needs and goals. The data scientist role is becoming increasingly important as businesses rely more heavily on data analytics to drive decision-making and lean on automation and machine learning as core components of their IT strategies.
The data scientist role
A data scientist’s main objective is to organize and analyze large amounts of data, often using software specifically designed for the task. The final results of a data scientist’s analysis need to be easy enough for all invested stakeholders to understand, especially those working outside of IT.
A data scientist’s approach to data analysis depends on their industry and the specific needs of the business or department they are working for. Before a data scientist can find meaning in structured or unstructured data, business leaders and department managers must communicate what they’re looking for. As such, a data scientist must have enough business domain expertise to translate company or departmental goals into data-based deliverables such as prediction engines, pattern detection analysis, optimization algorithms, and the like.
Data scientist responsibilities
A data scientist’s chief responsibility is data analysis, a process that begins with data collection and ends with business decisions made on the basis of the data scientist’s final data analytics results.
The data that data scientists analyze, often called big data, draws from a number of sources. There are two types of data that fall under the umbrella of big data: structured data and unstructured data. Structured data is organized, typically by categories that make it easy for a computer to sort, read and organize automatically. This includes data collected by services, products and electronic devices, but rarely data collected from human input. Website traffic data, sales figures, bank accounts or GPS coordinates collected by your smartphone — these are structured forms of data.
Unstructured data, the fastest-growing form of big data, is more likely to come from human input, such as customer reviews, emails, videos and social media posts. This data is typically more difficult to sort through and less efficient to manage with technology. Because it isn’t streamlined, unstructured data can require a big investment to manage. Businesses typically rely on keywords to make sense of unstructured data, using searchable terms to pull out the relevant pieces.
Typically, businesses employ data scientists to handle this unstructured data, whereas other IT personnel are responsible for managing and maintaining structured data. Data scientists will certainly deal with plenty of structured data in their careers, but businesses increasingly want to leverage unstructured data in service of their revenue goals, making approaches to unstructured data key to the data scientist role.
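To make the keyword approach to unstructured data more concrete, here is a minimal Python sketch. It is only an illustration: the sample reviews, the keyword list and the function names (tokenize, find_keywords, keyword_counts) are all invented for this example, and real-world text analysis usually involves far more than exact keyword matching.

```python
import re
from collections import Counter

# Hypothetical sample of unstructured text (invented for illustration).
reviews = [
    "Shipping was slow but the support team was helpful.",
    "Great price, fast shipping!",
    "The app crashes whenever I open my order history.",
]

# Hypothetical keywords a business might want to search for.
KEYWORDS = {"shipping", "support", "price", "crashes"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def find_keywords(text, keywords=KEYWORDS):
    """Return the set of keywords that appear in the text."""
    return keywords & set(tokenize(text))


def keyword_counts(documents, keywords=KEYWORDS):
    """Count how many documents mention each keyword."""
    counts = Counter()
    for doc in documents:
        counts.update(find_keywords(doc, keywords))
    return counts


if __name__ == "__main__":
    # Show which keywords were found in each review.
    for review in reviews:
        print(sorted(find_keywords(review)), "<-", review)
    # Show how many reviews mention each keyword.
    print(keyword_counts(reviews))
```

Matching whole words rather than substrings keeps a keyword like "price" from matching "priceless." The resulting counts are a simple form of structure imposed on free text, which is the point of the keyword approach.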
For further insight into the working lives of data scientists, see “What does a data scientist do? 7 of these in-demand professionals offer their insights.”
Data scientist salaries
Data science is a fast-growing and lucrative field, with the BLS predicting jobs in this field will grow 11 percent by 2024. Data scientist is also shaping up to be a satisfying long-term career path. In Glassdoor’s 50 Best Jobs in America report, data scientist ranked as the best job across every industry based on job openings, salary and overall job satisfaction ratings.
According to data from Robert Half's 2018 Technology and IT Salary Guide, salaries for data scientists, based on experience, break down by percentile as follows:
- 25th percentile: $100,000
- 50th percentile: $119,000
- 75th percentile: $142,750
- 95th percentile: $168,000
Data scientist requirements
Each industry has its own big data profile for a data scientist to analyze. Here are some of the more common forms of big data in each industry, as well as the kinds of analysis a data scientist will likely be required to perform, according to the BLS.
- Business: Today, data shapes the business strategy for nearly every company — but businesses need data scientists to make sense of the information. Data analysis of business data can inform decisions around efficiency, inventory, production errors, customer loyalty and more.
- E-commerce: Now that websites collect more than purchase data, data scientists help e-commerce businesses improve customer service, find trends and develop services or products.
- Finance: In the finance industry, data on accounts, credit and debit transactions and similar financial data are vital to a functioning business. But for data scientists in this field, security and compliance, including fraud detection, are also major concerns.
- Government: Big data helps governments form decisions, support constituents and monitor overall satisfaction. As in the finance sector, security and compliance are paramount concerns for data scientists.
- Science: Scientists have always handled data, but now with technology, they can better collect, share and analyze data from experiments. Data scientists can help with this process.
- Social networking: Social networking data helps inform targeted advertising, improve customer satisfaction, establish trends in location data and enhance features and services. Ongoing data analysis of posts, tweets, blogs and other social media can help businesses constantly improve their services.
- Healthcare: Electronic medical records are now the standard for healthcare facilities, which requires a dedication to big data, security and compliance. Here, data scientists can help improve health services and uncover trends that might go unnoticed otherwise.
- Telecommunications: All electronics collect data, and all that data needs to be stored, managed, maintained and analyzed. Data scientists help companies squash bugs, improve products and keep customers happy by delivering the features they want.
- Other: There isn’t an industry that is immune to the big data push, and the BLS notes that you’ll find jobs in other niche areas, like politics, utilities, smart appliances and more.
Data scientist skills
According to William Chen, Data Science Manager at Quora, the top five skills for data scientists include a mix of hard and soft skills:
- Programming: Chen cites this as the “most fundamental of a data scientist’s skill set,” noting it adds value to data science skills. Programming improves your statistics skills, helps you “analyze large datasets” and gives you the ability to create your own tools.
- Quantitative analysis: Quantitative analysis is an important skill for analyzing large datasets. Chen says it will improve your ability to run experimental analysis, scale your data strategy and implement machine learning.
- Product intuition: Understanding products will help you perform quantitative analysis, says Chen. It will also help you predict system behavior, establish metrics and improve debugging skills.
- Communication: Possibly the most important soft skill across every industry, strong communication will help you “leverage all of the previous skills listed,” says Chen.
- Teamwork: Much like communication, teamwork is vital to a successful data science career. It requires being selfless, embracing feedback and sharing your knowledge with your team, says Chen.
For a deeper look at what it takes to excel as a data scientist, see "Essential skills and traits of elite data scientists."
Becoming a data scientist
If you don’t have a background in computer science or data analytics, boot camps, degree programs or certifications can provide the skills necessary to transition to being a data scientist.
You’ll want to figure out if the job openings in your desired industry and field require a higher education degree, or if certifications and boot camps are enough to satisfy a hiring manager. Spend some time researching job openings to find commonalities in your desired position. From there, you can map out a strategy for becoming a data scientist armed with the education, skills and experience to get the job.
Data scientist education and training
There are plenty of ways to become a data scientist, but the most traditional route is obtaining a bachelor’s degree. Most data scientists hold a master’s degree or higher, according to BLS data, but that isn’t the case for every data scientist, and there are other ways you can develop data science skills. Before you jump into a higher-education program, you’ll want to know what industry you’ll be working in to figure out the most important skills, tools and software.
Because data science requires some business domain expertise, the role of a data scientist will vary depending on the industry, and if you’re working in a highly technical industry, you might need further training. For example, if you’re working in healthcare, government or science, you’ll need a different skillset than if you work in marketing, business or education.
If you want to develop certain skillsets to meet specific industry needs, there are online classes, boot camps and professional development courses that can help hone your skills.
Data science certifications
Some popular data science certifications include the following:
- Certified Analytics Professional (CAP) – The CAP Program
- Certified Specialist in Predictive Analytics (CSPA) – The CAS Institute
- Cloudera Certified Professional: CCP Data Engineer – Cloudera
- Data Science Certificate – Harvard Extension School
- DASCA Data Science Credentials – Data Science Council of America
- IAPA Analytics Credentials – IAPA
- SAS Academy for Data Science – SAS Institute
- SAS Certified Big Data Professional/Data Scientist – SAS Institute
- Simplilearn Data Science Certification Training – Simplilearn
- Teradata Aster Analytics Certification – Teradata
Data science degree programs
If you want to go the traditional degree route, there are plenty of master’s programs in data science to choose from. Even without a science-related undergraduate degree, you can still apply for a master’s program in data science, but it might require additional credits, exams or computer science experience.
According to US News and World Report, these are the top graduate degree programs in data science:
- Master of Science in Statistics: Data Science at Stanford University
- Master of Information and Data Science: Berkeley School of Information
- Master of Computational Data Science: Carnegie Mellon University
- Master of Science in Data Science: Harvard University John A. Paulson School of Engineering and Applied Sciences
- Master of Science in Data Science: University of Washington
- Master of Science in Data Science: Johns Hopkins University Whiting School of Engineering
- MSc in Analytics: University of Chicago Graham School
Other data science jobs
Data scientist is just one job title in the expanding field of data science, and not every company that makes use of data science is hiring for data scientists per se. Here are some of the most popular job titles related to data science and the average salary for each position, according to data from PayScale:
- Analytics Manager - $92,249
- Business Intelligence Analyst - $66,003
- Data Analyst - $57,768
- Data Architect - $112,790
- Data Engineer - $90,811
- Research Analyst - $52,970
- Research Scientist - $77,330
- Statistician - $71,374
If you are looking to establish a career in data science, these are some positions you may also want to consider. Data analytics is an expansive field, so you’ll want to figure out what your niche is before you start applying for jobs. Once you know how you want to work with data, it will be easier to narrow down the best job openings to match your skillset.