The Business Problem
The first thing to consider when starting an analytics practice is your core business. Are you in manufacturing or healthcare? Are you a service provider or a pharmaceutical startup? Below are a few examples that show how data analytics could benefit certain industries.
- Manufacturing: If you are in the business of manufacturing, data analytics can help with several problems, including increasing productivity, reducing labor costs, and scheduling preventative maintenance. Predictive analytics is what makes timely preventative maintenance possible: with the ability to gather, store and process data coming from various equipment, software can find correlations and observe trends that predict when equipment might need maintenance. To perform this kind of analytics, a company would likely need to gather and store all available data points from all available equipment.
- Software: Many software companies want to understand how customers use their products. If a company’s product can generate statistics on usage – as most apps do today – then it can collect, store and process this data to gain analytic insights. For example, companies can correlate usage data with social media information to understand if customers engage with people to discuss how much they like or dislike the company’s product. This particular analytic can help companies tailor marketing campaigns based on customer sentiment.
- Start-ups: A start-up in the human resources analytics space is effectively in the analytics business. Or maybe a company is building the next whiz-bang social media platform, which almost necessitates that the company use analytics to help drive content and user behavior on its platform.
At Secureworks, our core business is to provide Managed Security Services. Our primary goal is to provide intelligence-driven security solutions and expertise to our clients. In 2012, we saw a need to form a data analytics team to design and build a new product. Our service delivery platform consumes as many as 240 billion events a day and leverages the intelligence we have gained over 18 years of processing and handling events. As a result, we have a lot of data we can use for research, prototyping and development.
The People Needed
You don’t need a Ph.D. to understand the innermost workings of machine learning. Many free online resources provide opportunities to learn data science, machine learning, analytics and computer science, so advanced degrees are not necessary.
Become familiar with the notion of “T-shaped” skills. The top of the “T” is a series of skill sets for a particular domain. The vertical bar represents depth in a particular skill. In data analytics teams, members need to have a breadth of skill sets, as well as deep knowledge in a particular area. To ensure the team works efficiently, you will need members with different areas of expertise. Forming a team with T-shaped skills helps ensure the team can work together and apart equally well. A good book on understanding “T-shaped” skills is Analyzing the Analyzers.
The most common skills found on data analytics teams are as follows:
- Data manipulation
- Exploratory data analytics
- Mathematics / Statistics
- Business Skills / Domain expertise
- People Skills
It is important to note that some people will have capabilities in more than one area. For example, you might have a statistics whiz on the team who can also write software. Having team members who have business skills and/or domain expertise is vital because any successful project will require working with external teams, stakeholders and personnel in order to solve a problem or meet a data-analytic need.
When forming our team, we sought doctorate-level individuals who were experts in machine learning and statistics. These experts needed only enough programming skills to conduct their experiments. We also hired software programmers. We eventually realized all we needed were a few folks with machine learning knowledge and the ability to produce software quickly.
The Technology Problem
Assuming you already have a big data platform in place, the next area to address is analytic technology. This has to do with the selection of a toolset to conduct data analytics. Many big data platforms come with various analytical and machine learning toolkits that either are built in or are integrated directly into the platform. Once you have your data lake, you need to ask some questions:
- Do you have access to the data you need? If not, think about what is needed to access it, such as access control changes, auditing oversight and possible additional personnel training.
- Is access to the data restricted by contractual obligations that allow only certain people to see customer data?
- What analytics tools do you have today?
People tend to exaggerate the need for things like machine learning and artificial intelligence when it comes to data analytics. Often simple statistics (counting, averages and standard deviations) will provide you the answer you need for a particular problem.
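As a minimal sketch of how far simple statistics can go, the Python example below uses only counting, an average and a standard deviation to flag an unusual value. The data, the `flag_outliers` helper and the threshold of two standard deviations are all hypothetical choices made for illustration; they are not taken from any Secureworks system.

```python
from statistics import mean, stdev

# Hypothetical daily event counts for one device (illustrative data only).
daily_event_counts = [120, 131, 118, 125, 122, 129, 410, 127]


def flag_outliers(values, threshold=2.0):
    """Return values more than `threshold` sample standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation; needs at least two values
    if sigma == 0:
        return []  # all values are identical, so nothing stands out
    return [v for v in values if abs(v - mu) > threshold * sigma]


print("days observed:", len(daily_event_counts))
print("average per day:", mean(daily_event_counts))
print("unusual days:", flag_outliers(daily_event_counts))  # flags 410
```

A threshold like this is a judgment call that depends on the data. The point is only that a few lines of basic statistics can answer a "what looks unusual?" question before any machine learning is involved.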
If you don’t have a big data lake and fancy analytical tools, you should never underestimate Microsoft Excel’s ability to aid in analytical problem solving, even for relatively large data sets. Excel is a powerful tool. If you want to learn how to do machine learning and analytics without needing to know mathematics, read Data Science and Machine Learning Without Mathematics.
In 2012, the landscape of available tools was not as it is today. We used standalone open-source tools to aid in the analytic-heavy aspects of the work and then implemented our algorithms in custom software. Today, many software tools and packages make it easier and faster to perform analytics.
The Process Problem
The final area we need to address is processes. Process is a way of life in IT, but having too much or too little can make or break a project. Below are some process-related items to consider when you are creating a new data analytics team:
- Is there an existing process for working with internal/external stakeholders?
- Do you have a data governance process?
- Is there buy-in from top-level leadership for a new practice?
The analytics team will have to work with internal and external stakeholders to understand the parameters of the problem and to draw on subject matter experts (SMEs). It is better to do some legwork up front, identify the players – both on the analytics team and from other parts of the business – and establish a recurring check-in meeting where data analysts can ask questions, and SMEs and stakeholders can provide answers.
The other process point that bears discussion is data governance. It is important because the results of a data analytics project, especially a successful one, will usually elicit a “show me” attitude from stakeholders. If you have a governance process that defines the documentation and communication policies covering the “who, what, where and why” of how you got your results, it will be easy to share information with stakeholders. Trying to share information without documentation gathered during the project can be hard or impossible.
Creating a successful data analytics practice requires attention to a few key areas. If you ask the right questions up front, you will reduce the pain of establishing your team. The key points are summarized below:
- Know your core business and understand the types of problems an analytics team could solve.
- Know what key skills will be needed for a data analytics team, and know whether or not you already have them on your team.
- Understand that you don't always need fancy algorithms or tools to solve a data analytics problem; simple statistics are often enough.
- Implement processes to help you work or communicate with stakeholders.
- Obtain buy-in from top levels of executive leadership. Without buy-in you are likely to be unsuccessful, as managers and individual contributors at lower levels will not see data analytics as a priority and may refuse to engage or help.
At Secureworks, we used data analytics, big data, machine intelligence and human intelligence to build a machine learning system that integrates with our service delivery platform. The platform can now automatically decide the disposition of a piece of data, without the need for custom rules or code. The machine learning used today better protects our clients and allows our Security Operations Center (SOC) analysts to be more efficient.