The business intelligence (BI) market is fierce and crowded. Historically, the big players, such as Oracle and IBM, engaged in feature wars to try to justify budget-stretching (if not budget-busting) pricing, and relied heavily on high-touch salesmanship. To make matters worse, the vendors expected your IT department to work with the vendors’ own consultants to configure their products and integrate them with each of your systems of record, often at additional cost.
Once a traditional BI system was installed and running, managers had to wait for weekly or monthly line-of-business reports, meaning decisions often took a month, plus another month to implement. Adding a report required a request to a woefully backlogged IT department, and the report could take weeks or months to design and code.
That all changed with the 2004 introduction of self-service BI, exemplified by the five top self-service platforms I cover in a companion comparison: Domo, Power BI, Qlik Sense, QuickSight, and Tableau. The transition to self-service BI was in part fueled by the ability to make business decisions in days rather than months. Of late, cloud computing and high-speed internet access have been key technical drivers of self-service BI.
Of course, traditional BI is still alive and well, although somewhat diminished. Financial reporting in particular requires 100% accuracy and usually allows weeks for producing reports. Reporting turns out to be a separate use case that may not always be well-served by self-service BI products, which emphasize interactive visual discovery, although some of the newer platforms attempt to completely replace traditional BI systems.
Meanwhile, traditional BI platforms are also evolving. Some have added enough self-service, visual discovery, and analytics to satisfy the needs of existing customers.
Criteria for picking a self-service BI platform
Performing your own evaluations when selecting a self-service BI platform is key, since many of the features the vendors tout may not have real benefits for your enterprise. For example, if your company already has a high-performance data lake, you may not want to pay a differential for a BI platform that imports all data into its own store. Similarly, you may prefer to integrate the BI system with a collaboration platform already in place rather than use a dedicated BI collaboration feature, since asking employees to use two collaboration systems is generally a non-starter.
If most of your data is on Azure, you might want to rule out BI systems that run only on Amazon Web Services, and vice versa. If possible, you want the data and the analysis to be collocated for performance reasons.
Vendors tend to cite analyst reports that are most favorable to their product. Don’t trust the vendor’s skimmed abstract or take the diagram they show you at face value: Ask for and read the whole report, which will mention cautions and weaknesses as well as strengths and features. Also take the fact of inclusion in an analyst’s report with a large grain of salt: Most big analysis firms take more interest in paying customers than in vendors that are not their clients, despite the individual analysts’ sincere attempts to be fair and neutral.
Following are seven key areas of concern when evaluating self-service BI platforms.
First, you need to ensure that a BI platform can read all your data sources. Second, you’ll want to know whether the platform has to import data into its own store before processing it, or if it can process data queries on the fly.
If it has to import data, is the analysis speed fast enough to justify the import time? Can the BI system automatically update the data from the original source?
If there is a charge for data storage in the BI system, take your wildest guess about how much data you’ll have in 5 years and triple it. Would the cost to store that amount affect your budget?
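The back-of-the-envelope estimate above can be sketched in a few lines of Python. The starting volume, growth rate, and per-terabyte price here are made-up placeholders, not any vendor's actual pricing; substitute your own figures.

```python
def projected_storage_cost(current_tb, annual_growth, years=5,
                           safety_factor=3, price_per_tb_month=25.0):
    """Estimate stored volume and yearly cost after `years`, with the guess tripled.

    Every input is an illustrative assumption, not real vendor pricing.
    """
    guess_tb = current_tb * (1 + annual_growth) ** years  # your "wildest guess"
    padded_tb = guess_tb * safety_factor                  # then triple it
    yearly_cost = padded_tb * price_per_tb_month * 12
    return padded_tb, yearly_cost


# Hypothetical starting point: 2 TB today, growing 50% per year.
padded_tb, yearly_cost = projected_storage_cost(current_tb=2.0, annual_growth=0.5)
print(f"Plan for about {padded_tb:.1f} TB, roughly ${yearly_cost:,.0f} per year")
```

If the resulting yearly figure would strain your budget, that is a signal to look harder at platforms that query data in place.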
Another key question: Can the BI system run where your data resides? If not, how hard would it be to move your data?
Data is always dirty when you collect it. Fields may be missing from a row, or may contain nonsensical values. Multiple fields within a row may have mutually inconsistent values. Text fields may contain misspellings, spelling variants, or variations in terminology that keep them from being grouped together automatically. Some fields, especially free-form comments, may be very long and of little use.
Furthermore, fields may be categorical (text) and need to be encoded as numbers for analytic purposes, although some BI systems automate this internally. Numeric ranges of fields may differ by orders of magnitude and need to be normalized. Values may need to be inferred from other values; for example, sex may need to be inferred from first names and/or titles for statistical purposes if it is not already present in the source data.
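To make these cleanup steps concrete, here is a minimal sketch using only the Python standard library. The field names and sample rows are invented for illustration, and dropping bad rows is just one possible policy; a BI platform's own preparation tools may handle these cases differently.

```python
# Hypothetical sample data with typical problems.
rows = [
    {"region": "East", "revenue": 1200.0, "headcount": 12},
    {"region": "west", "revenue": None, "headcount": 8},       # missing value
    {"region": "West ", "revenue": 560000.0, "headcount": 40},  # spelling variant
    {"region": "East", "revenue": 43000.0, "headcount": -3},    # nonsensical value
]

def clean(rows):
    """Drop rows with missing or nonsensical values; tidy the text field."""
    cleaned = []
    for row in rows:
        if row["revenue"] is None or row["headcount"] < 0:
            continue
        cleaned.append({**row, "region": row["region"].strip().title()})
    return cleaned

def encode(values):
    """Map each distinct text value to a small integer code."""
    codes = {v: i for i, v in enumerate(sorted(set(values)))}
    return [codes[v] for v in values], codes

def min_max(values):
    """Rescale numbers to the 0..1 range so columns are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

good = clean(rows)
region_codes, mapping = encode([r["region"] for r in good])
revenue_scaled = min_max([r["revenue"] for r in good])
print(good, mapping, revenue_scaled)
```

When you try out a BI system, check how many of these steps it performs for you and how many you must script yourself.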
A BI system may require you to write SQL SELECT statements, or it may perform imports itself. If it requires you to write database queries, does it assist you in picking fields and creating joins?
These and other concerns mean you should try out a BI system on some of your data. Build an extract/transform/load chain while looking at and graphing your data. See how easy or hard it is. Compare that to other BI systems. Don’t underestimate the time you’ll need to spend cleaning up data for analysis — it can easily account for 80% of total analysis time.
You’ll want to analyze cleaned data in several ways. At the simplest level, you’ll plot data in various formats, and perform straightforward statistical analysis on historical data and trends. Beyond that, you’ll want to dig into the data to understand specific features, and build models to test your ideas about causes. Finally, you may want to predict future performance indicators (sales and inventory requirements, for example) based on statistical models and even machine learning models.
One feature war you’ll encounter is in the number of chart types provided. This is often meaningless when a hyped chart type doesn’t apply to your data. On the other hand, some chart types are important: For example, I would be reluctant to use a system without geographic display support, as seeing raw numbers in a table of locations doesn’t have the same visual and intuitive impact as seeing different colors or varying bubble sizes on a map.
Support for analysis is another feature war. Yes, you absolutely should be able to perform simple statistics within the BI platform, at least up to and including regression models. Going much further risks a mismatch between what the platform offers and what its users can actually put to work.
For example, adding machine learning and deep learning support to the options for exploratory BI analysis may be a bridge too far for managers and business analysts. Data scientists are another story, but they typically have dedicated, specialized workspaces for creating ML models and deep neural networks, using workflows that often require a great deal more statistical knowledge and programming skill than the typical BI user possesses.
On the other hand, natural language support and built-in intelligence for analyzing common data patterns make a platform easier for unsophisticated users. Applying machine learning to the user experience is often good, even if asking business analysts to train deep learning models may be fruitless.
Some BI platforms now use in-memory databases and parallelism to accelerate queries. In the future, you may see more highly parallelized GPU-based databases built into BI services — third parties are building these, demonstrating impressive speedups.
You will often need to revise or augment data transformations during analysis, for example by adding columns that reflect differences or ratios between other columns as is often done in financial analysis (e.g. debt/equity). Such revisions can sometimes change the import process from an ETL (extract, transform and load) pipeline to ELT (extract, load, and transform). Some vendors support only one of ETL or ELT, but most BI systems that use ETL have provisions for additional transformations in the analysis step.
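As a rough illustration of adding a derived column during analysis, the sketch below computes a debt/equity ratio for a few invented companies. The company names, field names, and figures are hypothetical; in practice you would do this with the BI platform's calculated-field feature rather than hand-written code.

```python
# Hypothetical rows as they might arrive from an import step.
companies = [
    {"name": "Acme", "debt": 400.0, "equity": 800.0},
    {"name": "Globex", "debt": 900.0, "equity": 300.0},
    {"name": "Initech", "debt": 50.0, "equity": 0.0},  # ratio is undefined here
]

def add_debt_to_equity(rows):
    """Return new rows with a derived debt_to_equity column.

    Rows with zero equity get None rather than raising a division error.
    """
    out = []
    for row in rows:
        ratio = row["debt"] / row["equity"] if row["equity"] else None
        out.append({**row, "debt_to_equity": ratio})
    return out

enriched = add_debt_to_equity(companies)
for row in enriched:
    print(row["name"], row["debt_to_equity"])
```

Because the new column is computed after the data is already loaded, this is the "transform" step moving to the end of the pipeline, which is the ELT pattern described above.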
Ease of learning and use
Despite being aimed at managers and business analysts, self-service BI systems are complicated and have many moving parts. The quality of the user experience and learning materials varies widely among the BI platforms I have tried. Try to involve several potential users at various skill levels in your evaluation to see how they react. Also be sure to test the documentation itself. There’s an enormous difference between the best documentation search, indexing, and organization and the worst. I have at times been reduced to asking a sales engineer to find a tutorial for me after a significant but failed personal effort.
Some BI systems show reasonably informative charts for practically any choice of variables. Other BI systems wait for you to click on exactly the type of chart you think you want to see. If you know what you want and need, either approach will do; if not, it’s better if the system offers help based on the number and kind of variables you’ve chosen.
Often BI systems distinguish between measures, which are always numeric, and dimensions, which can be categorical. Some sets of dimensions, for example City, State, and Country, can be transformed into measures such as latitude and longitude. Sometimes you want to see measures qualified by dimensions, e.g. “show me our profit ratios by product” or “show me our year-over-year sales by store,” and other times you want to see measures qualified by other measures, e.g. “show me profits versus sales for all stores in the Midwest.”
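The two kinds of question can be sketched in plain Python. The store names, regions, and figures below are invented, and `total_by` is a hypothetical helper, not part of any BI product; the point is only to show a measure summed by a dimension, and then two measures paired up for a filtered set of stores.

```python
from collections import defaultdict

# Hypothetical transactions: "store" and "region" are dimensions,
# "sales" and "profit" are measures.
sales = [
    {"store": "Chicago", "region": "Midwest", "sales": 1000.0, "profit": 150.0},
    {"store": "Chicago", "region": "Midwest", "sales": 500.0, "profit": 100.0},
    {"store": "Omaha", "region": "Midwest", "sales": 800.0, "profit": 40.0},
    {"store": "Boston", "region": "Northeast", "sales": 1200.0, "profit": 300.0},
]

def total_by(rows, dimension, measure):
    """Sum a measure for each distinct value of a dimension."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return dict(totals)

# A measure qualified by a dimension: sales by store.
sales_by_store = total_by(sales, "store", "sales")

# A measure versus another measure: profit versus sales for Midwest stores.
midwest = [r for r in sales if r["region"] == "Midwest"]
midwest_sales = total_by(midwest, "store", "sales")
midwest_profit = total_by(midwest, "store", "profit")
profit_vs_sales = {s: (midwest_sales[s], midwest_profit[s]) for s in midwest_sales}

print(sales_by_store)
print(profit_vs_sales)
```

The first result suits a bar chart with one bar per store; the second is a set of (sales, profit) points, which is what a scatter plot would display.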
Once you have seen a meaningful graph, you often want to zoom in on specific features, tune the display, and add annotations. BI systems differ quite widely in this area, so it’s worth doing the exercise.
Exactly what you can share varies from system to system, and by whether you want to share with fully licensed users, read-only registered users, or unregistered users. In some cases, read-only users can sort and sift data from charts you supply; in other cases, they can only see slide shows made from your analyses.
This distinction, together with the pricing for each class of user, often has a large effect on whether you will be able to afford the BI product for your whole company or only for a select audience.
Costs and benefits
By costs I don’t mean only the vendor’s yearly fees, but also the costs to store your data, host the platform on-premises or in the cloud, and train your people. Benefits include reduced labor and time to reach decisions, better decisions, and ultimately improved profits and growth.
For an in-depth comparison of the best self-service business intelligence tools on the market, see “The 5 best self-service BI tools compared,” where I break down the pros, cons and use cases for Domo, Power BI, Qlik Sense, QuickSight and Tableau.