For the past few years, Hadoop has been gaining traction in the enterprise, not only for its capability to store massive amounts of data but also in the hope that it can help companies scale their data warehouses and power business intelligence (BI). But enabling interactive queries on Hadoop using standard BI tools, in which many companies have already invested heavily, has proved a challenge.
Hadoop is hugely flexible and scalable, but it wasn’t built to support the kind of interactive query performance that business users expect from their BI tools. Organizations have relied on data indexing, transformation or data movement methodologies — typically complex and time-intensive — to get around this.
In addition, connecting BI tools to Hadoop data has required custom drivers — some BI tools use SQL (like Tableau) and others leverage MDX (like Excel). To make matters more complicated, most enterprises use a multitude of BI tools: some departments may be using Tableau while others rely on Cognos or MicroStrategy. IT has to support the various departments’ use cases and their tools of preference.
Yesterday, startup AtScale, which specializes in BI on Hadoop using OLAP-like cubes, moved to change all that with its AtScale Hybrid Query Service.
“Whether it’s a SQL query or an MDX query, we’re converting that to the exact same SQL-on-Hadoop query underneath,” says Bruno Aziza, CMO of AtScale.
AtScale’s answer to Hadoop’s interactive query performance problem is to create virtual cubes that essentially turn Hadoop into a high-performance OLAP server: a scale-out architecture with an OLAP interface. It uses what AtScale calls “BI Server Impersonation” to present the virtual cube to the BI tools as a Hive server.
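To make the idea of one underlying query concrete, here is a toy sketch in Python. It is not AtScale's implementation or API: the cube definition, table and column names, and all function names are hypothetical, and real SQL and MDX parsing is skipped entirely. It only illustrates how requests arriving from a SQL-speaking tool and an MDX-speaking tool could be normalized into one logical query and then rendered as the same SQL statement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalQuery:
    """Front-end-neutral request: which measures, grouped by which dimensions."""
    measures: tuple
    dimensions: tuple


# Hypothetical virtual cube: logical names mapped to physical columns.
SALES_CUBE = {
    "table": "sales_fact",
    "dimensions": {"Region": "region", "Year": "order_year"},
    "measures": {"Revenue": ("SUM", "amount")},
}


def from_sql_frontend(select_columns, group_by):
    """Stand-in for an already-parsed SQL request (e.g. from a SQL-based tool)."""
    measures = tuple(c for c in select_columns if c not in group_by)
    return LogicalQuery(measures=measures, dimensions=tuple(group_by))


def from_mdx_frontend(columns_axis, rows_axis):
    """Stand-in for an already-parsed MDX request: measures on columns, members on rows."""
    return LogicalQuery(measures=tuple(columns_axis), dimensions=tuple(rows_axis))


def to_hadoop_sql(query, cube):
    """Render the logical query as a single aggregate SQL statement."""
    dims = [cube["dimensions"][d] for d in query.dimensions]
    aggs = []
    for m in query.measures:
        func, col = cube["measures"][m]
        aggs.append(f"{func}({col}) AS {m.lower()}")
    sql = f"SELECT {', '.join(dims + aggs)} FROM {cube['table']}"
    if dims:
        sql += " GROUP BY " + ", ".join(dims)
    return sql


sql_side = to_hadoop_sql(from_sql_frontend(["Region", "Revenue"], ["Region"]), SALES_CUBE)
mdx_side = to_hadoop_sql(from_mdx_frontend(["Revenue"], ["Region"]), SALES_CUBE)
print(sql_side)
print(sql_side == mdx_side)
```

Both front ends end up at the same `LogicalQuery`, so the generated statement is identical regardless of which tool asked.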
Driverless MDX and SQL support
The new Hybrid Query Service adds the capability to support MDX and SQL natively, without having to download new clients or custom drivers to end-user machines.
The AtScale Hybrid Query Service is part of AtScale Intelligence Platform 4.0, also announced on Tuesday. The 4.0 version adds a host of features to meet enterprise requirements for security and governance.
Josh Klahr, vice president of Product Management at AtScale, says that enterprises seeking to turn Hadoop into a true analytical data warehouse need to overcome difficult governance, performance and security challenges. The latest version of the platform adds True Delegation capabilities to ensure that every query executed on the Hadoop cluster is associated with the end user who generated the query. The platform works with Apache Sentry and Apache Ranger, and fully supports LDAP, Active Directory and Kerberos in an effort to accommodate the most extensive enterprise requirements.
AtScale 4.0 also features application-level role-based access controls that can be automatically synchronized with security groups managed by LDAP or Active Directory. This ensures that only authorized users can view, query and modify AtScale virtual cubes.
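As a rough illustration of group-synchronized, role-based access control, the Python sketch below derives user roles from directory group membership and checks actions against them. Everything in it is an assumption made for illustration: the group names, role names, permission sets and functions are hypothetical, the directory is an in-memory dictionary rather than a real LDAP or Active Directory lookup, and none of it reflects AtScale's actual design.

```python
# Hypothetical mapping from directory security groups to cube-level roles.
GROUP_TO_ROLE = {
    "bi-analysts": "viewer",
    "bi-modelers": "editor",
}

# Hypothetical permissions per role, using the actions named in the article.
ROLE_PERMISSIONS = {
    "viewer": {"view", "query"},
    "editor": {"view", "query", "modify"},
}


def sync_roles(directory_groups):
    """Derive each user's roles from group membership, as a periodic sync might."""
    roles = {}
    for group, members in directory_groups.items():
        role = GROUP_TO_ROLE.get(group)
        if role is None:
            continue  # groups with no mapped role grant nothing
        for user in members:
            roles.setdefault(user, set()).add(role)
    return roles


def is_allowed(user, action, roles):
    """Return True if any of the user's roles permits the action."""
    return any(action in ROLE_PERMISSIONS[r] for r in roles.get(user, ()))


# Stand-in for group data that would come from LDAP or Active Directory.
directory = {"bi-analysts": ["ana"], "bi-modelers": ["raj"], "finance": ["lee"]}
roles = sync_roles(directory)

print(is_allowed("ana", "query", roles))
print(is_allowed("ana", "modify", roles))
print(is_allowed("lee", "view", roles))
```

Because roles are recomputed from group membership, removing a user from a directory group would remove the corresponding access at the next sync without any per-user change on the cube side.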
“We’ve added a lot of functionality around the kind of security our customers expect,” Klahr says.