Anirban Ghoshal
Senior Writer

Databricks’ new data lakehouse aims at media, entertainment sector

Apr 25, 2022

Data lakehouses combine the storage and analytics features of data warehouses and data lakes, with vendors like Databricks offering solutions for specific industries.

Credit: Tableau Software

After launching industry-specific data lakehouses for the retail, financial services and healthcare sectors over the past three months, Databricks is releasing a solution targeting the media and entertainment (M&E) sector.

Now generally available, the M&E data lakehouse comes with industry use-case specific features that the company calls accelerators, including real-time personalization, said Steve Sobel, the company’s global head of communications, in a blog post.

“The idea of these so-called accelerators is to provide pre-built analyses and use-case functionality to ultimately speed deployment and time to value for customers,” said Doug Henschen, principal analyst at Constellation Research.

“You can think of the general-purpose version of the Databricks Lakehouse as giving the organization 80% of what it needs to get to the productive use of its data to drive business insights and data science specific to the business. The idea of the industry-specific version of the Lakehouse is to get customers in specific industries, say, 90% of the way toward productive use of their data,” Henschen said.

The other 10% represents the effort of initial deployment, data loading, configuration and the setup of administrative tasks and analysis that is specific to the customer, Henschen said.

The data lakehouse is a relatively new data architecture concept, first championed by Cloudera, that offers both storage and analytics capabilities as part of the same solution. This contrasts with the data lake, which stores data in its native format, and the data warehouse, which stores structured data, often in SQL format.

Features focus on media and entertainment firms

Some of the focused solutions that form a part of Databricks’ new M&E lakehouse include recommendation engines, a customer lifetime value (CLV) module, a streaming quality of service module, and toxicity detection for gaming.

While recommendation engines help create more personalized experiences for consumers with AI-powered content recommendations that drive engagement and monetization opportunities, the CLV module identifies valuable customers with models that focus on spending patterns in order to help enterprises retain users and make better marketing investments, the company said. Recommendations also include suggestions for product development choices.

“The most effective recommendation engines are very specific to industries and use cases. They require specific data inputs, models, algorithms and they deliver very specific recommendations. To deliver accurate, high-confidence recommendations is no easy task, so accelerators can provide helpful starting points for enterprises,” Henschen said.

The new data lakehouse’s features for streaming quality of service and toxicity detection for gaming are highly use-case-specific. The streaming quality of service module, as the name suggests, analyzes both streaming and batch data to ensure optimum, tailored content is delivered to users, while the gaming-specific service uses natural language processing for real-time detection of toxic language to ensure an optimal gaming experience for users.

Partner solutions to boost functionality, adoption

As with other data lake and data warehouse providers — such as Snowflake, which also has been on an industry-focused solutions release spree — Databricks too wants to offer more functionality to its customers by partnering with other firms, which in turn is expected to boost adoption of its new lakehouse solution.

“Partnerships can be time-saving for customers as long as they introduce time-saving, pre-built integrations between the partner platforms and solutions. It’s typical for such partnerships to start with the most popular solutions in a given industry or with deepening integrations with partners already established within a given industry. The more the number of partnerships, the better it is for the solution provider,” Henschen said.

Some of the partnerships under the M&E lakehouse solution include the company’s strategic ties with AWS, Cognizant, Lovelytics, Labelbox and Fivetran.

While the partnership with AWS is focused on providing more data and analytics capabilities for the M&E sector, the Cognizant partnership is aimed at maintaining video quality for customers.

Cognizant’s solution pairs telemetry data with artificial intelligence and machine learning to identify and remedy video quality issues in real time, such as playback failure, delayed time-to-first-frame or rebuffering, the company said.

The company’s collaboration with Lovelytics is focused on baseball. As part of the solution, baseball team managers can optimize strategy for a game by using predictive analysis via artificial intelligence to forecast performance.

The solution also uses biomechanical indicators to signal and prevent potential player injuries, the company said.

The joint solution with Labelbox is targeted toward media companies and is expected to help firms derive more value out of unstructured data.

Databricks has partnered with Fivetran to offer a data integration service, which it claims can ingest data from over 180 sources, including operational, ad and marketing technology solutions.