Some things are just better together: Batman and Robin, apple pie and ice cream, AI and analytics. Did that last combination surprise you? It shouldn't.

If you think AI and analytics are better apart, you may be missing out on some valuable opportunities. While gathering background information for a short e-book I co-authored with Ellen Friedman, AI and Analytics at Scale: Lessons from Real-World Production Systems, I was shocked to find that some people still think large-scale analytics projects and AI projects should be siloed and segregated. In particular, they think of AI systems as very expensive, very specialized, separate systems that must be completely isolated from analytics systems.

My take is just the opposite. For years we've observed real-world enterprises across many sectors benefit from co-locating analytics – even legacy analytics – with modern AI and machine learning projects. Furthermore, if you can't put analytics and AI together, that's a signal that you don't have a scale-efficient system and that you carry a substantial amount of avoidable technical debt.

Why AI and analytics should be together

Putting AI and analytics together on a shared system brings many advantages, including:

Shared resource optimization: Shared systems are typically more cost effective because they allow higher utilization. Sharing also makes system administration more efficient and provides a uniform security framework, which in turn reduces the burden on IT teams and improves compliance with data security standards.

Data sharing: Shared systems also minimize siloed data. This is important because AI is much more effective when you have training data that shows all sides of an issue.
Second-project advantage: AI and machine learning projects have high potential value, but they can be highly speculative. Leveraging existing data sets and resources means these projects can be tested more quickly, and failed ideas abandoned at lower cost. That makes you more likely to find the big winners, resulting in a substantial second-project advantage.

Improved collaboration: To succeed in practical business settings, AI requires expert domain knowledge, access to the right data, and a way to put the results returned by models into action against valuable business goals. A shared system built on a unifying data infrastructure, together with an existing framework for taking action, gives data engineers, analysts, AI experts, and business leaders an end-to-end pipeline that encourages valuable collaboration and makes it easier to bring AI projects into production.

What you need to run AI and analytics together

The advantages listed above are appealing, but what does it take to support AI and analytics together? You need a comprehensive data strategy, and you need scalable data infrastructure designed specifically to support scale-efficient systems. That data infrastructure should have these characteristics:

• It must be reliable enough to support multiple applications working together, because the cost of any failure is multiplied.
• The same data must be accessible via all of the APIs you normally use.
• Data motion across geo-distributed systems must be handled at a fundamental level, just as data storage normally is.

Lessons from real-world use cases: AI and analytics together

Following are three use cases where combining AI and analytics paid off. Each of these customers used HPE Ezmeral Data Fabric as a software-defined, hardware-agnostic unifying data layer.

• Second-project advantage pays off for a major media company

One of our customers, a large media company, had two problems.
First, they needed to refresh their system for producing business analytics. Second, they needed better viewing-audience predictions to improve their advertising business, and they felt machine learning might be the key.

The easy start was to augment, and ultimately replace, their data warehouse with the ability to work on larger and more granular data. They used the data fabric to do this, improving scalability and substantially reducing the cost of producing business reports. The next step was to use the data from these analytics systems to build initial versions of AI-based audience prediction, again on the data fabric. These initial systems were very basic but still more accurate than the older systems, largely because they could use more data and thus account for more of the factors that affect viewership, such as weather, short- and long-term seasonality, and competing events.

The success of these systems and the rapid ROI convinced management it was worth investing in a data science team to build more sophisticated AI prediction systems. At this point, a second round of coattail effects from having AI and analytics together began: the data science team could field many candidate predictors, ranging from incremental updates to radically new approaches. The result has been substantial further improvement over the first viewership models, and the team is looking into other opportunities to improve the business through AI.

• Lunchroom collaboration made millions for a large retailer

A data engineer and a web product manager walked into a lunchroom one day. That isn't the setup for a Silicon Valley joke; it's how a new product feature got started. The product manager was lamenting that his team could build a price-matching feature if only he could get web crawl data.
But he couldn't possibly get the budget to scrape the web for that data without a solid business case. The data engineer spoke up and described a web crawl that had already been done – and pointed out that the resulting data was already on their shared data infrastructure. As a result, they prototyped the feature that afternoon and deployed it to production not long after. The moral: sharing a single data infrastructure makes collaboration as easy as eating lunch.

• Containerization sped up development for an AI services company

Another customer used containers and shared data infrastructure to simplify delivery of AI systems alongside analytics applications, with several interesting consequences. First, because data could be shared between legacy applications and their containerized replacements, containerization could proceed one application at a time and be substantially less invasive to the code. Second, because containers can be rebuilt in a repeatable way with explicit dependencies, the security team could automatically scan containers for risky dependencies, and the QA team could rebuild all containers in a safe environment, controlling exactly which bits went into production code. The net effect was a higher level of DevOps (or even what is now called MLOps) automation, which means new applications can be deployed much more quickly – a big win in dynamic situations.

Unifying data infrastructure: HPE Ezmeral Data Fabric

The key advantages of having AI and analytics projects together, as demonstrated by these three real-world use cases, depend on data infrastructure specifically engineered for building a scale-efficient system. Download the free e-book in PDF to read over a dozen additional real-world use cases that show the competitive advantage of scale efficiency: AI and Analytics at Scale: Lessons from Real World Production Systems.
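The dependency-scanning idea in the containerization story can be sketched in miniature. This is a hypothetical illustration, not HPE or customer tooling: it assumes each container image carries a pinned, requirements.txt-style manifest of "name==version" lines, and it checks that manifest against a made-up advisory list of known-risky versions.

```python
# Minimal sketch of scanning a container's pinned dependency manifest
# against an advisory list of known-risky versions. The manifest format
# and the advisory entries below are illustrative assumptions, not a
# real vulnerability feed or any specific vendor's scanner.

RISKY = {
    # Hypothetical (package, version) pairs flagged by an advisory feed.
    ("oldcrypto", "0.9.0"),
    ("log4j-stub", "2.14.1"),
}

def parse_manifest(text):
    """Parse pinned 'name==version' lines, skipping comments and blanks."""
    deps = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, version = line.partition("==")
        deps.append((name.strip(), version.strip()))
    return deps

def scan(text):
    """Return the pinned dependencies that appear on the advisory list."""
    return [dep for dep in parse_manifest(text) if dep in RISKY]

manifest = """\
# baked into the image at build time
requests==2.28.0
oldcrypto==0.9.0
numpy==1.24.0
"""

print(scan(manifest))  # [('oldcrypto', '0.9.0')]
```

Because the dependencies are explicit and the image build is repeatable, the same scan run on every rebuild gives the same verdict, which is what lets a security or QA team control exactly which bits reach production.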
____________________________________

About Ted Dunning

Ted Dunning is chief technology officer for Data Fabric at Hewlett Packard Enterprise. He has a Ph.D. in computer science, is an author of over 10 books focused on data sciences, and holds over 25 patents in advanced computing. He plays the mandolin and guitar, both poorly.