Distributed analytics meets distributed data

Dell EMC

By Patricia Florissi, Ph.D.

To illustrate the power of distributed, yet collaborative, in-place analytics at worldwide scale, it helps to begin with an example. I will start with one from the healthcare industry and then dive into a discussion of the World Wide Herd (WWH), a global virtual computing cluster.

Hospitals around the world are moving to value-based healthcare and seeking dramatic reductions in costs. One way to achieve these goals is to make more effective and efficient use of expensive medical diagnostic equipment, such as CT scanners and MRI machines. When a hospital maximizes its utilization of these devices, it increases its ROI and can reduce its costs by avoiding the need to buy additional devices. In principle, it is also contributing to more affordable care.

With a focus on value-based healthcare, Siemens Healthineers, the healthcare business of Siemens AG, is developing a global benchmarking analytics program that will allow its customers to see and compare their device utilization metrics against those of hospitals around the world. The goal is to help hospitals identify opportunities to gain greater value from their investments.

This global benchmarking analytics program will be offered via the Siemens Healthineers Digital Ecosystem, a digital platform for healthcare providers, as well as for providers of solutions and services, aimed at covering the entire spectrum of healthcare. The platform, announced in February 2017, will foster the growth of a digital ecosystem linking healthcare providers and solution providers with one another, as well as bringing together their data, applications and services.

Global benchmarking analytics in the Siemens Healthineers Digital Ecosystem will be powered by the innovative Dell EMC World Wide Herd technologies, enabling the Internet of Medical Things (IoMT) for several healthcare modalities. Dell EMC’s collaboration with Siemens delivers the ability to analyze data at the edge: only the analytics logic itself and aggregated intermediate results traverse geographic boundaries. This facilitates data analysis across multi-cloud environments without violating privacy and other governance, risk and compliance constraints.

How it works

The WWH concept, which was pioneered by Dell EMC, creates a global network of Apache™ Hadoop® instances that function as a single virtual computing cluster. The WWH orchestrates the execution of distributed and parallel computations on a global scale, across clouds, pushing analytics to where the data resides. This approach enables analysis of geographically dispersed data, without requiring the data to be moved to a single location before analysis. Only the privacy-preserving results of the analysis are shared.

Let’s take a closer look at how the WWH enables distributed, yet collaborative, analytics at a global scale. First, WWH distributes computation across a virtual computing cluster and pushes analytics to its virtual computing nodes. In the case of Siemens, each virtual computing node is implemented by a cloud instance that collects and stores data from Siemens’ medical devices in local hospitals and medical centers.

Second, computation takes place, in real time, where the data resides.

Third, only the privacy-preserving results are sent back to the initiating location, where they are aggregated, and a global analysis is performed on these results. In the case of Siemens, each virtual computing node calculates a local histogram and sends it back to the initiating node, which combines all the histograms to provide global benchmarking. A hospital administrator looking at the global histogram can immediately see how that one hospital performs relative to other hospitals around the world.
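
The three steps above can be sketched in a few lines of Python. This is only an illustration of the pattern, not Dell EMC's or Siemens' implementation: the bin edges, the function names and the sample readings are all hypothetical, and the "nodes" are simulated in a single process rather than running as separate Hadoop instances.

```python
from bisect import bisect_right
from typing import Dict, List, Sequence

# Hypothetical bin edges for device utilization, in percent.
# Every node must use the same edges so the counts can be added together.
BIN_EDGES: List[int] = [0, 20, 40, 60, 80, 100]


def local_histogram(utilization_pct: Sequence[float]) -> List[int]:
    """Runs where the data lives: reduce raw readings to per-bin counts."""
    counts = [0] * (len(BIN_EDGES) - 1)
    for value in utilization_pct:
        index = bisect_right(BIN_EDGES, value) - 1
        # Clamp out-of-range values (and exactly 100%) into the edge bins.
        index = max(0, min(index, len(counts) - 1))
        counts[index] += 1
    return counts


def merge_histograms(histograms: Sequence[Sequence[int]]) -> List[int]:
    """Runs at the initiating node: add the local counts bin by bin."""
    return [sum(column) for column in zip(*histograms)]


def run_global_benchmark(node_data: Dict[str, Sequence[float]]) -> List[int]:
    """Simulate the flow: push the analytic out, collect only the counts."""
    # In a real deployment each local_histogram call would execute remotely;
    # the raw readings in node_data would never reach the initiating node.
    local_results = [local_histogram(readings) for readings in node_data.values()]
    return merge_histograms(local_results)


if __name__ == "__main__":
    # Made-up readings for three made-up sites.
    sample = {
        "hospital_a": [35.0, 42.5, 61.0],
        "hospital_b": [78.0, 81.5],
        "hospital_c": [12.0, 55.0, 59.5, 100.0],
    }
    print(run_global_benchmark(sample))
```

Because histogram counts simply add, the initiating node never needs the underlying readings, which is why only the aggregated results have to cross a geographic boundary in this sketch.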

A WWH can have multiple configurations. The virtual computing nodes can be clouds in a multi-cloud environment or Internet of Things (IoT) gateways in a multi-gateway environment, where analytics is pushed directly to the gateways themselves.

In its ability to pair distributed processing and analytics with distributed data, the WWH overcomes several pressing IT issues. It helps organizations address the challenges of:

  • An explosion in the number of connected devices and the volume of IoT data, which outstrips the scalability of centralized approaches that store and analyze data in a single location
  • Bandwidth and cost constraints that make it impractical to move data to central repositories
  • Security concerns for data in transit
  • Regulatory compliance issues that limit the movement of data beyond certain geographic boundaries

The bigger picture

When you study these and other challenges, you see that we are in the middle of a perfect storm that is disrupting the status quo. Increasingly, we need to take the processing power and analytics to the data, rather than vice versa. This is very much the future for many industries as we look to a world that is projected to have 200 billion connected devices by 2031. Data will increasingly be inherently distributed and inherently federated, with limited data movement.

While the example I have used here focuses on a specific use case in the healthcare industry, the WWH concept can be applied across a wide spectrum of industries. In a December blog post, I explored the potential to use a WWH to advance disease discovery and treatment by enabling global-scale collaborative genomic analysis research. And, of course, WWH approaches can and will be used to help companies gain value from data spread across the IoMT and IoT in general.

At the end of the day, rich insights can be obtained when the data being analyzed transcends geographical, political, and organizational boundaries and can be treated as one virtual, cohesive dataset. That’s the World Wide Herd in action.

  • For a closer look at the Siemens Healthineers Digital Ecosystem and its many partners, visit siemens.com/healthineers-digital-ecosystem.
  • For a deep dive into the IoMT, join us at Dell EMC World on May 9 for an interactive panel discussion titled, “Advancing the Promise of the Internet of Medical Things and Connected Health.”
  • To explore Dell EMC solutions for data analytics challenges, visit DellEMC.com/BigData.

Patricia Florissi, Ph.D., is vice president and global CTO for sales and a distinguished engineer for Dell EMC.

Copyright © 2017 IDG Communications, Inc.