Conquering a Universe of Data with High Performance Computing

Durham University relies on an expansive and scalable HPC storage environment for cosmological research.

Dell EMC

Durham University is a globally outstanding center of teaching and research excellence, specializing in the fields of astrophysics and particle physics. The university’s Institute for Computational Cosmology (ICC) is made up of cosmologists working to answer big questions about the universe.

Big universe = big data

As you can imagine, running large-scale simulations of the universe means working with very large data sets. That is why these distinguished researchers rely on high performance computing (HPC) systems hosted by the ICC to calculate and model structures of our universe.

One of the largest HPC memory facilities available to Durham University researchers is the COSMA 7 cluster. Designed specifically for cosmological calculations, this memory-intensive system was spearheaded by the UK’s Distributed Research utilising Advanced Computing (DiRAC) organization.

An ample storage environment for big data

COSMA 7 is made up of 452 compute nodes and offers 220TB of memory. However, a single simulation can produce hundreds of terabytes of data. The data-intensive nature of the workloads running on the cluster requires a purpose-built HPC storage environment to handle ever-increasing amounts of data. To meet this need, the DiRAC resources team at Durham University chose Dell EMC servers, storage and Ready Solutions for HPC Storage.

This Dell EMC HPC storage environment handles the flow of data through the COSMA 7 cluster, from scratch to archival storage.

This combination of storage products and technologies enables the COSMA 7 team to store data efficiently and economically throughout its lifecycle, while providing ample performance for research workloads.

Faster data access, more data storage space

The fast I/O storage area network built with Dell EMC PowerVault storage provides 480TB of capacity with a maximum I/O speed of about 200GB/s. Dr. Alastair Basden, HPC technical manager at Durham University, notes in a Dell EMC case study that the scratch storage provides very fast I/O, so that “researchers can dump their data to disk very quickly and read it from disk very quickly when they need to restart. By reducing the length of time that it takes to write and restart files, we’re improving the efficiency of the cluster. That means we can really concentrate on the core computation rather than waiting for files to be saved.”
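To put those figures in perspective, here is a rough back-of-the-envelope sketch of how long a full checkpoint might take. It is an idealized lower bound, not a measurement: it assumes the entire 220TB of cluster memory is written at once, that the quoted peak of about 200GB/s is sustained for the whole transfer, and that units are decimal (1TB = 1,000GB). The function and variable names are illustrative only.

```python
def min_transfer_seconds(data_tb: float, throughput_gb_per_s: float) -> float:
    """Idealized time to move `data_tb` terabytes at a sustained rate.

    Assumes decimal units (1 TB = 1000 GB) and no protocol or
    file-system overhead, so real transfers will take longer.
    """
    if throughput_gb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return data_tb * 1000.0 / throughput_gb_per_s


# Figures quoted in the article.
MEMORY_TB = 220        # total memory of COSMA 7
PEAK_GB_PER_S = 200    # approximate maximum I/O speed of the scratch storage
SCRATCH_TB = 480       # capacity of the fast I/O storage

# Assumption: one checkpoint dumps all of the cluster's memory.
full_dump_s = min_transfer_seconds(MEMORY_TB, PEAK_GB_PER_S)

# How many such full-memory dumps would fit on scratch at the same time.
dumps_that_fit = SCRATCH_TB // MEMORY_TB

print(f"Full-memory dump: at least {full_dump_s / 60:.1f} minutes")
print(f"Full-memory dumps that fit on scratch: {dumps_that_fit}")
```

Under these assumptions, writing out the whole of memory takes on the order of minutes, and reading it back for a restart costs about the same, which is why shaving time off each dump adds up over a long simulation.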

The Dell EMC storage arrays are also very dense, allowing researchers to store more data in the same footprint as the previous storage environment. According to Dr. Basden, “The old system was about 2.5 racks in size, and we replaced it with something that is about two-thirds of a rack. So, we have achieved significant space savings. More importantly, our power consumption was reduced from over 22kW to only 6kW, leading to massive savings in CO2 production.”
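The savings Dr. Basden describes can be worked through with simple arithmetic. The sketch below uses only the figures in the quote; the annual energy estimate additionally assumes that both systems draw their quoted power continuously all year, which is a simplification introduced here rather than something the case study states. Because the old system drew "over" 22kW, the power results are lower bounds.

```python
def fractional_reduction(before: float, after: float) -> float:
    """Return the fraction by which `after` is smaller than `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 1.0 - after / before


# Figures from the quote.
OLD_RACKS = 2.5
NEW_RACKS = 2.0 / 3.0   # "about two-thirds of a rack"
OLD_POWER_KW = 22.0     # "over 22kW", so treat as a lower bound
NEW_POWER_KW = 6.0

rack_reduction = fractional_reduction(OLD_RACKS, NEW_RACKS)
power_reduction = fractional_reduction(OLD_POWER_KW, NEW_POWER_KW)

# Assumption: constant draw, 24 hours a day, 365 days a year.
HOURS_PER_YEAR = 24 * 365
energy_saved_kwh = (OLD_POWER_KW - NEW_POWER_KW) * HOURS_PER_YEAR

print(f"Rack space reduced by about {rack_reduction:.0%}")
print(f"Power draw reduced by at least {power_reduction:.0%}")
print(f"Energy saved per year: at least {energy_saved_kwh:,.0f} kWh")
```

Both the footprint and the power draw fall by roughly three quarters. Converting the energy figure into CO2 would require a grid emissions factor, which the article does not give.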

The Dell EMC Ready Solutions for HPC Lustre Storage with a ZFS backend for file system management delivers an array of features beneficial for HPC storage. These include snapshots, end-to-end data integrity, performance optimizations, software RAID and more.

Unlocking the secrets of the universe

Unlocking the secrets of an infinite universe requires robust HPC and HPC storage resources to process enormous amounts of data at high speeds.

“About 75 percent of the universe is made up of dark matter that we don’t understand,” Dr. Basden explains. “We know very little about it, aside from a few hypotheses and a few ideas about what it might be. By running these simulations, we are able to find out more about it. And, of course, when we do that, we begin to understand more about what the universe is made from. COSMA 7 opens all sorts of science that can be done, and that can actually have an impact over ideas that we can’t even conceive right now.”


Copyright © 2019 IDG Communications, Inc.