Handling Unprecedented Data Demands

BrandPost By Adnan Khaleel
Nov 18, 2019
Analytics, Big Data, Hadoop

Imperial College London efficiently stores, manages and protects vital research data with a future-proof high performance computing storage and data management solution.

Image credit: Dell EMC

Imperial College London is a one-of-a-kind university, focusing exclusively on four main disciplines: science, engineering, medicine and business. It’s considered one of the world’s leading university centers for high-impact research. Its research community shares ideas, expertise and technology in pursuit of answers to today’s biggest scientific challenges.

Answering the needs of a robust research community

The Research Computing Service (RCS) department of the college is responsible for addressing the IT needs of the research community. The RCS team was recently challenged to find a robust, reliable storage solution that could efficiently manage and protect large volumes of research data from inception to archive.

As part of the new storage initiative, the team wanted to accomplish several goals. First and foremost was to eliminate 30 fragmented islands of storage and replace them with a single system that could be centrally managed and supported. The solution also had to guarantee consistent high performance for an excellent user experience and continue to deliver service for decades, including the ability to scale and integrate with future high performance computing (HPC) systems. The team also needed to ensure responsible data management, compliant with laws and regulations. Finally, RCS preferred a solution that enabled them to charge users by consumption rather than capacity.

A custom, collaborative solution for HPC storage

Because the requirements for guaranteed performance, scalability and integration substantially increased the complexity of the project, the RCS team turned to ArcaStream™ to help them craft a custom HPC storage solution that could seamlessly integrate with existing infrastructure and be ready to support future storage strategies.

ArcaStream helped RCS launch the Research Data Store (RDS), an HPC storage solution that runs ArcaStream’s PixStor™ framework on a Dell EMC Ready Solution for HPC PixStor Storage.

PixStor is high-performance, shared-disk file management software that provides fast, reliable access to data from multiple servers. It can share data using multiple protocols, including NFS, SMB, S3 and REST. It enables seamless storage scaling, advanced search and analytics, tiering, and unified management through a single storage namespace.

The Ready Solution runs PixStor on Dell EMC PowerEdge servers and Dell EMC PowerVault storage to deliver exceptional reliability and performance at a commodity price point. Dell EMC PowerEdge R740 servers run the filesystem and serve it to the HPC clients, as well as to the general research community via an ArcaStream NAS stack.

Excelero® NVMesh® software provides a scalable NVMe tier for extreme metadata performance. Mellanox Spectrum® switches and ConnectX® adapters provide a hybrid network infrastructure that can deliver data via InfiniBand and Ethernet. The 100GbE networking backbone allows the solution to scale easily as the university’s requirements increase.

Five petabytes and growing

The RDS provides Imperial College London’s research community with an additional five petabytes of research storage to serve its existing 2,500 HPC nodes. It can now serve more than 3,000 desktop users simultaneously, with 20GB/s throughput and no loss of interactive user performance.

Replication of one billion files can now be performed overnight, in less than eight hours. And the Ngenea™ disaster recovery platform is set to tier colder data to external storage so the system can grow without expanding its physical footprint.
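To put the replication figure in perspective, the short sketch below works out the minimum average rate it implies. It assumes exactly one billion files and the full eight-hour window, so the result is a lower bound on the sustained rate rather than a measured number; the function name is illustrative only and is not part of PixStor or Ngenea.

```python
def min_replication_rate(file_count: int, window_hours: float) -> float:
    """Return the lowest average files-per-second rate that finishes in the window."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    seconds = window_hours * 3600  # convert the window to seconds
    return file_count / seconds


# Figures quoted in the article: one billion files, under eight hours.
rate = min_replication_rate(1_000_000_000, 8)
print(f"{rate:,.0f} files/s")  # prints "34,722 files/s"
```

Finishing in less than eight hours means the system must average more than roughly 34,700 files replicated every second.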

The system also provides advanced analytics capabilities so the RCS team can make informed decisions on the true cost of data and model expansion scenarios for more efficient control of expenditures and growth strategies. And better security enables the college to meet stringent data regulations.

All of this is available on a single centrally managed platform that reduces complexity, eliminates storage silos and provides greater agility and insight to manage storage capacity and performance.

Peace of mind for future expansion

The Dell EMC Ready Solution for HPC PixStor Storage gives Imperial College London peace of mind that it will continue to be able to support its researchers’ data storage needs. RCS can plan for long-term storage requirements with the confidence of a system that can scale to support year-over-year multi-petabyte data growth.

To learn more