CERN's Search for God (Particles) Drives Massive Storage Needs
Think your storage headaches are big? Try being the guy in charge of storing the 1GB of data per second that comes off CERN's Large Hadron Collider (LHC), every day for a month.
The solution he chose was a 4Gbps Fibre Channel SAN, using a clustered file system. Why a clustered file system? "We didn’t want storage strictly linked with a hardware vendor," he says.
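To put those figures in perspective, here is a back-of-the-envelope sketch in Python. It assumes a 30-day month, decimal units (1GB = 1,000MB) and roughly 400MB per second of usable throughput per 4Gbps Fibre Channel link, a commonly quoted nominal figure. None of these assumptions come from CERN, and the function names are illustrative.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def monthly_volume_pb(rate_gb_per_s: float, days: int = 30) -> float:
    """Total data written over `days` at a sustained rate, in decimal petabytes."""
    total_gb = rate_gb_per_s * SECONDS_PER_DAY * days
    return total_gb / 1_000_000  # 1 PB = 1,000,000 GB


def min_links(rate_gb_per_s: float, link_mb_per_s: float = 400.0) -> int:
    """Fewest links whose combined throughput covers the sustained rate.

    link_mb_per_s is an assumed usable figure for one 4Gbps Fibre Channel link.
    """
    return math.ceil(rate_gb_per_s * 1000 / link_mb_per_s)


if __name__ == "__main__":
    print(f"Monthly volume: {monthly_volume_pb(1.0):.3f} PB")
    print(f"Minimum 4Gbps links: {min_links(1.0)}")
```

Under these assumptions, a month of running produces about 2.6PB, and a single 4Gbps link could not carry the stream on its own.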
For the clustering, the team is using Quantum's StorNext software as its file system. "Performance was our number-one concern," he says. The second concern was flexibility. "This large buffer means a lot of hardware," he says. "StorNext makes our SAN much more flexible. You can work with different hardware technologies, and it's completely vendor independent." Finally, CERN had to ensure scalability. "It's often the case that physics experiments are upgraded and the system must be able to evolve," he says.
File systems such as StorNext, which let users share files across multiple platforms (i.e., they don’t care whether a Windows server or a Solaris server needs to access a particular file), are growing in popularity after being more common in research and university environments, says Noemi Greyzdorf, a research manager who follows storage software for IDC (a sister company of CIO.com’s publisher, CXO Media). Clustered file systems are still an evolving category, she says, but enterprise IT is warming up to them.
"There's been a push toward clustered file systems in the enterprise," she says. "With the growth of unstructured data, there's an increasing need for a centralized way to manage it."
Other options in this category include IBM's General Parallel File System and Symantec's Veritas Storage Foundation Cluster File System. StorNext is known in the category for its performance, its data sharing across operating systems and its ability to move data across tiers, Greyzdorf says.
The data acquisition team for ALICE particularly valued StorNext's affinity feature, Vande Vyvre says, because it lets the team avoid data being written to the same disks at the same time, which would cause performance issues. "This is what we used to keep the data traffic separated in streams, to avoid any slowdowns," he notes.
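The idea of pinning each stream to its own disks can be sketched in a few lines of Python. This is a conceptual illustration only, not StorNext's interface: the stream names, the disk group names and the `assign_affinities` helper are all hypothetical.

```python
def assign_affinities(streams, disk_groups):
    """Pin each stream to its own disk group so no two streams share disks.

    Raises ValueError if there are more streams than disk groups, since
    sharing a group would bring back the contention we want to avoid.
    """
    if len(streams) > len(disk_groups):
        raise ValueError("not enough disk groups to keep streams separate")
    return dict(zip(streams, disk_groups))


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    mapping = assign_affinities(
        ["stream_a", "stream_b"],
        ["group_1", "group_2", "group_3"],
    )
    for stream, group in mapping.items():
        print(f"{stream} -> {group}")
```

Because every stream gets a distinct group, two streams never write to the same disks at once, which is the separation Vande Vyvre describes.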
As for Vande Vyvre's tips to CIOs considering clustered file systems, he says not to underestimate the value of hardware vendor flexibility for drives and arrays. "We are quite happy to have stuck to our wish for a system that is vendor independent. The experiment will have a lifetime of 10 years. We know in principle we can keep the file system during that time," he says.