When technology companies develop new products in the high-performance computing (HPC) space, life science researchers can do things they hadn’t imagined before. And when those researchers make breakthroughs, they push information technology to find new approaches to support the science. It’s a relationship that has driven innovation in the life sciences industry for years.
This mutually beneficial cycle is particularly evident in the work of the Center for Quantitative Life Sciences (CQLS) at Oregon State University. The center supports more than 26 different departments, providing laboratory equipment, computing infrastructure, training, and access to staff members with extensive experience in genomics, bioinformatics, and computational biology.
Christopher Sullivan, the assistant director for biocomputing at CQLS, recently sat down for an interview where he discussed some of the center’s most exciting new projects. Several of those projects use machine learning models to perform advanced analytics running on the center’s HPC clusters.
Tracking Owls by Sound
Working with the U.S. Forest Service, researchers at Oregon State have developed algorithms that can identify different species of owls from sound alone. They set up recording stations in the woods to capture audio files. They then generate spectrograms — visual representations of the audio inputs — which they analyze using a machine learning model they have trained to recognize distinct species.
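To make the spectrogram step concrete, here is a minimal sketch of how a recording can be turned into a time-frequency grid. It is not the CQLS pipeline: the function names, frame length, hop size, and the synthetic 500 Hz test tone are all assumptions chosen for illustration, and it uses only the Python standard library so that it runs anywhere. A production system would more likely use an FFT library such as NumPy or SciPy.

```python
import cmath
import math


def hann(n):
    """Hann window of length n, which reduces spectral leakage at frame edges."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def spectrogram(samples, frame_len=256, hop=128):
    """Split the signal into overlapping frames and return, for each frame,
    the magnitudes of frequency bins 0..frame_len//2 (a naive DFT)."""
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1
    # Precompute the complex exponentials used by the DFT.
    twiddle = [cmath.exp(-2j * math.pi * k / frame_len) for k in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        chunk = [samples[start + i] * window[i] for i in range(frame_len)]
        mags = []
        for k in range(n_bins):
            acc = sum(chunk[n] * twiddle[(k * n) % frame_len] for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


def peak_frequency(frame, sample_rate, frame_len):
    """Frequency in Hz of the strongest bin in one spectrogram frame."""
    k = max(range(len(frame)), key=frame.__getitem__)
    return k * sample_rate / frame_len


# Hypothetical input: a quarter second of a pure 500 Hz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
FRAME_LEN = 256
samples = [math.sin(2 * math.pi * 500.0 * t / SAMPLE_RATE) for t in range(2000)]

spec = spectrogram(samples, frame_len=FRAME_LEN, hop=128)
print(len(spec), "frames x", len(spec[0]), "frequency bins")
print("strongest frequency in first frame:", peak_frequency(spec[0], SAMPLE_RATE, FRAME_LEN), "Hz")
```

Each row of `spec` is one time slice and each column is one frequency band, so the result can be rendered as an image and passed to an image classifier, which is the general idea behind analyzing spectrograms with a machine learning model.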
In the beginning, the models could only identify a handful of species. But the team saved its audio recordings over the years and now holds about 5 PB captured since the 1990s. In the intervening years, the scientists developed more models, so that today, the project can identify more than 50 different species. At the same time, the technology has advanced, allowing the scientists to reprocess the old recordings more quickly than in the past.
“You give me a new technology, I’m going to push as much data out of that technology as I possibly can,” Sullivan said. “It’s about building the tools that we need alongside the data as it comes at us. And as those tools change, we’re able to go back to the data and redo it.”
This scientific and technological breakthrough is having a meaningful impact on both the economy and the environment. “It’s helping the public monitor the owl populations and all the other animal populations in the forest so that groups can farm the forests properly for wood and different things without harming species,” Sullivan explained.
Monitoring Covid in Wastewater
A lot of the work at CQLS involves genetic sequencing. For example, the center’s researchers regularly analyze the genetic material in wastewater to determine which Covid variants are most prevalent at any given time.
This is an area where a scientific breakthrough drove advances in computing. Sullivan explained that a new genome sequencing innovation enabled the CQLS to scale up so that instead of doing 2,000 sequences per run, they were able to conduct 2 million sequences per run. And it happened “literally overnight.”
But the computing infrastructure wasn’t designed for that kind of throughput. “So we actually had to go back and develop a whole new stack because the technology changed,” Sullivan said. “We’re constantly doing that at all times.”
Identifying Plankton by Laser Light
Another CQLS project is using genetic sequencing to monitor the health of the ocean. Oregon State’s Hatfield Marine Science Center is working with the National Oceanic and Atmospheric Administration (NOAA) on a project that analyzes the plankton in seawater. They have a ship that drags a laser device behind it to capture images of the ocean. As the lasers pass over the water, the plankton and other organisms in the water cast a shadow. The team records video of those shadows, which they then analyze with HPC systems powered by GPUs to classify the contents.
That effort generates a tremendous amount of data — on the order of 100 TB per week. That was more than the CQLS could affordably store and process in the cloud. However, the center’s HPC environment, built with Dell infrastructure featuring NVIDIA GPUs, provided the right balance of performance and cost.
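For a sense of scale, the 100 TB per week figure can be converted into a sustained data rate with some quick arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes) and a perfectly even data flow across the week; neither assumption comes from the article, and real collection is likely to be burstier.

```python
# Back-of-the-envelope conversion of 100 TB/week into a sustained rate.
# Assumptions (not from the article): decimal units and a constant flow.
TB = 10**12
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800 seconds

weekly_bytes = 100 * TB
rate_mb = weekly_bytes / SECONDS_PER_WEEK / 10**6   # megabytes per second
rate_gbit = weekly_bytes * 8 / SECONDS_PER_WEEK / 10**9  # gigabits per second
yearly_pb = weekly_bytes * 52 / 10**15              # petabytes per year

print(f"sustained rate: {rate_mb:.1f} MB/s ({rate_gbit:.2f} Gbit/s)")
print(f"annual volume:  {yearly_pb:.1f} PB")
```

Under these assumptions the project would need to ingest roughly 165 MB every second, around the clock, and would accumulate about 5.2 PB in a year.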
Projects like these are helping Oregon State learn more about the world in which we live and develop innovative new approaches to computing at the same time. Their efforts are improving life and health while also pushing the boundaries of what’s possible with high-performance computing.
For more information, read the OSU customer story here.