by Thor Olavsrud

How Big Data Is Helping to Save the Planet

Sep 15, 2014 | 5 mins
Analytics, Big Data, Data Mining

With the help of HP's Vertica Analytics Platform, Conservation International is crunching data from images to monitor the biodiversity of tropical forests around the world.


A little more than a year ago, Conservation International (CI) was wrestling with a mammoth big data problem.

The nonprofit environmental organization’s mission is to protect nature and its biodiversity, but monitoring and analyzing the health of species — particularly in the tropical forests that half of all plants and animals on earth call home — was a manual and labor-intensive process.

At 16 sites across four continents, CI had established a network of 1,000 camera traps: cameras equipped with motion sensors that trigger when animals pass through their field of view. Set up over 2,000 square kilometers at each site, the cameras capture images of passing fauna in an effort to synthesize and understand the effects of climate change and land-use change on tropical terrestrial mammal and bird diversity.

Checking the Planet’s Vital Signs

“You and I go to the doctor and we get our vital signs checked — our temperature, our blood pressure,” says Sandy Andelman, chief scientist at CI. “Well, we need those sorts of vital signs for the planet, and that’s what we’re really trying to do at Conservation International with TEAM [Tropical Ecology, Assessment and Monitoring Network] and with other programs.”

“What we do is we put camera traps, so it’s kind of like Candid Camera,” she adds. “We put these traps all through the forest, and it allows us to find out what’s there and what the animals are doing.”

Because the camera traps are located in some of the more remote locations on earth, there’s no infrastructure. Teams have to manually collect the data from the traps and upload it, at which point CI scientists run a series of scripts and models to identify the various species appearing in the images. They then blend that data with climate measurements (precipitation, temperature, humidity, solar radiation, etc.), data on trees (growth, survival, deforestation, etc.) and land use data from public sources to create a model of the health of the animal populations at the sites and how they are changing over time.

“Everything is connected in the world,” says Jorge Ahumada, acting executive director of the Tropical Ecology, Assessment and Monitoring (TEAM) Network at CI. “Nature doesn’t live in countries; nature lives as a unity. If we want to preserve this world, and we want to do it in a smart way, we need to be able to assess how we’re doing and react quickly.”

But this process wasn’t quick. About a year ago, TEAM was collecting about a million images every year; the number is now more than two million. The team crunched the data on its own computers at the CI office and distributed much of the work manually. Analyzing the data could take weeks, months or more.

“If we wanted to run an iteration of one of our indices, it would take several weeks,” Ahumada says. “We knew what we had to do in terms of the code and data science of the problem, but we didn’t have the scale to implement it quickly.”

And as Ahumada notes, time is of the essence. The tropical forests CI monitors are believed to be home to half of all plants and animals on earth and to generate 40 percent of the planet’s oxygen. But 4.6 million hectares of tropical forest — about 18,000 square miles — are disappearing every year, according to the United Nations Environment Programme.

HP and Big Data Analytics Answer Call of the Wild

Enter Hewlett-Packard. In December of 2013, HP joined forces with CI to create HP Earth Insights, a program through which HP outfitted CI with its HP Vertica Analytics Platform (a cornerstone of its HAVEn Big Data solution).

While trap data still has to be collected manually, HP has helped CI make the data analysis nine times faster and more accurate to boot. HP Enterprise Services software engineers built the Wildlife Picture Index (WPI) Analytics System, a project dashboard and analytics tool that presents data-driven insights in a user-friendly form in near real time.

The WPI Analytics System uses Vertica and R analytical software to estimate species occupancy at a given site using species presence-absence data (by processing raw camera trap data) and “covariates” (climate, forest edge and human presence data).
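To give a rough sense of what estimating occupancy from presence-absence data involves, here is a minimal sketch in Python. It is not the WPI Analytics System's code, which the article says runs on Vertica and R. It assumes a basic single-season occupancy model with one constant occupancy probability (`psi`) and one constant detection probability (`p`), fitted by a simple grid search, and it leaves out the covariates entirely. The function names and the sample detection histories are hypothetical.

```python
import math

def naive_occupancy(histories):
    """Share of camera points where the species was detected at least once."""
    return sum(1 for h in histories if any(h)) / len(histories)

def log_likelihood(psi, p, histories):
    """Log-likelihood of a constant-psi, constant-p occupancy model.

    Each history is a list of 0/1 values, one per survey occasion
    at a single camera point (1 = species photographed).
    """
    total = 0.0
    for h in histories:
        k = len(h)   # number of survey occasions
        d = sum(h)   # number of occasions with a detection
        # Point is occupied and produced exactly this detection pattern.
        prob = psi * p**d * (1 - p)**(k - d)
        if d == 0:
            # No detections: the point may also simply be unoccupied.
            prob += 1 - psi
        total += math.log(prob)
    return total

def fit_occupancy(histories, steps=99):
    """Grid search for the (psi, p) pair that maximizes the likelihood."""
    grid = [i / (steps + 1) for i in range(1, steps + 1)]  # 0.01 .. 0.99
    return max(
        ((psi, p) for psi in grid for p in grid),
        key=lambda pair: log_likelihood(pair[0], pair[1], histories),
    )

# Hypothetical data: five camera points, four survey occasions each.
histories = [
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]

psi_hat, p_hat = fit_occupancy(histories)
print("naive occupancy:", naive_occupancy(histories))
print("estimated occupancy:", psi_hat, "detection probability:", p_hat)
```

The fitted occupancy is higher than the naive share of points with a detection because the model allows that an animal can be present at a camera point and still never be photographed. A production system would add the covariates the article mentions (climate, forest edge and human presence) and would rely on an established statistical package rather than a grid search.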

“Really, in the environmental sciences, we have not harnessed the power of big data,” Ahumada says. “We need a system that can seamlessly integrate this into information that is useful. That’s where our partnership with HP became really important.”

Data Can Change the World

“We’ve learned a lot of things [in the past 25 years],” Ahumada adds. “As a conservation organization, as a nonprofit, if we really want to try to change the world, a lot of the traditional approach is you work with governments and try to influence change through various interventions. But if you want to do it in a data-driven way, you really need to collect a lot of data. It is really important for us to partner with companies or institutions that have this expertise, and this is starting to happen now with different organizations in the nonprofit world.”

So far, the data that CI’s TEAM and HP Earth Insights have analyzed doesn’t paint a pretty picture. Of the 275 species TEAM monitors, 60 of them (22 percent) are either significantly decreasing or likely decreasing compared with baseline levels. For instance, the Western Gorilla, a critically endangered species in the Republic of Congo, appears to have declined 10 percent from its 2009 baseline.