After Ali Swanson, an ecology researcher from the University of Minnesota, set up 225 cameras over 400 square miles of Tanzania’s Serengeti National Park, she was hit by the curse of Big Data: how do you make sense of the head-spinning contents of more than a million photographs? Her cameras, triggered by sensors that measure heat and motion, were capturing an enormous number of images of prolific animals like wildebeest, stampeding through the park. Swanson was initially most interested in carnivore behavior, so she desperately needed help sifting through this tsunami of snapshots.
So the research team applied to join Zooniverse, a suite of projects that lets the general public sift through data (searching for images of star clusters in the Andromeda Galaxy, for example) and help researchers classify it. A visitor to the resulting website, SnapshotSerengeti.org, is instructed in how to classify an animal and can then click through images, one at a time, labeling the contents of each one. The photos offer an intimate look at a corner of Africa: one might clearly show a zebra, while another might be trickier, no more than an animal’s rump and tail, with grassland and sky in the background.
Perusing the site is like taking a virtual safari, an edition of National Geographic with no editing. Some photos offer a lovely glimpse of wildlife, such as a group of gazelle with long, curved, elegant horns standing in a sunny field of grass, while others are humorously candid. An extreme close-up of a zebra’s striped chest fills three quarters of one frame. In a nighttime shot, is that an elephant’s butt, bleached white by the camera’s flash?
The researchers were astonished by the public’s response: in the first week alone, they received 3 million classifications. “We were excited, but also a little flummoxed, because we were hoping that this would be something that would continue a little longer so people could enjoy it more,” says Margaret Kosmala, one of the researchers on the team. Volunteers have already classified two years’ worth of data since the project went live in December, and images from another half year will go up in the near future. In the meantime, the website remains live, and visitors can still classify photos. Because the system collects multiple classifications per photo, volunteers effectively check one another’s work, a sort of crowdsourced peer review that helps ensure accuracy.
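To make the idea of multiple classifications per photo concrete, here is a minimal sketch of one way such labels could be combined: a simple plurality vote with an agreement threshold. This is an illustration only, not a description of how Snapshot Serengeti actually aggregates its data. The function name `consensus_label`, the `min_agreement` threshold, and the sample votes are all hypothetical.

```python
from collections import Counter


def consensus_label(labels, min_agreement=0.5):
    """Combine several volunteer labels for one photo by plurality vote.

    Returns (label, agreement), where agreement is the share of volunteers
    who chose the winning label. If agreement falls below min_agreement
    (an assumed threshold), the label is None so the photo can be flagged
    for expert review.
    """
    if not labels:
        return None, 0.0
    counts = Counter(labels)
    top_label, top_count = counts.most_common(1)[0]
    agreement = top_count / len(labels)
    if agreement < min_agreement:
        return None, agreement
    return top_label, agreement


# Hypothetical volunteer labels, invented for illustration.
votes = {
    "photo_001": ["zebra", "zebra", "zebra", "wildebeest"],
    "photo_002": ["elephant", "rhinoceros", "buffalo"],
}

results = {photo: consensus_label(labels) for photo, labels in votes.items()}
for photo, (label, agreement) in results.items():
    print(photo, label, round(agreement, 2))
```

In this sketch, a photo on which most volunteers agree gets a single label, while a photo with no clear winner is left unlabeled rather than guessed at, which is one way a project could decide which images need a closer look.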
Using the public to sift through (or gather) data is known as “citizen science,” and in situations like Snapshot Serengeti it makes perfect sense—human eyes are better at identifying what’s in these photographs than are computers. Kosmala says that at this point in the process, they are still verifying the classifications that the public has made, and mentions that “there is still some skepticism” among scientists about using the public to help categorize data—the researchers need to show that the system works accurately. Their goal is to use the photographs to study the bigger picture of the interactions among all animals in the Serengeti, not just carnivores. Wildebeest have a known migration cycle, she notes, but how do they move around on a more microlocal level, using the “bits and pieces of the landscape” of the park?
One of the more beguiling insights from the data, Swanson says, lies in the interplay between lions and cheetahs. “Cheetahs seem to be showing up just as often inside lion territories as they are outside,” she says. “In fact, cheetahs are more often found inside lion territories.” The finding suggests that, contrary to popular belief, cheetahs don’t abandon the areas they prefer just to steer clear of lions. It also raises “some cool questions about how carnivores co-exist with each other and at what scale they’re able to share the habitat.” None of this would be possible, Swanson adds, without citizen scientists.