The NSA Can ‘Collect-it-All,’ But What Will It Do With Our Data Next?
In the summer of 2008, Gen. Keith Alexander, the recently resigned director of the National Security Agency, posed an audacious question to intelligence analysts at the Menwith Hill eavesdropping station in North Yorkshire, in the United Kingdom: “Why can’t we collect all the signals all the time?”
Since Edward Snowden’s revelations began, that quote has come to define the NSA’s seemingly insatiable appetite for data. Former administration officials who worked for Alexander have similarly described the cowboy-like regime of intelligence-gathering that defined his tenure, calling his strategy “the same as Google’s: I need to get all of the data.”
Amusingly, it wasn’t until a recent interview with comedian and former Daily Show correspondent John Oliver that the general got grilled on this point. “You do understand that ‘collect everything’ is also the motto of a hoarder,” he told Alexander during the premiere of his new talk show. “That’s the fundamental principle which ends up with someone living alongside 1,500 copies of newspapers from the 1950s and six mummified cats.”
Alexander smoothly responded that his “collect-it-all” quip was taken out of context, and only relevant to “specific problems”—in this particular case, the dramatic rise of U.S. troop casualties in Iraq. But additional Snowden documents released in Glenn Greenwald’s new book, No Place to Hide, seem to further illustrate the “collect-it-all” mentality as a recurring mantra within the NSA and its allied intelligence agencies. Indeed, much to Oliver’s humorous point, some of the material reads like dispatches from a Hoarders Anonymous convention.
Chief among them is a graphic presented at a secret meeting of the “Five Eyes” surveillance alliance in 2011 describing a “New Collection Posture” with incredibly explicit goals: “Collect it All; Process it All; Exploit it All; Partner it All; Sniff it All,” and, most chillingly, “Know it All.” Another document, from British spy agency Government Communications Headquarters, names a program that intercepts satellite communications, dubbed “Asphalt,” describing it as a “‘Collect it All’ proof-of-concept system.” A 2009 memo from the NSA operations center in Misawa, Japan, similarly touts its newly improved interception capabilities as “bringing our enterprise one step closer to collecting it all.”
“They don’t need a reason to collect information. That’s a mischaracterization of what the NSA does,” said Greenwald on Tuesday, during an event for his new book at Cooper Union in New York. “The reason they are collecting things is because those are communications that exist on the Earth, and because they can.”
A point commonly made by NSA critics is that these dragnets collect too much noise and not enough signal. Several internal documents lend that criticism credence, including one that admits the NSA “collects far more content than is routinely useful to analysts.” A top-secret chart in Greenwald’s book displaying “Current Volumes and Limits” for data storage shows that the agency collected upwards of 20 billion “communications events” per day in 2012, the vast majority of which were stored in various databases. In December of the same year, a program called “Shelltrumpet” processed its 1 trillionth metadata record; almost half that amount was processed in 2012 alone.
Such statistics seem to be cause for both celebration and headache within the NSA. Another classified slide, titled “The Challenge,” states that “Collection is outpacing our ability to ingest, process, and store the ‘norms’ to which we have been accustomed.” This overcollection is such a widely acknowledged problem that the agency has a separate line in its budget devoted to “coping with information overload.”
Often left out of the discussion, however, is how intelligence agencies like the NSA might one day turn all this noise back into signal.
In many ways, it’s already happening. Government contractors like Palantir, a company valued at $9 billion and backed by an alphabet soup of three-letter agencies, have built the success of their pattern-analysis software upon the growing need to make sense of the U.S. government’s ever-expanding Library of Babel. These systems work by adding metadata tags to communications as they pass through digital sensors at various choke points throughout the Internet and then running complex algorithms to find patterns in the noise. Once patterns are established, an intelligence analyst need only input the “selectors,” or search terms, into a user interface like the NSA’s “X-Keyscore” to find whatever it is they want to find.
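To make the tag-then-search idea concrete, here is a minimal Python sketch of the general pattern described above: attach metadata tags to each message, index the tags, and look messages up by selector. It is a toy illustration, not a depiction of how Palantir's software or X-Keyscore actually works. Every name in it (`Record`, `tag_record`, `build_index`, `search`, the `sender` and `recipient` tags, and the example addresses) is hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Record:
    """One captured message plus the metadata tags attached to it (hypothetical)."""
    record_id: int
    body: str
    tags: dict = field(default_factory=dict)


def tag_record(record_id, sender, recipient, body):
    """Attach metadata tags to a message, as an imagined sensor might."""
    tags = {"sender": sender.lower(), "recipient": recipient.lower()}
    return Record(record_id, body, tags)


def build_index(records):
    """Map each (tag name, tag value) pair to the ids of records carrying it."""
    index = defaultdict(set)
    for rec in records:
        for name, value in rec.tags.items():
            index[(name, value)].add(rec.record_id)
    return index


def search(index, **selectors):
    """Return the sorted ids of records that match every selector given."""
    if not selectors:
        return []
    id_sets = [
        index.get((name, value.lower()), set())
        for name, value in selectors.items()
    ]
    return sorted(set.intersection(*id_sets))


# Made-up sample traffic, for illustration only.
records = [
    tag_record(1, "alice@example.com", "bob@example.com", "lunch?"),
    tag_record(2, "carol@example.com", "bob@example.com", "report attached"),
    tag_record(3, "alice@example.com", "dave@example.com", "call me"),
]
index = build_index(records)

print(search(index, recipient="bob@example.com"))
print(search(index, sender="alice@example.com", recipient="bob@example.com"))
```

The point of the sketch is that once everything is tagged and indexed, a query costs almost nothing: the expensive part is the collection, and the search is a set lookup.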
German hacker Thomas Dullien, aka Halvar Flake, has a name for this unholy fusion of Big Data and Big Brother: “Full-packet-capture society.” He describes a possible future where the rapidly falling cost of electronic storage, pattern-detecting algorithms, and ubiquitous networks of sensors conspire to create a world “where every word ever spoken and every action ever taken is recorded somewhere.”
That may seem like far-off dystopian science fiction, but the spirit of “collect-it-all” is already visible outside the NSA as automated license-plate readers and facial-recognition systems spring up around the world. California-based Vigilant Solutions maintains the largest private database of license-plate images, which holds more than 550 million records of vehicle movements, and grants unfettered access to police departments around the country. And the pilot program for the FBI’s national facial-recognition database is expected to hold 52 million face images by next year, including photos of individuals not suspected of a crime.
The NSA and its allies are staunch defenders of these “haystacks,” even though multiple studies have concluded that the database containing millions of Americans’ phone records played little or no role in preventing terrorist attacks. They’ve countered that it’s foolish to assume all terrorists hang out in one isolated section of the Internet, so mass collection becomes a necessary obsession in the search for that ever-elusive needle.
But the recent passage of the amended USA Freedom Act through the House seems to suggest there is an alternative. The NSA reform bill, in its current form, would leave the agency’s hugely controversial database of Americans’ phone records in the hands of telecommunications companies, and would require the NSA to prove that there is “reasonable suspicion” to collect individual records. It’s a step down from a probable-cause warrant, but it still contrasts starkly with a rival bill written by NSA defenders that would allow the agency unfettered access without a judge’s approval.
It would be prudent to reach some kind of consensus soon, because when you collect-it-all, the next step is to automate-it-all. After that, who knows how many innocent straws of hay will start to look like needles under the gaze of unseen algorithms?