A Geek’s Guide to the NSA Scandal: What You May Not Know About Data Collection

The domestic surveillance scandals deserve serious discussion, but unfortunately there’s a lot of misinterpretation and misinformation flowing through the mediasphere. I believe it’s driven by misunderstanding of technical issues as well as a phenomenon known as confirmation bias. The result smacks of paranoid fear-mongering inside a cloud of unknowing instead of a clear-eyed search for the truth.


First, a confession: I’m what’s known as a “nerd.” Possibly also a “geek.” In addition to having a musical background, I’ve been programming computers for more than 20 years, since well before I started my LittleGreenFootballs.com blog. So from a technical angle, some things about the media coverage of the NSA spying case set off warning bells for me from the start.

This was the opening paragraph of the Guardian’s first report on the PRISM PowerPoint slides they obtained from Edward Snowden (a similar claim was made by the Washington Post):

The National Security Agency has obtained direct access to the systems of Google, Facebook, Apple and other US internet giants, according to a top secret document obtained by the Guardian.

Now, that’s how to get a nerd’s attention! “Direct access” to the servers of every top Internet company, reading all the email and Internet communications of every American citizen? That would be quite a bombshell indeed—not to mention a prodigious technical feat.

But almost immediately I realized that “direct access” could (and probably did) mean something much more limited than the indiscriminate snooping suggested. Here’s the actual wording of the slide in question:

PRISM – Collection directly from the servers of these U.S. Service Providers…

Without getting too technical: when files need to be exchanged between parties on the Web, a very common technique is to set up something known as a “sandboxed FTP directory.” This is a special directory that can be accessed only with a username and password; the user who logs in cannot see anything else on the server, only the contents of that one directory. There are other ways to achieve the same end, but this kind of sandboxed system, which would allow the feds to log in and pick up whatever was placed there for them, seemed much more believable than the NSA having free access to rummage around on the tech companies’ servers.
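For readers who want to see the idea in concrete terms, here is a minimal sketch of how a sandboxed directory can confine a logged-in user. It is an illustration only, not a description of any company's actual setup: the function name `resolve_in_sandbox`, the `SandboxError` exception, and the `/srv/drop` directory are all hypothetical. The point is simply that every path the user asks for is interpreted relative to the one directory, and any attempt to climb out of it is refused.

```python
from pathlib import PurePosixPath


class SandboxError(Exception):
    """Raised when a requested path would escape the sandbox (hypothetical)."""


def resolve_in_sandbox(sandbox_root: str, requested: str) -> str:
    """Map a user-supplied path onto the sandbox directory.

    Leading slashes are treated as "the top of the sandbox", and any
    ".." that would climb above the sandbox root is rejected.
    """
    kept = []
    for part in requested.split("/"):
        if part in ("", "."):
            continue  # empty segments and "." change nothing
        if part == "..":
            if not kept:
                raise SandboxError(f"path escapes sandbox: {requested!r}")
            kept.pop()  # step back up, but never above the root
        else:
            kept.append(part)
    return str(PurePosixPath(sandbox_root).joinpath(*kept))


if __name__ == "__main__":
    root = "/srv/drop"  # assumed location of the drop directory
    print(resolve_in_sandbox(root, "report.csv"))
    print(resolve_in_sandbox(root, "/etc/passwd"))  # stays inside the sandbox
    try:
        resolve_in_sandbox(root, "../private/mail")
    except SandboxError as err:
        print("refused:", err)
```

Real FTP and SFTP servers typically enforce this confinement at the operating-system level rather than with string handling, but the effect on the user is the same: one directory is visible, and nothing else.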

Shortly after the story broke, all of the companies named by The Washington Post and The Guardian came out with very strongly worded statements contradicting this “direct access” claim. And Google shared more details that confirmed my suspicions:

Chris Gaither, a Google spokesman, said that when the company receives court orders to provide information to the government, it usually does so with secure FTP, a method of sending encrypted files over the Internet.

In other words, Google “pushes” information for the government rather than allow the government to “pull” information directly from Google’s system, Gaither said.
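The “push” versus “pull” distinction can be sketched in a few lines. This is a toy model under stated assumptions, not Google's system: the `Provider` and `Requester` classes and the record names are hypothetical. In a push arrangement the provider decides what goes into a shared outbox, and the requester can only collect what has been placed there.

```python
class Provider:
    """Holds all the data and decides what to release (hypothetical model)."""

    def __init__(self, records):
        self._records = dict(records)  # full internal store, never shared
        self.outbox = {}               # only what the provider has released

    def push(self, record_id):
        """Provider-initiated transfer of one record it has chosen to release."""
        self.outbox[record_id] = self._records[record_id]


class Requester:
    """Sees only the outbox, never the provider's internal store."""

    def __init__(self, outbox):
        self._outbox = outbox

    def collect(self):
        """Pick up a copy of whatever has been placed in the outbox."""
        return dict(self._outbox)


if __name__ == "__main__":
    provider = Provider({"alice": "mail A", "bob": "mail B", "carol": "mail C"})
    requester = Requester(provider.outbox)

    print(requester.collect())   # nothing has been released yet
    provider.push("bob")         # e.g. after reviewing a court order
    print(requester.collect())   # only the one released record is visible
```

A pull arrangement would instead hand the requester a way to query the provider's full store directly, which is what the “direct access” headline implied.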

So there it is. The initial overheated claim seems to have been based on a misreading of one PowerPoint slide, and the actual data collection to which it refers is much more limited. (Unless you believe all the major tech companies are simply lying in unison and risking enormous customer backlash. I don’t.)

This is the kind of mistake someone who is not very knowledgeable about Web technologies could easily make. But the question should be asked: why didn’t news organizations do some research first, to see if there was another explanation for the “direct collection” phrase before pushing such an inflammatory and misleading claim?

We all know that dramatic headlines and overheated claims can bring in lots of page views, and page views sell advertising, but even if we assume that the authors’ intentions were good to begin with and not purely commercial, there’s another possible explanation for the continuing exaggerations and misrepresentations—a phenomenon known to psychologists as “confirmation bias.”

Confirmation bias is a type of cognitive short circuit that leads a person to seek out and assign more importance to information that confirms their ideological (or personal) biases and to ignore information that doesn’t. It can also lead people to falsely interpret information and jump to conclusions not based on evidence. Scientists design experiments specifically to avoid confirmation bias, by using double-blind testing and other techniques.

Another story about the NSA at CNet, by Declan McCullagh, seems to be a perfect example of confirmation bias. On June 15, McCullagh posted a piece with the title, “NSA Admits Listening to U.S. Phone Calls Without Warrants.”

But meanwhile, back in reality, the NSA didn’t “admit” anything of the sort. McCullagh’s claim that it did was based on something Rep. Jerrold Nadler said while questioning FBI Director Robert Mueller:

We heard precisely the opposite at the briefing the other day. We heard precisely that you could get the specific information from that telephone simply based on an analyst deciding that and you didn’t need a new warrant.

The CNet article makes a gigantic leap to an unwarranted conclusion from Rep. Nadler’s statement. The claim that the NSA “admitted” something seems to have been completely invented; Nadler never even mentions the NSA. It’s hard to see how a second-hand statement about a briefing constitutes any kind of “admission” at all.

And the very next day, Nadler made another statement clarifying the issue:

“I am pleased that the administration has reiterated that, as I have always believed, the NSA cannot listen to the content of Americans’ phone calls without a specific warrant.”

That CNet headline was entirely inaccurate, but it was immediately seized upon and spread throughout the blogosphere, even getting a front-page link at Drudge Report—which is good for millions of those precious page views. But the article itself contained the seeds of its own debunking. How did so many people fail to see that the actual article did not support the wild claim made by the headline?

I strongly suspect we’re once again seeing a massive case of confirmation bias at work. McCullagh is a social libertarian and supporter of Ron Paul (here’s a video of a talk he gave at the New Hampshire Liberty Forum, a Ron Paul advocacy group). Did his libertarian ideology lead him to jump to conclusions, even though his own piece didn’t support them? Did the bloggers and news sites that uncritically circulated the story see only what they wanted to see, because it confirmed their ideological biases?

In the long run it’s a good thing that people are talking about these serious issues, and gaining some understanding of the kinds of privacy problems that inevitably arise in our age of technology. But some of the reporting, possibly driven by confirmation bias, verges on paranoid fear-mongering—and that’s never a good way to start a national conversation.


If you only read the headlines, the government snooping scandal sounds worse than it is.

Charles Johnson