The Data Hydra
It’s Not Just Cambridge Analytica, Facebook, & Trump—The Whole Web Is Stalking You
The internet was never designed for privacy, and between consumer profiling and government surveillance, there’s been little incentive to make it any more private.
After the news broke that voter-profiling company Cambridge Analytica harvested the personal data of 50 million Facebook users to help market Donald Trump to them, Facebook has insisted it did nothing wrong and has pushed back on news reports claiming its data was breached.
Facebook is right: There was no breach. That’s the problem.
Cambridge Analytica is no outlier. The horror is not that Cambridge Analytica demographically and psychologically profiled 50 million Facebook users, but that everyone is doing it. Cambridge Analytica did not exploit a loophole. Rather, it used Facebook’s big data the way it was intended to be used. It just marketed a candidate instead of a product. (These days, there’s not that much of a difference.)
That’s the problem with big data: It doesn’t come with signs, let alone rules, declaring CAN ONLY BE USED FOR GOOD. Once the data is out there, people will find a way to use it for whatever they want: marketing, surveillance, or propaganda.
Or, as the managing director of Cambridge Analytica Political Global put it: “We just put information into the bloodstream of the internet, and then, and then watch it grow, give it a little push every now and again,” he said. “Like a remote control. It has to happen without anyone thinking, ‘that’s propaganda’, because the moment you think ‘that’s propaganda’, the next question is, ‘who’s put that out?’”
Facebook admits that the data collection itself, carried out in part through a personality profile app called “thisisyourdigitallife” created by Cambridge psychology professor Aleksandr Kogan, was not a violation of its terms. The app was used to harvest names, locations, friends lists, Likes, and more, data well suited to identifying and motivating potential voters. Last Friday, Facebook wrote:
Approximately 270,000 people downloaded the app. In so doing, they gave their consent for Kogan to access information such as the city they set on their profile, or content they had liked, as well as more limited information about friends who had their privacy settings set to allow it.
Although Kogan gained access to this information in a legitimate way and through the proper channels that governed all developers on Facebook at that time, he did not subsequently abide by our rules. By passing information on to a third party, including SCL/Cambridge Analytica and Christopher Wylie of Eunoia Technologies, he violated our platform policies.
There is an absurdity here: Facebook admits it allows third-party apps to collect intimate data on its users. The only restriction is that the app creators can’t then reshare the data. In other words, the restrictions are on things that Facebook cannot possibly police proactively.
This, too, has long been Facebook’s strategy. In 2007, Facebook rolled out Facebook Beacon, a program to put invisible “web bugs” on pages across the internet. Beacon tracked Facebook users across the internet and posted their activity to their Facebook walls. Buy some baby clothes on Amazon, and it would announce the fact on Facebook, without your permission. There was an outcry, and eventually Facebook shut Beacon down.
But it didn’t give up. By splashing its Like button across the internet, by placing third-party cookies and Facebook logins on other sites, and by acquiring user data directly from third parties, Facebook continued devouring personal information. That personal information is what made Facebook a treasure trove for advertisers and marketers and, as we now know, for Cambridge Analytica, the political microtargeting firm that helped get Donald Trump elected. An enormous industry of hundreds of firms sprang up around demographic profiling, marketing, and real-time bidding in order to sell online advertising to the right user at the right time. Facebook is the highest-profile broker for this kind of advertising, but far from the only one.
That’s why questions of illegality and terms-of-service violations are misleading. Even in the absence of violations, this sort of data collection is ubiquitous and omnivorous. Facebook would likely prefer the discussion to focus on questions of law, since United States data-sharing laws are quite lax, relying on “self-policing” by industry groups like the Network Advertising Initiative as a way to fend off actual legislation or regulation. European strategies like the “right to be forgotten” focus only on the publication of personal data rather than its initial collection—the very essence of closing the stable door after the horse has bolted.
The problem, as law professor Frank Pasquale puts it in The Black Box Society: The Secret Algorithms That Control Money and Information, is “runaway data.” We are accustomed to thinking that once we share data with a particular entity—be it Google, Facebook, Amazon, or any one of thousands of shadowy data marketing firms—our data is somehow siloed there. Nothing could be further from the truth. In fact, firms are constantly sharing, selling, and coalescing data piecemeal in order to construct increasingly elaborate profiles of customers and citizens, and this data is frequently available in one form or another to anyone with the money to buy it. Entire companies like Interclick have been bought specifically for the hundreds of millions of demographic profiles they had accumulated.
If Cambridge Analytica hadn’t been able to get what it needed through Facebook, it could have gone to any number of other data brokers to get it, and then cross-referenced it to target Facebook users. If you’re a likely Trump voter, Facebook is not the only company with the evidence to prove it. Your credit card receipts, website habits, and demographic profile are frequently just as good, and there is no shortage of companies offering your data, without your permission.
Take Acxiom, a company which offers “Identity Resolution & People-Based Marketing.” In a series of articles in The New York Times, Natasha Singer explored how this veteran marketing technology company (founded in 1969) has profiled 500 million users, 10 times the 50 million that Facebook offered to Cambridge Analytica, and sells these “data products” in order to help marketers target customers based on interest, race, gender, political alignment, and more. WPP and GroupM’s “digital media platform” Xaxis has also claimed 500 million consumer profiles. Other marketing companies, like Qualia, track users across platforms and devices as they browse the web. There’s no sign-up or opt-in involved. These companies simply cyberstalk users en masse.
Even if Facebook were to seal off its data from the rest of the world, Cambridge Analytica could go to Acxiom, or any other company like it, to find the right voters and then locate them by name alone on Facebook (or elsewhere). How many of these companies would be able to tell the difference between an ordinary client using their data and Cambridge Analytica, or even Russia? Once you’ve got a user’s data without their permission, almost everything is on the table.
The only surprise is in how long it took for this data hydra to create havoc. Given the close margin of the election, it’s quite possible that Cambridge Analytica’s work (like so many other small factors) was enough to shift the election in Trump’s favor. The keys have been hanging from the door lock for over a decade, and few cared until now.
Facebook’s decision to suspend Cambridge Analytica now is little comfort. Notably, Facebook said nothing about preventing new companies from doing the same sorts of data harvesting and marketing. And even if Facebook did decide to sacrifice the revenue and somehow cut off all such activity, a peek inside Amazon, Google, Oracle, or Acxiom will reveal petabytes of similar personal data, ready to be shared, studied, and sold.
The internet was never designed for privacy, and between consumer profiling and government surveillance, there’s been little incentive to make it any more private. Cambridge Analytica’s work is not an aberration; it is an inevitability.
In 2013, I wrote of the growing threat not of Big Brother, but Big Salesman: For internet marketing companies, you are what you click. Big Salesman appears more innocuous when he’s marketing shoes and cars. Now that he’s marketing Donald Trump, we are beginning to wake up to his danger. With all our data out there, being gathered and sold by hundreds of companies, there will always be a Cambridge Analytica, a Steve Bannon, or a Vladimir Putin ready to make use of it.
David Auerbach is a writer and software engineer. His book, Bitwise: A Life in Code, is forthcoming from Pantheon in August.