Facebook was warned five years ago that the “reverse-lookup” feature in its search engine could be used to harvest names, profiles, and phone numbers for virtually all its users. But the company ignored the red flags until last week, after it happened.
In prepared testimony to Congress released Monday (PDF), Mark Zuckerberg acknowledged that malefactors had used the reverse-lookup feature “to link people’s public Facebook information to a phone number.” “When we found out about the abuse, we shut this feature down,” he wrote. He said that Facebook discovered the incidents only two weeks ago.
Zuckerberg is set to testify at a joint hearing before the Senate’s Judiciary and Commerce committees on Tuesday, and then return to Capitol Hill on Wednesday to appear before the House Energy and Commerce Committee. This will be the first time Facebook’s billionaire founder and CEO has appeared before Congress. Last fall, the company’s vice president and general counsel, Colin Stretch, appeared at the hearings probing Russia’s election interference campaign.
The hearings are a response to last month’s revelations that Cambridge Analytica, a U.K.-based consulting firm that worked for the Trump campaign, harvested data on as many as 87 million Facebook users without their knowledge.
Facebook revealed the separate reverse-lookup data spill while responding to the Cambridge Analytica controversy.
The issue was that Facebook allowed users to find anyone on the site by entering either their phone number or email address. In 2010, computer science researchers in Greece showed how spammers could use that feature to validate address lists and “craft personalized phishing emails that are far more efficient than traditional techniques by using personal information publicly available in social networks” (PDF).
But Zuckerberg’s written testimony reveals for the first time that it was phone number lookups that were used in the large-scale scraping. That’s a more potent weapon for bulk harvesting, because a data miner can programmatically cycle through every possible phone number to get a complete corpus. With some exceptions (custom privacy settings or accounts with no phone number attached), sequential mining would yield every Facebook profile.
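To get a feel for why the phone-number space is small enough to sweep, here is a back-of-the-envelope sketch in Python. The numbers are illustrative assumptions, not figures from Facebook or the testimony: it assumes 10-digit numbers within a single country code and a hypothetical aggregate rate of 1,000 lookups per second.

```python
SECONDS_PER_DAY = 86_400


def candidate_count(digits: int) -> int:
    """Number of distinct numbers with the given digit count (assumes every combination is valid)."""
    return 10 ** digits


def days_to_sweep(candidates: int, queries_per_second: float) -> float:
    """Days needed to try every candidate once at a constant query rate."""
    return candidates / queries_per_second / SECONDS_PER_DAY


# Assumption: 10-digit numbers, 1,000 lookups per second in total.
total = candidate_count(10)
print(total)                                  # 10000000000
print(round(days_to_sweep(total, 1_000), 1))  # 115.7
```

Under those assumptions a full sweep takes roughly four months, and in practice many combinations are never assigned, so the real search space would be smaller.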
Facebook didn’t respond to inquiries for this story.
Though Facebook is professing surprise at the data spill, in 2013 security researcher Bennett Haselton warned Facebook publicly and privately of this exact scenario.
“You could use this technique to build up a database of phone numbers and associated accounts without targeting any specific phone number or account,” Haselton wrote in a prescient post to the technology website Slashdot. “Not only would you know the names associated with each of the numbers, you could associate the phone number with anything else that was discoverable from the person’s Facebook profile—which usually includes their location, their interests, and the names of their other friends.
“It would only have to be done once to put the users’ data permanently in the hands of the attackers, with Facebook unable to put the cat back into the bag,” he added.
Facebook’s primary countermeasure against bulk profile harvesting was rate-limiting, i.e., blocking rapid-fire search queries originating from the same Internet Protocol, or IP, address. The unidentified perpetrators bypassed that protection by cycling “through many thousands, or hundreds of thousands, of IP addresses to evade rate limiting,” Zuckerberg said last week.
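The weakness of per-IP rate limiting can be shown with a minimal sketch. This is not Facebook’s implementation, whose details have not been published. The class name, the limit of five requests per 60 seconds, and the addresses (drawn from reserved documentation ranges) are all hypothetical.

```python
from collections import defaultdict, deque


class PerIpRateLimiter:
    """Sliding-window limiter: at most max_requests per IP in any window_seconds span."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # IP address -> timestamps of allowed requests

    def allow(self, ip: str, now: float) -> bool:
        hits = self._hits[ip]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


def count_allowed(limiter: PerIpRateLimiter, ips: list[str], total_requests: int) -> int:
    """Send total_requests at the same instant, rotating through ips round-robin."""
    allowed = 0
    for i in range(total_requests):
        if limiter.allow(ips[i % len(ips)], now=0.0):
            allowed += 1
    return allowed


# Hypothetical limit: 5 lookups per IP per 60 seconds.
single = count_allowed(PerIpRateLimiter(5, 60.0), ["203.0.113.1"], 1000)
pool = [f"198.51.100.{n}" for n in range(1, 201)]  # 200 rotating addresses
rotated = count_allowed(PerIpRateLimiter(5, 60.0), pool, 1000)
print(single, rotated)  # 5 1000
```

A single address is cut off after five lookups, while the same 1,000 lookups spread across 200 addresses all pass, because the limiter counts each address separately and has no view of the aggregate.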
In an interview with The Daily Beast, Haselton said Facebook never responded to his reports. He says removing the reverse-lookup search was the right move, even if it came five years late. “This is not functionality they had to leave in.”
Facebook removed the email and phone search capabilities entirely last Wednesday. “Given the scale and sophistication of the activity we’ve seen, we believe most people on Facebook could have had their public profile scraped in this way,” wrote Facebook chief technology officer Mike Schroepfer in a blog post.
Overall, Facebook’s response to bad news has been more spin than win. Zuckerberg initially scoffed at the notion that Facebook played a significant role in Russia’s campaigning. When the company finally found hundreds of fake accounts created by Russia’s troll farm, it refused to publicly identify them, instead publishing statistics that seemed hand-picked to minimize the Kremlin’s reach: just $100,000 in ad spending, a mere 470 fake accounts. One oft-heard talking point noted “the majority of the Russian ad spend happened AFTER the election,” a stat that wouldn’t have worked if Facebook had cut off the Kremlin seven months after the election instead of 10. Eventually, last October, Facebook reluctantly revealed the number that mattered, the number of Americans reached by the Kremlin’s Facebook campaign: 126 million.
There are signs the company is taking a more forthright approach now—when it booted another batch of Russian troll accounts last month, it identified some of them by name, and even showed screenshots of some content. The most promising indicator is Zuckerberg’s voluntary appearance on Capitol Hill, where spin has a legal limit.