Inside the Deepfake ‘Arms Race’

By late 2018, Ali Bongo, the president of Gabon in Central Africa, hadn’t appeared in public for many months after reportedly suffering a stroke.

So it came as a surprise when a seemingly official video appeared online on New Year’s Day depicting Bongo, looking a bit disoriented, addressing the Gabonese people.

Bongo’s rivals declared the video to be a “deepfake”—a sophisticated digital forgery. Bongo was dead or incapacitated, the opposition declared. On Jan. 7, army officers launched an abortive coup.

Bongo wasn’t dead. And the video wasn’t a deepfake, just a weirdly staged recording of an unwell man struggling to appear healthier than he apparently was. But the coup plotters were right about one thing. It’s entirely possible to fake a convincing video of a world leader... or anyone else, for that matter.

Deepfakes are a fairly new feature of social media. And they’ve already contributed to violence and instability on a national scale. Experts say the problem could get worse before it gets better.

The first deepfakes appeared in late 2017 on Reddit. An anonymous user calling themselves “deepfakes”—a portmanteau of artificial-intelligence “deep learning” and “fakes”—imposed celebrities’ faces on pornography.

Reddit eventually banned deepfakes, but the cat was out of the bag. “A wave of copycats replaced him on other platforms,” Tom Van de Weghe, a Stanford University researcher, wrote in a May essay.

A deepfake video, still image, or audio recording is the product of a clever bit of coding called a “generative adversarial network,” or GAN. A GAN has two components: a discriminator and a generator. The discriminator is trying to tell fake media from real media. The generator is trying to fool the discriminator with increasingly realistic-seeming fakes.

The two components go back and forth at the speed of computer code. “The generator goes first and generates an image,” Matt Guzdial, an assistant professor of computer science at the University of Alberta, told The Daily Beast. “This is tricky because initially the generator is going to produce random noise.”

Now it’s the discriminator’s turn. It’s sitting on a trove of authentic media plus the initial fakes from the generator. “It will guess which is which,” Guzdial explained. “After guessing, it will be told which ones are real ones.” Then the process starts all over again.

The trick for the generator is getting over the hump of just coughing up random noise. If the generator can break through the noise barrier and grok what the discriminator is seeing or hearing in its catalogue of authentic media, then the generator can really start learning from the back-and-forth.

After potentially hundreds of thousands of iterations, one of two things happens. “Eventually the generator starts generating things that look like real images,” Guzdial said. “Or it doesn’t work.” The same process applies to video, stills and audio.
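The back-and-forth Guzdial describes can be sketched in a few dozen lines. What follows is a toy illustration, not the code behind any real deepfake tool. The “media” here are single numbers clustered around 4.0, the generator is a two-parameter line, the discriminator is a simple logistic guesser, and every name and setting (`train_toy_gan`, the batch size, the learning rate) is an assumption made for the example.

```python
import math
import random


def sigmoid(t):
    # Clamp the input so math.exp never overflows.
    t = max(-30.0, min(30.0, t))
    return 1.0 / (1.0 + math.exp(-t))


def train_toy_gan(steps=2000, batch=32, lr=0.01, seed=0):
    """Toy GAN on scalars; all names and defaults are hypothetical."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator: fake = a * noise + b
    w, c = 0.1, 0.0  # discriminator: p(real) = sigmoid(w * x + c)
    history = []     # generator offset b after each round

    for _ in range(steps):
        # Authentic "media" (assumed to cluster around 4.0) and fresh fakes.
        real = [rng.gauss(4.0, 0.5) for _ in range(batch)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [a * z + b for z in noise]

        # Discriminator's turn: guess, then get told which were real.
        # Loss: -log D(real) - log(1 - D(fake)), averaged over the batch.
        grad_w = grad_c = 0.0
        for x in real:
            p = sigmoid(w * x + c)
            grad_w += -(1.0 - p) * x
            grad_c += -(1.0 - p)
        for x in fake:
            p = sigmoid(w * x + c)
            grad_w += p * x
            grad_c += p
        w -= lr * grad_w / batch
        c -= lr * grad_c / batch

        # Generator's turn: adjust so its fakes score as more "real".
        # Loss: -log D(fake), the common non-saturating variant.
        grad_a = grad_b = 0.0
        for z in noise:
            p = sigmoid(w * (a * z + b) + c)
            d_fake = -(1.0 - p) * w  # gradient of the loss w.r.t. the fake
            grad_a += d_fake * z
            grad_b += d_fake
        a -= lr * grad_a / batch
        b -= lr * grad_b / batch
        history.append(b)

    return (a, b), (w, c), history
```

Calling `train_toy_gan(steps=200)` returns the generator’s parameters, the discriminator’s parameters and a round-by-round record of the generator’s offset, which should drift upward from zero toward the real data. Real image GANs swap these scalars for deep neural networks, and as Guzdial notes, there is no guarantee the process settles on anything convincing.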

More and more, it does work. Type “deepfake” into the search bar on Facebook and you can watch Tom Cruise starring in a Marvel superhero movie or Barack Obama ranting about, well, a Marvel superhero movie.

It’s not hard to imagine political parties, intelligence agencies, militant groups and online trolls deploying deepfakes to discredit, frame or embarrass their rivals. The risk is such that computer scientists already are scrambling to devise deepfake countermeasures. “There’s an arms race,” Mark Riedl, an associate professor of computer science at Georgia Tech, told The Daily Beast.

That said, deploying a GAN is easier said than done. The underlying code isn’t hard to find. But you need more than code. For starters, you need lots and lots of authentic media as grist for the generator. For faking a photo or video of a person, you need plenty of clear images of that person’s face. To fake that person’s voice, you’d need lots of clean recordings of them really speaking.

Social media makes it easier to find good media as a starting point, Van de Weghe pointed out. But gathering all that raw material still takes time and effort.

Of course, in this hypothetical scenario you’re a hacker or a shady political operative or a committed troll, so you’re willing to put in the hours. You’ve gathered plenty of authentic media as a starting point, fired up your GAN and, after a few noisy failures, produced a convincing deepfake.

Now you need to do it again. And again. “It is likely that a malevolent entity would need more than a single image, audio or video to effectively sway opinion at scale,” Polo Chau, who is also a Georgia Tech computer scientist, told The Daily Beast. After all, any one deepfake is likely to disappear in the churn of social media.

So you’ve produced a bunch of decent deepfakes all targeting the same person in the same way. It’s enough to really make a dent in your subject’s reputation. Assuming, that is, no one flags your fakes for what they really are.

Sure, GANs are getting better all the time. But for now it’s still easier to detect a fake than it is to produce a convincing one. Today’s GANs are good at faces, Riedl explained. But they get sloppy around complex, moving materials. Look closely at the subject’s hair in a deepfake video. You should be able to spot telltale distortions.

It’s possible to automate this scrutiny. Social-media companies and users could deploy discriminators to sift through media on a network, looking for the pixelation and other digital fingerprints of GAN-produced fakes. In September, Google released a trove of 3,000 old deepfakes, like targets at a shooting range, all in order to boost efforts to identify newer fakes.
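In rough outline, that kind of automated sifting might look like the sketch below. It assumes some detector already exists and returns a score between 0 and 1 for how fake an item looks. The `score_fake` callable, the `sift_media` function and the 0.8 threshold are all hypothetical stand-ins, not any platform’s actual system.

```python
from typing import Callable, Iterable, List, Tuple


def sift_media(
    items: Iterable[Tuple[str, bytes]],
    score_fake: Callable[[bytes], float],
    threshold: float = 0.8,
) -> List[Tuple[str, float]]:
    """Return (media_id, score) pairs at or above threshold, highest first.

    score_fake is a hypothetical detector: it takes raw media bytes and
    returns a 0-to-1 estimate of how likely the item is GAN-produced.
    """
    flagged = []
    for media_id, payload in items:
        score = score_fake(payload)
        if score >= threshold:
            flagged.append((media_id, score))
    # Put the most suspicious items at the top for human review.
    flagged.sort(key=lambda pair: pair[1], reverse=True)
    return flagged
```

The design choice here is to flag rather than delete. A detector trained on yesterday’s fakes can miss tomorrow’s, so the sketch only ranks suspect items for a person to check.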

Plus, there are methods of countering deepfakes that don’t solely rely on code. Purveyors of deepfakes need social media accounts and unscrupulous media outlets to help spread the malicious content, Chau explained.

And that exposes the purveyor of the deepfake to social-media analysis, Riedl said. “You can analyze that and find out where these things originate from, look at the network around this person and find out if they’re an artificial entity or linked to known groups.”

“You can counter-program against that,” Riedl added. Methods could include blocking or reporting the account pushing the deepfake. “That’s a very promising way of not dealing with the tech directly.”

Ultimately, experts said, the best defense against GANs and the deepfakes they produce is an educated and skeptical public that can view media in an informed context, considering its source, its proponents and detractors and its potential for weaponization.

Obama chattering away about some Marvel villain? Certainly useful to fringe media outlets eager to portray the former president as silly. And thus probably fake. A smart social-media user should know that.

Skepticism is the key. But in a hyper-partisan media environment, where everyone is grasping for any confirmation of their existing biases, skepticism could be in short supply.

That, more than the GANs and deepfakes themselves, worries some experts. “The reason deepfakes are such a threat is not that the tech is so good,” Riedl said, “it’s that people want to be fooled.”

Can countermeasures neutralize the coming wave of high-tech disinformation?

David Axe