How to detect deepfakes in live video calls: here’s what the experts say to do

The next time you get a Zoom call, you might want to ask the person you’re talking to to stick a finger up the side of their nose. Or maybe to turn in full profile to the camera for a minute.

Those are just a few of the methods that experts have recommended as ways to ensure you’re seeing a real image of the person you’re talking to and not a deepfake impersonation.

It sounds like a strange precaution, but we live in strange times.

Last month, a senior executive at cryptocurrency exchange Binance said that scammers had used a sophisticated fake “hologram” of him to swindle various cryptocurrency projects. Patrick Hillmann, director of communications at Binance, says criminals used the deepfake to impersonate him on Zoom calls. (Hillmann has provided no evidence to support his claim, and some experts are skeptical. However, security researchers say such incidents are now plausible.) In July, the FBI warned that people could use deepfakes in job interviews conducted over videoconferencing software. A month earlier, several European mayors said they were initially fooled by a fake video call purporting to be with Ukrainian President Volodymyr Zelensky. Meanwhile, Metaphysic, a startup that develops deepfake software, made it to the finals of “America’s Got Talent” by creating remarkably good deepfakes of Simon Cowell and the other celebrity judges, transforming other singers into the celebrities in real time, right before the audience’s eyes.

Deepfakes are extremely convincing fake images and videos created through the use of artificial intelligence. It used to take a lot of pictures of someone, a lot of time, and a good degree of coding skill and special-effects knowledge to create a believable deepfake. And even once created, the AI model couldn’t run fast enough to produce a real-time deepfake on a live video stream.

That is no longer the case, as both the Binance story and Metaphysic’s “America’s Got Talent” act highlight. In fact, it’s getting easier for people to use deepfake software to impersonate others on live video streams. Software that allows someone to do this is now readily available, free of charge, and requires relatively little technical skill to use. And as the Binance story also shows, this opens up the possibility of all kinds of fraud and political disinformation.

“I’m amazed at how fast live deepfakes have come and how good they are,” says Hany Farid, a computer scientist at the University of California, Berkeley, and an expert in video analysis and authentication. He says there are at least three different open-source programs that allow people to create live deepfakes.

Farid is among those who fear live deepfakes could fuel fraud. “This is going to be like phishing scams on steroids,” he says.

The “pencil test” and other tricks to catch an AI impostor

Fortunately, experts say there are still a number of techniques a person can use to gain reasonable assurance that they’re not communicating with a deepfake impersonation. One of the most reliable is simply to ask the person to turn so that the camera captures them in full profile. Deepfakes struggle with profile views for several reasons. For most people, there aren’t enough profile images available to train a deepfake model to reliably reproduce that angle. And while there are ways to use computer software to estimate a profile view from a frontal image, doing so adds complexity to the deepfake creation process.

Deepfake software also uses “anchor points” on a person’s face to correctly place the deepfake “mask” on top of it. Rotating 90 degrees removes half of the anchor points, which often causes the software to warp, blur, or distort the profile image in weird ways that are very noticeable.
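
To make that intuition concrete, here is a minimal sketch (not anything Farid or Mirsky have published) of how the warping might be caught programmatically: when a deepfake mask loses its anchor points mid-turn, the face region tends to smear, which shows up as a drop in image sharpness. The face rectangles are assumed to come from any off-the-shelf face tracker, and the threshold is an illustrative guess, not a vetted detector:

```python
import cv2

def face_sharpness(frame_bgr, face_box):
    """Variance of the Laplacian over the face crop: a standard blur proxy."""
    x, y, w, h = face_box
    gray = cv2.cvtColor(frame_bgr[y:y + h, x:x + w], cv2.COLOR_BGR2GRAY)
    return cv2.Laplacian(gray, cv2.CV_64F).var()

def sharpness_drops_during_turn(frames, face_boxes, drop_ratio=0.4):
    """Flag a clip if face sharpness mid-turn falls far below the frontal
    baseline. frames/face_boxes are per-frame images and (x, y, w, h)
    rectangles from any face tracker; drop_ratio is an example value."""
    scores = [face_sharpness(f, b) for f, b in zip(frames, face_boxes)]
    baseline = max(scores[:5])  # assume the clip starts with a frontal view
    return min(scores) < drop_ratio * baseline
```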

Yisroel Mirsky, a researcher who heads the offensive artificial intelligence lab at Israel’s Ben-Gurion University, has experimented with a number of other methods for detecting live deepfakes, which he likens to the CAPTCHA system many websites use to detect bots (you know, the one that asks you to pick out all the traffic lights in a photo divided into squares). His techniques include asking people on a video call to pick up a random object and move it across their face, to bounce an object, to lift and fold their shirt, to stroke their hair, or to mask part of their face with a hand. In each case, either the deepfake will fail to render the object passed in front of the face, or the method will seriously distort the facial image. For audio deepfakes, Mirsky suggests asking the person to whistle, to try to speak with an unusual accent, or to hum or sing a random tune.
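
A toy version of this deepfake-CAPTCHA idea could look like the sketch below: draw a challenge the attacker cannot predict in advance and leave the judging to the humans on the call. The challenge lists paraphrase Mirsky’s examples; everything else is an assumption for illustration:

```python
import secrets

VIDEO_CHALLENGES = [
    "Pick up a nearby object and move it across your face.",
    "Bounce an object in view of the camera.",
    "Lift and fold your shirt collar.",
    "Stroke your hair.",
    "Cover part of your face with your hand.",
]
AUDIO_CHALLENGES = [
    "Whistle a few notes.",
    "Repeat this sentence in an unusual accent.",
    "Hum or sing a random tune.",
]

def issue_challenge(kind="video"):
    """Return one unpredictable challenge to read out to the caller."""
    pool = VIDEO_CHALLENGES if kind == "video" else AUDIO_CHALLENGES
    return secrets.choice(pool)  # cryptographically random, unlike random.choice
```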

[Image: Screenshots of Yisroel Mirsky, who heads the offensive artificial intelligence lab at Israel’s Ben-Gurion University, demonstrating a series of simple methods for detecting live deepfakes. Image courtesy of Yisroel Mirsky.]

“All existing deepfake technologies today follow a very similar protocol,” says Mirsky. “They’re trained on lots and lots of data and that data has to have a particular pattern that you’re teaching the model.” Most AI software is taught to reliably mimic a person’s face viewed from the front and cannot handle oblique angles or objects that occlude the face well.

Farid, meanwhile, has shown that another way to detect potential deepfakes is to use a simple software program that makes the other person’s computer screen flicker in a certain pattern or projects a pattern of light onto the face of the person using the computer. Either the deepfake will fail to transfer the lighting effect to the spoofed image, or it will be too slow to do so. Similar detection might be possible simply by asking someone to use another light source, such as a smartphone flashlight, to illuminate their face from a different angle, Farid says.
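
Here is a rough sketch of how such a flicker check could be automated, assuming you can capture the remote video frames and a face rectangle. The verifier emits a pseudo-random on/off pattern and checks whether the face’s brightness tracks it; the correlation threshold is an arbitrary example value, not something Farid has published:

```python
import numpy as np

def emitted_pattern(n_frames, seed=0):
    """Pseudo-random on/off flicker the verifier shows on their own screen."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, 2, size=n_frames).astype(float)

def face_brightness(frames_gray, face_box):
    """Mean brightness of the face crop in each received frame."""
    x, y, w, h = face_box
    return np.array([f[y:y + h, x:x + w].mean() for f in frames_gray])

def lighting_matches(pattern, brightness, threshold=0.5):
    """A real face should reflect the flicker almost instantly; a live
    deepfake tends to miss it or lag. Uses normalized cross-correlation."""
    b = (brightness - brightness.mean()) / (brightness.std() + 1e-9)
    p = pattern - pattern.mean()
    corr = float(np.dot(p, b) / (np.linalg.norm(p) * np.linalg.norm(b) + 1e-9))
    return corr > threshold
```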

To realistically impersonate someone doing something unusual, Mirsky says, the AI software must have seen thousands of examples of people doing just that. But collecting a dataset like that is hard. And even if you could train the AI to reliably impersonate someone doing one of these challenging tasks, like picking up a pencil and waving it in front of their face, the deepfake is likely to fail if you ask the person to use a very different type of object, such as a cup. And attackers using deepfakes are unlikely to have been able to train a deepfake to pass multiple challenges, such as the pencil test and the profile test. Each different task, Mirsky says, increases the complexity of the training the AI requires. “You’re limited in what you want deepfake software to improve on,” he says.

Deepfakes are getting better all the time

For now, few security experts suggest that people will need to use these CAPTCHA-like challenges for every Zoom meeting they take. But Mirsky and Farid say it would be wise to use them in high-risk situations, such as a call between political leaders or a meeting that could result in a high-value financial transaction. And both Farid and Mirsky urged people to be on the lookout for other potential red flags, such as audio calls from unknown numbers or people behaving strangely or making unusual requests (would President Biden really want you to buy him a bunch of Apple gift cards?).

Farid says that for very important calls, people could use a simple form of two-factor authentication: send a text message to a mobile phone number you already know belongs to the person, asking whether they’re on a video call with you at that moment.
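
As an illustration, here is a minimal sketch of that out-of-band check using Twilio as one example SMS provider. The environment variable names and the trusted number are assumptions, and any messaging channel you already trust would do just as well:

```python
import os
from twilio.rest import Client  # pip install twilio

def send_verification_text(trusted_number: str, caller_name: str) -> str:
    """Text a number you already trust for this person and ask them to
    confirm the call. A human reply closes the loop; this just sends."""
    client = Client(os.environ["TWILIO_ACCOUNT_SID"],
                    os.environ["TWILIO_AUTH_TOKEN"])
    msg = client.messages.create(
        to=trusted_number,
        from_=os.environ["TWILIO_FROM_NUMBER"],
        body=f"Is it really you on a video call with me right now, "
             f"{caller_name}? Reply YES to confirm.",
    )
    return msg.sid  # provider-side message identifier
```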

The researchers also stressed that deepfakes are getting better all the time, and there’s no guarantee that evading any particular challenge, or even a combination of them, won’t become much easier in the future.

That’s also why many researchers are trying to approach the live-deepfake problem from the opposite direction: creating some sort of digital signature or watermark that proves a video call is authentic, rather than trying to detect a deepfake.

One group that could work on a protocol for verifying live video calls is the Coalition for Content Provenance and Authenticity (C2PA), a foundation dedicated to digital media authentication standards that is backed by companies including Microsoft, Adobe, Sony, and Twitter. “I think C2PA should pick this up because they have created specifications for recorded video and extending it for live video is a natural thing to do,” says Farid. But Farid admits that authenticating data transmitted in real time is not an easy technological challenge. “I don’t immediately see how to do it, but it will be interesting to think about,” he says.
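
To give a flavor of what such a protocol might involve (this is not C2PA’s design, which, as Farid notes, doesn’t yet exist for live video), here is a minimal sketch in which the sender signs a hash of each frame with an Ed25519 key and the receiver verifies it against a public key they already trust:

```python
import hashlib
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey, Ed25519PublicKey)
from cryptography.exceptions import InvalidSignature

def sign_frame(private_key: Ed25519PrivateKey, frame_bytes: bytes,
               frame_index: int) -> bytes:
    """Sign a digest bound to the frame's position to block replay/reordering."""
    digest = hashlib.sha256(frame_index.to_bytes(8, "big") + frame_bytes).digest()
    return private_key.sign(digest)

def verify_frame(public_key: Ed25519PublicKey, frame_bytes: bytes,
                 frame_index: int, signature: bytes) -> bool:
    """Check the signature with a public key obtained over a trusted channel."""
    digest = hashlib.sha256(frame_index.to_bytes(8, "big") + frame_bytes).digest()
    try:
        public_key.verify(signature, digest)
        return True
    except InvalidSignature:
        return False

# Usage sketch: key = Ed25519PrivateKey.generate(); distribute key.public_key()
# out of band, then sign and verify each frame as it streams.
```

A real system would also have to keep signing fast enough for 30-plus frames per second and survive the compression that video-call platforms apply, which is part of why Farid calls this a hard technological challenge.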

In the meantime, remind guests on your next Zoom call to bring a pencil to the meeting.
