A team of researchers recently developed an AI that can transfer your real-time facial expressions, eye movements, and poses to a portrait – making it appear as though the person in the image is actually talking and moving. It’s disturbingly convincing, and it keeps improving.
The white paper for the project, called HeadOn, describes it as the “first real-time source-to-target reenactment approach for complete human portrait videos that enables transfer of torso and head motion, face expression, and eye gaze.”
According to the researchers, there’s nothing else quite like this out there. It combines several technologies – most of which were either pioneered or perfected by this very research team.
To solve the eye gaze problem, the team previously developed FaceVR.
The team’s original work on Face2Face provided the framework for much of HeadOn’s capability; what’s new is the transfer of torso and head motion alongside the original’s facial expression transfer.
It’s incredibly creepy to see it in action; you can almost instantly imagine someone doing something awful with this. If hackers can wreak havoc on a person’s life with little more than access to their Twitter account, imagine what an identity thief could do with something that gives them the ability to appear as you in a video call. Yikes!
But any technology can be perverted for evil, and as long as the developers make the output detectable in some way, it’ll at least be possible to protect against the AI’s misuse. And, it bears mention, the positive applications for this AI are numerous. As the researchers put it:
Even though current facial reenactment results are impressive, they are still fundamentally limited in the type of manipulations they enable. For instance, these approaches are only able to modify facial expressions, whereas the rigid pose of the head, including its orientation, remains unchanged and does not follow the input video. Thus, only subtle changes, such as opening the mouth or adding wrinkles on the forehead are realized.
If you’ve ever chatted with someone using Animoji or Bitmoji, you’ve probably noticed how unnatural it is to see something familiar – at least for those of us who grew up watching cartoons – talking without moving its head and neck along with certain facial expressions. When someone frowns, they usually dip their head and slump their shoulders, for example. These kinds of subtle movements are part of our body language; without them, a talking head seems weird. HeadOn fixes this jarring issue and produces a far more natural result by bringing several advanced neural networks together.
It’s certainly not perfect yet; at HD resolutions there are enough artifacts for all but the least-discerning viewers to realize the image has been manipulated. And, according to the researchers, the AI doesn’t quite know how to process long hair without producing glitchy output. But these kinds of minor issues aren’t likely to hold back future development.
Based on what we’ve seen as this project has come together over the years, AI like HeadOn is likely to reach the point where it fools humans 99 percent of the time sooner rather than later.