
Researchers from the Max Planck Institute for Informatics, Princeton University, and Stanford University, in collaboration with Adobe, have created a new artificial intelligence algorithm that allows a recorded speech in a video to be modified simply by editing the transcript.

The algorithm goes a step further by editing the speaker's lip movements and adjusting the pronunciation of the edited words to match the speaker's tone. Currently, it needs a minimum of 40 minutes of the original video to work effectively.

“The application uses the new transcript to extract speech motions from various video pieces and, using machine learning, convert those into a final video that appears natural to the viewer – lip-synched and all.”

– Algorithm’s researchers
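
To make the idea of the quote above concrete, here is a toy, self-contained sketch (not the authors' code or pipeline) of the first step such a system must perform: comparing the original transcript against the edited one to decide which words can be reused from the existing footage and which must be synthesized from other video of the speaker. All word timings and names below are hypothetical.

```python
# Toy illustration: plan a transcript edit by diffing old vs. new word sequences.
# "reuse" words keep their original frames; "synthesize" words would be rebuilt
# from matching mouth shapes elsewhere in the footage and re-rendered.
from difflib import SequenceMatcher

# Hypothetical word-level timings from the original recording: (word, start_s, end_s).
original = [("the", 0.0, 0.2), ("results", 0.2, 0.7), ("were", 0.7, 0.9),
            ("very", 0.9, 1.2), ("positive", 1.2, 1.8)]
edited_words = ["the", "results", "were", "quite", "negative"]

original_words = [w for w, _, _ in original]
matcher = SequenceMatcher(a=original_words, b=edited_words)

plan = []
for op, i1, i2, j1, j2 in matcher.get_opcodes():
    if op == "equal":
        # Unchanged words: their video frames can be kept as-is.
        for word, start, end in original[i1:i2]:
            plan.append(("reuse", word, start, end))
    else:
        # Inserted or replaced words: a real system would search the rest of
        # the footage for similar speech motions and render new frames.
        for word in edited_words[j1:j2]:
            plan.append(("synthesize", word, None, None))

for action, word, start, end in plan:
    print(action, word, start, end)
```

The actual system described by the researchers goes much further, blending the selected speech motions and rendering photorealistic, lip-synced frames with machine learning, but the diff step above captures how an edited transcript drives the process.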

Misinformation and online falsehoods

The algorithm was originally developed to assist with movie post-production, which currently requires tedious manual editing work. The researchers, however, acknowledge concerns about the potential for misuse.

“We acknowledge that bad actors might use such technologies to falsify personal statements and slander prominent individuals. We are concerned about such deception and misuse.”

– Algorithm’s researchers

The researchers suggest that to provide safeguards against misuse, the research community can further develop “forensics, fingerprinting and verification techniques to identify manipulated video”.

They added that “robust public conversation is necessary to create a set of appropriate regulations and laws that would balance the risks of misuse of these tools against the importance of creative, consensual use cases.”

See the algorithm in action below:

Here’s the research paper (PDF):
arXiv:1906.01524