Turn Photos into Videos with D-ID's Speaking Portrait

September 23, 2021

Revolutionizing Visual Media: D-ID's Speaking Portraits

D-ID, the technology firm behind the photo animation in the MyHeritage app, which became known for bringing old photographs to life, has unveiled a new iteration of its technology. The latest release turns static images into lifelike videos that can deliver any message the user supplies.

Distinguishing Speaking Portraits from Deepfakes

D-ID’s Speaking Portraits may look similar to the widely discussed “deepfakes,” but the product uses fundamentally different technology. Notably, its basic functionality requires no prior training data.

A Shift in Focus and Diverse Applications

D-ID first appeared at TechCrunch Battlefield in 2018 with a focus on facial recognition obfuscation. The company has since shifted course, debuting its Speaking Portraits product at TechCrunch Disrupt 2021.

Demonstrations showcased a variety of potential applications, including the creation of a multilingual television presenter capable of conveying diverse emotions. Further uses include developing virtual chatbot personalities for customer service, designing training modules for professional advancement, and establishing interactive, conversational video advertising kiosks.

From Facial Recognition to Animated Portraits

The current product line and D-ID’s collaboration with MyHeritage, which propelled the latter’s app to the top of Apple’s App Store rankings, mark a significant departure from the company’s original objectives. As recently as May 2020, D-ID was still raising funding on the strength of its initial technology.

However, the partnership with MyHeritage debuted in February, followed by a similar agreement with GoodTrust and a prominent collaboration with Warner Bros. on the Hugh Jackman film “Reminiscence,” allowing fans to integrate themselves into the film’s trailer.

Market Demand Drives Innovation

D-ID’s strategic shift may appear drastic, but CEO and co-founder Gil Perry explained that the decision came from recognizing a substantial market opportunity, with clear potential for growth, in this application area.

D-ID’s success with major clients like Warner Bros., alongside the App Store dominance achieved by MyHeritage, appears to validate this assessment. Speaking Portraits caters to both large corporations and individual users, generating full HD video from a single image combined with either pre-recorded audio or typed text.

Multilingual Support and Future Expansion

D-ID is launching the product with initial support for English, Spanish, and Japanese, with plans to incorporate additional languages based on customer demand.

Portrait Options: Single Image vs. Trained Character

D-ID offers two primary Speaking Portrait categories. The “Single Portrait” option works from a single still image, animating the head while leaving the image’s original background static.

For enhanced realism, the “Trained Character” option requires a 10-minute training video of the subject, adhering to the company’s specified guidelines. This allows for use with a customizable, interchangeable background and includes preset animation choices for the character’s body and hands.

Demonstrating Realism

To illustrate the level of realism achievable, D-ID presented an example of a Speaking Portrait newscaster generated using the trained character method.

The demonstration showcased at Disrupt involved a childhood photograph of Perry himself. The image was synchronized with facial expressions performed by a live actor, who also provided the voiceover for the Speaking Portrait version of Gil. An accompanying video shows the animated photo mirroring the actor’s expressions.

Ethical Considerations and Responsible Use

The capacity to generate photo-realistic videos from a single image, capable of convincingly delivering any script, raises significant ethical concerns. Extensive discussions have already taken place regarding the ethics of deepfakes, alongside industry initiatives to identify and fingerprint AI-generated content.

Commitment to Transparency and Consent

Perry stated at Disrupt that D-ID is “dedicated to ensuring responsible use” of its technology. To this end, the company will release a pledge at the end of October, in collaboration with partners, outlining its commitment to “transparency and consent.”

The goal of this commitment is to guarantee that “viewers are not misled and that individuals involved provide their consent.”

A Collaborative Approach to Preventing Misuse

While D-ID aims to establish safeguards within its terms of use and public statements, Perry emphasized that it “cannot address this challenge independently.” He is therefore urging others within the industry to collaborate in preventing misuse of the technology.
