
A model that can create realistic animations of talking faces

 


In recent years, computer-generated animations of animals and humans have become increasingly detailed and realistic. Nonetheless, producing convincing animations of a character's face as it's talking remains a key challenge, as it typically entails the successful combination of a series of different audio and video elements.

A team of computer scientists at TCS Research in India has recently created a new model that can produce highly realistic talking face animations that integrate audio recordings with a character's head motions. This model, introduced in a paper presented at ICVGIP 2021, the twelfth Indian Conference on Computer Vision, Graphics and Image Processing, could be used to create more convincing virtual avatars, digital assistants, and animated movies.

"For a pleasant viewing experience, the perception of realism is of utmost importance, and despite recent research advances, the generation of a realistic talking face remains a challenging research problem," Brojeshwar Bhowmick, one of the researchers who carried out the study, told TechXplore. "Alongside accurate lip synchronization, realistic talking face animation requires other attributes of realism such as natural eye blinks, head motions and preserving identity information of arbitrary target faces."

Most existing speech-driven methods for generating face animations focus on ensuring a good synchronization between lip movements and recorded speech, preserving a character's identity and ensuring that it occasionally blinks its eyes. A few of these methods also tried to generate convincing head movements, primarily by emulating those performed by human speakers in a short training video.

"These methods derive the head's motion from the driving video, which can be uncorrelated with the current speech content and hence appear unrealistic for the animation of long speeches," Bhowmick said. "In general, head motion is largely dependent upon the prosodic information of the speech at a current time window."

Past studies have found that there is a strong correlation between the head movements performed by human speakers and both the pitch and amplitude of their voice. These findings inspired Bhowmick and his colleagues to create a new method that can produce head motions for face animations that reflect a character's voice and what he/she is saying.
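
As a rough illustration of the kind of relationship these studies describe, the Python sketch below computes a frame-level loudness track (RMS amplitude) from an audio signal and measures its Pearson correlation with a head-rotation track. The function names, the frame length and the per-frame head angles are assumptions made for this example, and the data is synthetic; none of it is taken from the researchers' paper.

```python
import math


def frame_rms(signal, frame_len):
    """RMS amplitude of consecutive, non-overlapping frames.

    A trailing partial frame is dropped.
    """
    starts = range(0, len(signal) - frame_len + 1, frame_len)
    return [
        math.sqrt(sum(x * x for x in signal[s:s + frame_len]) / frame_len)
        for s in starts
    ]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Returns 0.0 when either sequence has zero variance.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


if __name__ == "__main__":
    # Synthetic audio: four 100-sample frames of a sine wave at varying loudness.
    loudness = [0.1, 0.4, 0.2, 0.8]
    audio = [a * math.sin(2 * math.pi * k / 20) for a in loudness for k in range(100)]
    # Hypothetical head-nod angle (degrees) per frame, made up for the demo.
    head_angle = [1.0, 4.2, 1.9, 8.1]
    print(pearson(frame_rms(audio, 100), head_angle))
```

A correlation close to 1 on real recordings would suggest that louder speech tends to coincide with larger head movements. Pitch could be handled the same way by swapping the RMS track for a per-frame pitch estimate.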

In one of their previous papers, the researchers presented a generative adversarial network (GAN)-based architecture that could generate convincing animations of talking faces. While this technique was promising, it could only produce animations in which the speakers' heads did not move.

"We now developed a complete speech-driven realistic facial animation pipeline that generates talking face videos with accurate lip-sync, natural eye-blinks and realistic head motion, by devising a hierarchical approach for disentangled learning of motion and texture," Bhowmick said. "We learn speech-induced motion on facial landmarks, and use the landmarks to generate the texture of the animation video frames."

The new generative model created by Bhowmick and his colleagues can effectively generate speech-driven and realistic head movements for animated talking faces, which are strongly correlated with a speaker's vocal characteristics and what he/she is saying. Just like the technique they created in the past, this new model is based on GANs, a class of machine learning algorithms that has been found to be highly promising for generating artificial content.

The model can identify what a speaker is talking about and the prosody of their voice during specific time windows. It then uses this information to produce matching, correlated head movements.

"Our method is fundamentally different from state-of-the-art methods that focus on generating person-specific talking style from the target subject's sample driving video," Bhowmick said. "Given that the relationship between the audio and head motion is not unique, our attention mechanism tries to learn the importance of local audio features to the local head motion keeping the prediction smooth over time, without requiring any input driving video at test time. We also use meta-learning for texture generation, as it helps to quickly adapt to unknown faces using very few images at test time."
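
To make the idea of weighting local audio features more concrete, here is a minimal Python sketch of windowed (local) attention. For each output frame, it applies a softmax to relevance scores within a small window of neighbouring audio frames and returns the weighted average of their features. This is a generic illustration, not the researchers' architecture: the scalar features, the `scores` input (which a trained network would learn) and the `radius` parameter are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def local_attention(features, scores, radius):
    """Weighted average of features within `radius` frames of each frame.

    `features` and `scores` hold one scalar per audio frame. Weights are a
    softmax over the scores inside each window, so they sum to 1.
    """
    if len(features) != len(scores):
        raise ValueError("features and scores must have the same length")
    outputs = []
    for t in range(len(features)):
        lo = max(0, t - radius)
        hi = min(len(features), t + radius + 1)
        weights = softmax(scores[lo:hi])
        outputs.append(sum(w * f for w, f in zip(weights, features[lo:hi])))
    return outputs


if __name__ == "__main__":
    audio_feature = [0.0, 3.0, 6.0, 2.0, 1.0]   # toy per-frame audio feature
    relevance = [0.0, 0.0, 2.0, 0.0, 0.0]       # hypothetical learned scores
    print(local_attention(audio_feature, relevance, radius=1))
```

Because neighbouring output frames share most of their window, the result changes gradually from frame to frame, which is one simple way to keep a prediction smooth over time. With equal scores, the function reduces to a plain moving average.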

Bhowmick and his colleagues evaluated their model on a series of benchmark datasets, comparing its performance to that of state-of-the-art techniques developed in the past. They found that it could generate highly convincing animations with excellent lip synchronization, natural eye blinks, and speech-coherent head motions.

"Our work is a step further towards achieving realistic talking face animations that can translate into multiple real-world applications, such as digital assistants, dubbing or telepresence," Bhowmick added. "In our next studies, we plan to integrate realistic facial expressions and emotions alongside lip sync, eye blinks and speech-coherent head motion."
