“Please don’t make people believe that hyperrealism can be done cheaply. That just isn’t true today,” says Vincent Haeffner, production manager at Effigy, a studio specializing in digital creation and human modeling.
Yet that is exactly what Mark Zuckerberg tried to sell to the general public on October 11. During the Meta Connect conference, the head of Facebook appeared in the form of an ultra-realistic avatar. The wow effect was guaranteed: his virtual features, facial expressions and skin texture looked more real than life.
The Codec Avatars technology (now in version 2.0) is also one of Facebook’s great challenges in its bid to immerse us in its universe, or rather its metaverse. Better still, the group hopes to do it “inexpensively”, by using a smartphone to scan your face in a few seconds.
A huge amount of data
Presented in 2019, the Codec Avatars project aims to break down the barriers of virtual reality and reproduce the authenticity of an in-person exchange between two people. The idea is that the avatar should no longer be distinguishable from the person, thanks to extremely faithful modeling. But for that, a visit to a motion capture studio is a must: heavy technology for ordinary mortals.
In its American laboratory in Pittsburgh, Pennsylvania, Facebook has installed two such capture rigs. In total, hundreds of cameras each record data at a rate of around 1 GB per second. In 2019, the company said a capture session took around 15 minutes. To give a sense of the scale of such an operation: a computer equipped with a 512 GB hard drive would be saturated within about three seconds.
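As a rough sanity check on those figures (assuming, purely for illustration, about 170 cameras each writing roughly 1 GB per second, numbers not confirmed by Facebook), the saturation time works out like this:

```python
# Back-of-the-envelope estimate: how quickly a 512 GB drive fills up.
# The camera count and per-camera rate are illustrative assumptions,
# not figures confirmed by Facebook.
num_cameras = 170            # "hundreds of cameras" (assumed value)
rate_per_camera_gb_s = 1.0   # ~1 GB per second per camera (assumed)
drive_capacity_gb = 512      # the hard drive used for comparison

total_rate_gb_s = num_cameras * rate_per_camera_gb_s
seconds_to_fill = drive_capacity_gb / total_rate_gb_s
print(f"Drive saturated in about {seconds_to_fill:.1f} seconds")
# -> roughly 3 seconds with these assumptions
```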
This huge amount of data is then processed by photogrammetry. This technique makes it possible to determine the dimensions and volumes of an object (here, a face) from measurements taken on photographs showing the object from several perspectives.
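At its core, photogrammetry recovers 3D positions by triangulating points seen from several calibrated cameras. Below is a minimal sketch of that principle using OpenCV’s triangulation routine; the projection matrices and pixel coordinates are invented for illustration, and Facebook’s real pipeline is far more sophisticated.

```python
import numpy as np
import cv2

# Minimal illustration of the triangulation step behind photogrammetry:
# recover a 3D point from its 2D projections in two calibrated cameras.
# The camera matrices and pixel coordinates are made-up examples
# (identity intrinsics for simplicity).

# 3x4 projection matrices of two cameras
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])               # camera 1 at the origin
P2 = np.hstack([np.eye(3), np.array([[-0.1], [0.0], [0.0]])])  # camera 2 shifted 10 cm

# The same facial landmark observed in each image (pixel coordinates, shape 2xN)
pts1 = np.array([[320.0], [240.0]])
pts2 = np.array([[300.0], [240.0]])

# Triangulate into homogeneous 3D coordinates, then normalize
point_h = cv2.triangulatePoints(P1, P2, pts1, pts2)
point_3d = (point_h[:3] / point_h[3]).ravel()
print("Estimated 3D position:", point_3d)
```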
10 months to create Aya Nakamura’s avatar
“There are several stages,” explains Vincent Haeffner. “First you have to create the volume of the face. Then you need to set up a whole animation system: the skeleton, then the oral cavity, the mouth, the tongue. Once the model is ready to be animated, it’s almost easy.”
Today, the process remains artisanal, confirms Louis de Castro, head of Mado XR. Specializing in digital creation, this young French company worked on the Aya Nakamura show in Fortnite at the beginning of October; it was responsible for broadcasting the video on the game’s giant screens during the Franco-Malian singer’s interactive show.
One more proof that creating a realistic 3D avatar is no quick task. Mark Zuckerberg is well aware of the obstacles his technology can represent, which is why he introduced Instant Codec Avatars.
This is a “degraded” version of the technology: no capture studio is needed, yet the result is still striking, as shown in the presentation video.
In theory, a smartphone would suffice, provided there is enough light. During the demonstration, however, the company was careful to use an iPhone, because the Apple device has a lidar sensor that helps considerably in capturing faces correctly.
For two minutes, the person films themselves with a neutral face and then makes expressions. This video is then sent to Facebook’s servers, which cut it into individual images. The data is processed by a dedicated computer or server, and a few hours later the person has an avatar ready to be animated. Unsurprisingly, Facebook is trying to reduce this processing time so that users get their avatar back faster after submitting their video.
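The first server-side step described here, splitting the video into still images, is conceptually simple. A minimal sketch of that step follows; the file name and sampling interval are assumptions, since Facebook’s actual pipeline is not public.

```python
import cv2

# Illustrative sketch: split a short face-capture video into still frames,
# the first processing step described in the article. The file name and
# sampling interval are assumptions, not details of Facebook's pipeline.
capture = cv2.VideoCapture("face_scan.mp4")
frame_index = 0
saved = 0

while True:
    ok, frame = capture.read()
    if not ok:
        break  # end of video
    if frame_index % 10 == 0:          # keep roughly every 10th frame
        cv2.imwrite(f"frame_{saved:05d}.png", frame)
        saved += 1
    frame_index += 1

capture.release()
print(f"Extracted {saved} frames")
```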
But aside from this degraded solution, which might well be enough for the metaverse, Mark Zuckerberg’s ultra-realistic version poses a problem. It is not so much the modeling: since this summer, Epic Games has offered free tools to design your own avatar from photos. The result is certainly less detailed than Codec Avatars 2.0, but more realistic than Instant Codec Avatars.
Google is working on it too
The real difficulty for Meta comes from the volume of data to be processed: it is too large to expect real-time avatar animation in the metaverse, especially since the head of Facebook plays with light reflections on his face, an effect that requires a great deal of computation.
Google will also have to confront this limitation, linked to the amount of data to be managed, with its Project Starline. The company is working on a screen that films a person and displays them to their interlocutor. The promise is to trick the human eye in order to simulate a real face-to-face conversation, but at a distance.
To achieve this, more than a dozen cameras and sensors film and track the person. The feat lies in compressing this data on the fly so that it can be sent instantly to the caller’s screen. A 3D display is also needed to simulate the depth of a person sitting in front of you.
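The compress-then-send idea can be illustrated with a deliberately simplified sketch: grab one frame from a webcam, compress it, and compare sizes. Starline’s real multi-camera codec and transport are not publicly documented, so everything below is illustrative.

```python
import cv2

# Deliberately simplified illustration of the "compress then send" idea:
# grab one frame from a camera, compress it, and note the size reduction.
# Project Starline's real pipeline (multi-camera, depth, custom codec) is
# far more complex and not publicly documented.
camera = cv2.VideoCapture(0)        # default webcam (assumed to be available)
grabbed, frame = camera.read()
camera.release()

if grabbed:
    raw_bytes = frame.nbytes
    success, encoded = cv2.imencode(".jpg", frame, [cv2.IMWRITE_JPEG_QUALITY, 80])
    if success:
        print(f"Raw frame: {raw_bytes} bytes, compressed: {encoded.nbytes} bytes")
        # 'encoded' is what would then be sent over the network to the other screen.
```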
While the rendering is immersive, it is still not perfect, notes The Verge. An early access program has been set up to equip over 100 companies with this tool, which Andrew Nartker, director of product management on Project Starline, affectionately refers to as the “magic window”.
Between financial and technological barriers, these technologies do not yet seem within reach of the general public. Yet virtual social interactions need more realism to overcome the sense of isolation sometimes felt there. For now, we will have to settle for avatars straight out of a cartoon… and with no legs.
Source: BFM TV
