
Xiao Xiao from Aofeisi
QbitAI | Official account QbitAI

Getting a little 3D animated character to perform a silky-smooth set of actions: how long would that take to produce by hand?


Now hand it over to AI, and a few sentences of input are all it takes (each color represents a different movement):

Look at the ground and grab the golf club, swing the club, trot for a while, and squat down.


Previously, AI-driven 3D human models could only perform one action or follow one instruction at a time; executing a string of instructions back to back was difficult.


Now, with no editing or splicing required, you simply enter a few commands in order and the 3D character completes each set of actions automatically, smoothly and without glitches.

The new AI is called TEACH, and it comes from the Max Planck Institute for Intelligent Systems and Université Gustave Eiffel.


Netizens already have plenty of ideas:

Could you drive this entirely from a script?


Clearly, the gaming and simulation industries could put it to use.


So how was this 3D character animation tool built?

Using an encoder to "remember" the previous action

TEACH builds on TEMOS, a 3D human motion generation framework the same team proposed shortly before.

TEMOS is built on a Transformer architecture and is trained on real human motion data.

During training it uses two encoders, a motion encoder (Motion Encoder) and a text encoder (Text Encoder), whose outputs feed the same motion decoder (Motion Decoder).

At inference time, however, the motion encoder is "thrown away" and only the text encoder is kept, so the model can take text directly as input and output the corresponding motion.
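
To make the training/inference split concrete, here is a minimal PyTorch-style sketch of such a dual-encoder setup. All module names and dimensions are illustrative assumptions, not the authors' code; the real TEMOS is a Transformer-based variational model with additional losses.

```python
import torch
import torch.nn as nn

class TemosLikeModel(nn.Module):
    """Minimal sketch of a TEMOS-style dual-encoder model (illustrative only).

    Both encoders map into a shared latent space; the motion decoder
    reconstructs a motion from a latent. At inference the motion encoder
    is simply never called.
    """
    def __init__(self, latent_dim=256, motion_dim=135, vocab_dim=300):
        super().__init__()
        # Hypothetical MLP stand-ins; TEMOS actually uses Transformer (VAE)
        # blocks and a pretrained language model for the text branch.
        self.motion_encoder = nn.Sequential(
            nn.Linear(motion_dim, latent_dim), nn.ReLU(), nn.Linear(latent_dim, latent_dim))
        self.text_encoder = nn.Sequential(
            nn.Linear(vocab_dim, latent_dim), nn.ReLU(), nn.Linear(latent_dim, latent_dim))
        self.motion_decoder = nn.Sequential(
            nn.Linear(latent_dim, latent_dim), nn.ReLU(), nn.Linear(latent_dim, motion_dim))

    def training_step(self, motion, text_emb):
        # Encode the same sample through both branches.
        z_motion = self.motion_encoder(motion)
        z_text = self.text_encoder(text_emb)
        # Reconstruct the motion, and pull the text latent toward the motion
        # latent so the text branch alone suffices at inference time.
        recon = self.motion_decoder(z_motion)
        return (nn.functional.mse_loss(recon, motion)
                + nn.functional.mse_loss(z_text, z_motion.detach()))

    @torch.no_grad()
    def generate(self, text_emb):
        # Inference path: the motion encoder is "thrown away".
        return self.motion_decoder(self.text_encoder(text_emb))
```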


Unlike other AIs that map a single text to one deterministic action, TEMOS can generate a variety of different human motions from a single text.

For example, single instructions such as "a person walks in a circle" or "walks a few steps and then stops, standing" can each produce several different ways of moving:


△ The turning style and walking pace differ between samples
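
Such diversity falls out naturally if the shared latent space is probabilistic: sampling a different latent for the same text yields a different motion. A hedged sketch, reusing the hypothetical TemosLikeModel above and assuming (for simplicity) Gaussian noise around the text latent rather than TEMOS's proper variational posterior:

```python
import torch

def sample_motions(model, text_emb, n_samples=3, noise_scale=0.5):
    # Draw several distinct motions for one text. The noise model here is an
    # illustrative assumption; TEMOS predicts a variational distribution.
    motions = []
    with torch.no_grad():
        z_text = model.text_encoder(text_emb)
        for _ in range(n_samples):
            z = z_text + noise_scale * torch.randn_like(z_text)
            motions.append(model.motion_decoder(z))
    return motions
```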

TEACH's architecture follows the TEMOS design, and its motion encoder is carried over from TEMOS directly.

But TEACH redesigns the text side: it adds an encoder called the Past Encoder, which supplies the context of the previous action when each new action is generated, improving coherence between actions.


For the first action in a series of instructions, the Past Encoder is disabled; after all, there is no previous action to condition on.
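
In code, this conditioning might look like the sketch below: the last few frames of the previous motion are encoded and fused with the text latent before decoding. The class name, the fusion-by-addition, and the single linear layers are my assumptions for illustration, not TEACH's actual implementation; the ablation discussed later settles on 5 past frames.

```python
import torch
import torch.nn as nn

class PastConditionedGenerator(nn.Module):
    # Hypothetical sketch of TEACH-style past conditioning (not the real code).
    def __init__(self, latent_dim=256, motion_dim=135, vocab_dim=300):
        super().__init__()
        self.text_encoder = nn.Linear(vocab_dim, latent_dim)
        self.past_encoder = nn.Linear(motion_dim, latent_dim)  # the "Past Encoder"
        self.motion_decoder = nn.Linear(latent_dim, motion_dim)

    def forward(self, text_emb, past_motion=None, past_frames=5):
        z = self.text_encoder(text_emb)
        if past_motion is not None:
            # Condition on the tail of the previous action: encode its last
            # `past_frames` frames and average them into one context vector.
            context = self.past_encoder(past_motion[-past_frames:]).mean(dim=0)
            z = z + context  # fuse text latent with past-motion context
        # For the first action in a sequence past_motion is None, so the
        # Past Encoder is simply skipped.
        return self.motion_decoder(z)
```

Generating a whole sequence is then just a chain of calls: each new instruction is fed in together with the tail of the motion generated so far.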

TEACH is trained on BABEL, a 43-hour motion-capture dataset annotated with transitions, sequence-level action descriptions, and frame-level action labels.


During training, BABEL's motion-capture sequences are cut into many subsets, each containing transitions between actions, so that TEACH learns to produce those transitions in its output.
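
A hedged sketch of how such training pairs might be assembled from labeled segments (the segment format and field names are illustrative assumptions, not BABEL's actual schema):

```python
def make_action_pairs(segments):
    # Turn a time-ordered list of labeled segments into training pairs.
    # Each segment is assumed to look like {"label": "walk", "frames": poses},
    # an illustrative simplification of frame-level annotations.
    pairs = []
    for prev, nxt in zip(segments, segments[1:]):
        pairs.append({
            "text": nxt["label"],            # instruction for the new action
            "past_motion": prev["frames"],   # context: the previous action
            "target_motion": nxt["frames"],  # includes the transition to learn
        })
    return pairs
```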

As for why they did not train on KIT, another common dataset, the authors gave their reasons.

For example, BABEL's verbs are more specific than KIT's, which leans on "fuzzy" words like do/perform.


The researchers then compared TEACH against TEMOS on the quality of continuous-action generation.

Better results than TEMOS

First, a look at TEACH generating a series of actions, one after another without repeating itself:


Then, the researchers compared TEMOS with TEACH.

They trained the TEMOS model in two ways, dubbed Independent and Joint; the difference lies in the data used for training.

Independent trains directly on single actions and, at generation time, stitches the two actions together end to end using alignment and spherical linear interpolation (Slerp); Joint takes the action pair directly as input, with the two language labels joined by a separator.

Slerp is an interpolation operation mainly used to interpolate smoothly between two rotation quaternions, making the transition look more fluid.
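
For reference, a standard quaternion slerp looks like this (a textbook NumPy implementation, not code from the paper):

```python
import numpy as np

def slerp(q0, q1, t):
    # Spherical linear interpolation between unit quaternions q0 and q1,
    # for t in [0, 1]. Standard textbook formula, shown for reference.
    q0, q1 = q0 / np.linalg.norm(q0), q1 / np.linalg.norm(q1)
    dot = np.dot(q0, q1)
    if dot < 0.0:            # take the shorter arc
        q1, dot = -q1, -dot
    if dot > 0.9995:         # nearly parallel: fall back to linear interpolation
        out = q0 + t * (q1 - q0)
        return out / np.linalg.norm(out)
    theta = np.arccos(dot)   # angle between the two quaternions
    s0 = np.sin((1.0 - t) * theta) / np.sin(theta)
    s1 = np.sin(t * theta) / np.sin(theta)
    return s0 * q0 + s1 * q1
```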


Take generating two consecutive movements as an example.

Independent performs worst: the character simply sits down on the spot. Joint does better, but the character never raises its left hand. TEACH does best: after waving the right hand, the character raises the left hand and finally lowers it.


Tests on the BABEL dataset show that TEACH has the lowest generation error, with Independent and Joint trailing behind.


The researchers also measured how many frames of the previous action should be used, and found that conditioning on the last 5 frames produced the best transitions.


About the authors


Nikos Athanasiou is a graduate student at the Max Planck Institute. His research focuses on multimodal AI, and he likes exploring the relationship between human motion and language.


Mathis Petrovich is a PhD student at Université Gustave Eiffel who also works at the Max Planck Institute. His research focuses on generating realistic and diverse human motion from labels or text descriptions.


Michael J. Black is a director of the Max Planck Institute for Intelligent Systems; his papers have been cited more than 62,000 times on Google Scholar.


Gül Varol is an assistant professor at Université Gustave Eiffel, working on computer vision, video representation learning, human motion analysis, and more.

TEACH is open source; interested friends can follow the links below and give it a try~

GitHub Address:
https://github.com/athn-nik/teach

Paper Address:
https://arxiv.org/abs/2209.04066

— End —

QbitAI · Signed author on Toutiao

Follow us for the latest news on cutting-edge technology
