2025/04/19 11:14:34

IT Home reported on October 2 that Meta recently released an artificial intelligence system that can generate short videos from text prompts.

IT Home learned that the system, called Make-A-Video, lets users enter a series of words, such as "a dog wearing a superhero costume and a red cape flying in the sky", and then generates a five-second short video. Although the results are quite rough, the system is clearly a step beyond text-to-image AI systems.

Last month, AI lab OpenAI made its latest text-to-image AI system, DALL-E, available to everyone, while AI startup Stability.AI launched Stable Diffusion, an open-source text-to-image system.

But text-to-video AI systems come with some bigger challenges. First, these models require a lot of computing power. They are more computationally intensive than large text-to-image AI models, which are trained on millions of images, because piecing together even a short video requires hundreds of images. This means that only large tech companies will be able to build these systems for the foreseeable future. They are also tricky to train because there are no large-scale datasets of high-quality videos paired with text.

To solve this problem, Meta combined data from three open-source image and video datasets to train its model. Standard text-to-image datasets of labeled still images helped the AI learn what objects are called and what they look like. A database of videos helped it learn how those objects should move in the world. The combination of the two approaches helped Make-A-Video generate videos from text at scale.

Meta says the technology can "bring new opportunities for creators and artists." However, as the technology evolves, there are concerns that it could become a powerful tool for creating and spreading misinformation and deepfakes, making it harder to tell real content from fake content online.

The researchers who created Make-A-Video filtered out offensive images and text, but with a dataset made up of millions and millions of words and images, it is nearly impossible to completely remove biased and harmful content.

A Meta spokesperson said that the model is not being made available to the public at this time, and that "as part of this research, we will continue to explore ways to further improve and reduce potential risks."