Meta Announces ‘Movie Gen’ AI Model: What Is It, How It Works, and More

Meta has announced its new artificial intelligence model, Movie Gen, for generating video and audio from text prompts. Movie Gen competes with OpenAI’s Sora and can create videos based on user descriptions and generate accompanying audio. The company stated that it can also produce personalized videos from real photos of individuals, placing them in different scenarios. The generated videos can be further enhanced or edited using text input. However, unlike the Llama series of AI models, Movie Gen is unlikely to be released for open use by developers, Reuters reported.

Meta Movie Gen: What is it and how it works

In a research paper describing the new AI model, Meta explained that the Movie Gen model is trained for both text-to-image and text-to-video tasks. When prompted, it generates multiple color images, each of which serves as a frame of the video.

Meta stated that Movie Gen can produce high-definition (1080p) videos of up to 16 seconds at 16 frames per second (FPS). Within these limits, the model can generate videos with variable resolutions, durations and aspect ratios. The company noted that the model learned about the real world by “watching” videos and can reason about object motion, camera movement, subject-object interaction and more.
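To put those figures in perspective, here is a rough back-of-the-envelope sketch of how many frames a maximum-length clip would contain. It assumes only the 16-second and 16 FPS numbers quoted above; the function and variable names are illustrative and do not come from Meta.

```python
# Back-of-the-envelope frame count for a maximum-length clip.
# The two figures come from the article; all names here are illustrative only.
MAX_DURATION_S = 16   # longest clip, in seconds
FRAME_RATE_FPS = 16   # frames per second


def frame_count(duration_s: int, fps: int) -> int:
    """Return the number of frames in a clip of the given length."""
    return duration_s * fps


max_frames = frame_count(MAX_DURATION_S, FRAME_RATE_FPS)
print(max_frames)  # 256
```

In other words, a full-length clip at the stated settings would consist of 256 generated frames.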

For audio generation, Meta said the Movie Gen model can produce corresponding audio using video-to-audio and text-to-audio techniques. The company claims it can generate 48 kHz audio with cinematic sound effects and music synced to the video input. While the model’s video generation is limited to clips of up to 16 seconds, it can “create long coherent audio for videos up to several minutes long.”

Meta Movie Gen: Notable Features

Meta stated that the Movie Gen model is trained to condition on both text and images, allowing it to generate videos featuring a selected person based on an actual image. The company said that the video will preserve the identity of the person, while the actions will be based on the user’s prompt.

Moreover, the model has video editing capabilities for both generated content and real videos. The company claimed that Movie Gen can make “precise and imaginative edits” to a provided video based on the user’s description. In a preview shown by the company, the model edited the background of a video and added new elements to the main subject.

First publication: Oct 07, 2024 | 12:52 pm IST