Following the release of its Llama 3.2 multimodal models, Meta has announced a state-of-the-art video generation model called Movie Gen. Meta says the foundation model is not limited to AI video generation: it can also produce images and audio, and even edit videos.

In that way, Movie Gen is a frontier media foundation model. Bear in mind that Meta has not released the model or its weights; it has only published a paper showcasing Movie Gen’s capabilities.

“🎥 Today we’re premiering Meta Movie Gen: the most advanced media foundation models to-date. Developed by AI research teams at Meta, Movie Gen delivers state-of-the-art results across a range of capabilities. We’re excited for the potential of this line of research to usher in…” pic.twitter.com/NDOnyKOOyq (AI at Meta, @AIatMeta, October 4, 2024)

First, the Movie Gen Video model is a 30B-parameter model that can generate high-definition (HD) videos of up to 16 seconds from a simple text prompt. Meta says this model can also generate high-quality images. In the demos, the generated videos look very impressive, much better than the AI videos we have seen from Runway, Pika, or Luma. In fact, they look as good as the output of OpenAI’s Sora and Google’s Veo models.

Next, the Movie Gen Audio model has 13B parameters, and it’s uniquely powerful. You can feed a video directly into this model, and Movie Gen Audio generates high-fidelity audio of up to 45 seconds synced to the video. You can also add your own prompt along with the video if you want a particular kind of sound. The model can generate ambient sound, instrumental music, and foley sound effects.

Movie Gen also brings precise AI video editing. Meta says users can upload an existing video, whether real or AI-generated, and the model can perform targeted edits on it. Just like AI image editing, you can add, remove, or replace elements in videos using simple text prompts. Users can also make broader changes, such as swapping the background or adjusting the style.

Finally, the Personalized Videos feature lets you upload your own photo, and Movie Gen creates a video that preserves the likeness of the person in it. It also promises natural movement in videos. Overall, it seems Meta has developed a frontier media model that tightly integrates video, audio, and images.