Google’s Gemini Omni can generate ‘anything from any input,’ starting with video

Google didn’t forget AI creators during its latest round of Gemini announcements at Google I/O. The company has officially unveiled Gemini Omni, a new model it describes as capable of “creating anything from any input, starting with video.” The first model, called Gemini Omni Flash, is rolling out today to the Gemini app, Google Flow and YouTube Shorts.

Google called Gemini Omni the “next step” from Nano Banana and, presumably, its current video generator, Veo 3.1. It allows you to “combine images, audio, video and text input and generate high-quality videos based on Gemini’s real-world insights,” according to the tech giant. You can then edit these videos through natural conversation, with each instruction building on the last to keep characters and other elements consistent.

Where Veo 3.1 was limited to video creations via prompts and images, Gemini Omni will accept a wider range of inputs and do much more. For example, you can shoot a video and then simply ask Omni to change what’s happening. “Your video becomes the starting point for something you could never have filmed yourself,” Google explained. “Change the action, add new characters or objects, or turn a moment into something unexpected. Change the environment, angle, style, or even specific details.”

Omni also better understands physical forces such as gravity, kinetic energy and fluid dynamics, so scenes look more realistic. It combines this with Gemini’s “knowledge of history, science, and cultural context, bridging the gap between photorealism and meaningful storytelling.” The model is meant to turn short prompts into compelling explainers, generating visuals that break down more complex ideas. However, to start, it will only support voice references for audio output.

If you want to generate videos in which you are the star, Omni lets you use your own voice to create a digital avatar that looks like you. If this sounds like a potential privacy nightmare, Google says it has “clear policies to protect users from harm and govern the use of our AI tools.” As for editing videos to change audio and speech, the company is still testing this feature so it can bring it to users in a “responsible” way. All videos will also carry Google’s imperceptible SynthID digital watermark, which makes it possible to verify that they were generated with Gemini Omni.

This all sounds great, but the main problem with Veo 3.1 and other video generators is that their output often has an “uncanny valley” look that end users hate. With that in mind, it will be interesting to see if the output quality matches Google’s breathless claims. We’ll find out soon, as Gemini Omni Flash is now available to all Google AI Plus, Pro and Ultra subscribers globally and will be rolling out to users of YouTube Shorts and the YouTube Create app starting this week.