Google’s Gemini Omni can generate ‘anything from any input,’ starting with video

Google didn’t forget AI creators during its latest round of Gemini announcements at Google I/O. The company has just officially unveiled Gemini Omni, a new model capable of “creating anything from any input, starting with video,” according to Google. The first model, called Gemini Omni Flash, is rolling out today to the Gemini app, Google Flow and YouTube Shorts.

Google called Gemini Omni the “next step” from Nano Banana and, presumably, its current video generator, Veo 3.1. It allows you to “combine images, audio, video and text input and generate high-quality videos based on Gemini’s real-world insights,” according to the tech giant. You can then edit these videos through natural conversation, with each instruction building on the last to keep characters and other elements consistent.

Where Veo 3.1 was limited to video creations via prompts and images, Gemini Omni will accept a wider range of inputs and do much more. For example, you can shoot a video and then simply ask Omni to change what’s happening. “Your video becomes the starting point for something you could never have filmed yourself,” Google explained. “Change the action, add new characters or objects, or turn a moment into something unexpected. Change the environment, angle, style, or even specific details.”

Omni also better understands physical forces such as gravity, kinetic energy and fluid dynamics, so scenes look more realistic. It combines this with Gemini’s “knowledge of history, science, and cultural context, bridging the gap between photorealism and meaningful storytelling.” It is meant to turn short prompts into compelling explainers, generating visuals that break down more complex ideas. However, for audio output it will only support voice references to start.

If you want to generate videos in which you are the star, Omni lets you use your own voice to create a digital avatar that looks like you. If this sounds like a potential privacy nightmare, Google says it has “clear policies to protect users from harm and govern the use of our AI tools.” As for editing videos to change audio and speech, the company is still testing this feature in order to bring it to users in a “responsible” way. All videos will also use Google’s imperceptible SynthID digital watermark to verify that the videos were generated with Gemini Omni.

This all sounds great, but the main problem with Veo 3.1 and other video generation applications is that their output has an “uncanny valley” look and is often hated by end users. With that in mind, it will be interesting to see if the output quality matches Google’s breathless claims. We’ll find out soon, as Gemini Omni Flash is now available to all Google AI Plus, Pro, and Ultra subscribers globally and will be rolling out to users of YouTube Shorts and the YouTube Create app starting this week.

By Amelia Scott

Woozad — Tech Intelligence Daily