Google Gemini Omni: AI Model That Creates Anything From Any Input

I watched a short clip replay on my phone and felt a small, sudden alarm: the reflection in the mirror rippled differently each time someone issued a new prompt. You could see the room change with a few words, and the moment when reality stopped being fixed arrived faster than anyone expected. Minutes later Google announced a model that promises to remake video from any input.

I cover technology and risk. You should care because tools that change how we make and assess moving images will change what we trust.

At Google I/O, engineers showed a man touching a mirror and then rewrote the clip mid-play.

Google introduced Gemini Omni and its first family member, Gemini Omni Flash, as a model that edits and generates video from plain-language prompts. The company invited you to think of it as “Nano Banana, but for video,” a shorthand that ties Omni to the image model Google released last year.

Omni is already available inside the Gemini app, in Google Flow AI studio, and on YouTube Shorts, where creators can test edits and new clips directly. The pitch is simple: supply images, audio, video or text, and the model will produce video grounded in Gemini’s database of real-world knowledge.

Omni feels like a camera crew compressed into a single sentence.

What is Gemini Omni?

Gemini Omni is a multimodal video model from Google DeepMind that accepts mixed inputs and returns edited or newly generated video. According to Koray Kavukcuoglu, Google DeepMind’s CTO, you can keep asking for changes through conversation and the model will maintain visual consistency across edits—characters, lighting and environments that stick together through multiple prompts.

In demos, a single source clip became liquid mirrors, voxel art, and a claymation explainer.

Google posted several short examples: a man touches a mirror and the reflection ripples; the same touch turns the scene into 3D voxel art; window lights sync to a techno track; a quick claymation explainer on protein folding appears. Those examples show two things: novelty and narrative control.

The model uses language understanding plus models of physics, biology and story logic to keep edits coherent. That matters when you want a clip that makes sense, not just a sequence of flashy frames. Omni Flash is designed to follow up edits conversationally, building iterations that reference earlier changes.

How do I use Gemini Omni?

Hands-on use starts in the Gemini app or Flow AI studio: upload a clip, type a prompt like “make the mirror ripple beautifully like liquid,” and let the model generate variants. YouTube Shorts offers a surface-level integration for creators who want quick edits. Under the hood, Omni aims to combine multiple media inputs and real-world knowledge to produce a finished clip.

On Google’s site, several sample reels make the technology feel both playful and unsettling.

The playful demos hide the harder questions: how will this be used in journalism, politics, advertising, or fraud? Google acknowledges the risks—deepfakes and disinformation rank high—and says it routed Omni through internal safety, security and responsibility teams as well as outside specialists before release.

Google says the model includes a hidden marker; I saw the promise of a traceable tag in the press materials.

Google plans to attach an invisible SynthID digital watermark to content created or edited with Omni. The watermark works as a digital fingerprint: subtle and hidden, but traceable. That mechanism is meant to make it easier for platforms and tools to identify AI-generated material without visible labels on every frame.

Can Gemini Omni create deepfakes?

Yes. The same capabilities that let you change a mirror to voxel art can alter faces, voices and scenes in ways that mimic reality. Google points to mitigations: safety reviews, outside testing, and the SynthID watermark. Platforms like YouTube will add policy layers, and third-party detection tools will try to keep pace, but history suggests that safeguards tend to trail the bad actors they are meant to stop.

You can test Omni today in the Gemini app, in Flow AI studio, or via YouTube Shorts; you can also expect lawyers, journalists, and creatives to push at the edges of what the model will and won’t do. When every clip can be remade by a prompt, who owns the truth?