
Note: Everything about Gemini Omni in this article is based on unconfirmed leaks and community speculation. Google has not officially announced this model, confirmed its capabilities, or committed to a release timeline. We'll update this post if and when official information becomes available.
Google I/O 2026 is one week away, and unverified demos of what appears to be a new video generation feature called "Gemini Omni" have surfaced on social media. Here's what the leaks show, what they might mean if accurate, and how AnyCap would approach integrating such a model.
Video Models Available on AnyCap Today
Omni is still speculation, but if you want to work with AI video or image generation right now, these models are live on AnyCap under a single API key:
| Model | Provider | Best For |
|---|---|---|
| Veo 3.1 | Google | Cinematic camera work, audio-visual sync |
| Seedance 2.0 | ByteDance | Top benchmark performance, Fast/Turbo variants |
| Wan 2.7 | Alibaba | 1080p output, audio-synced motion |
| Kling V3.0 | Kuaishou | High-fidelity, Std/Pro/O3 variants |
| Sora 2 | OpenAI | API-accessible video generation |
| Nano Banana 2 | Google | Fast image generation and editing |
| Nano Banana Pro | Google | High-fidelity image generation |
All models share the same API endpoint, billing, and authentication. No separate SDKs or per-model contracts.
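As a rough illustration of what a shared endpoint means in practice, the sketch below builds request descriptions for two models from the table above that differ only in the `model` field. The endpoint URL, header layout, payload field names, and model identifier strings are all hypothetical placeholders, not AnyCap's documented API; check the official AnyCap documentation for the real request format.

```python
import json

# Hypothetical values for illustration only; not AnyCap's documented API.
API_URL = "https://api.example.com/v1/generate"


def build_request(api_key: str, model: str, prompt: str) -> dict:
    """Describe one generation request; only `model` varies between providers."""
    return {
        "url": API_URL,
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({"model": model, "prompt": prompt}),
    }


if __name__ == "__main__":
    key = "YOUR_API_KEY"
    prompt = "A professor writing formulas on a chalkboard"
    # Assumed identifier strings; the real catalog may name models differently.
    for model_id in ("veo-3.1", "seedance-2.0"):
        request = build_request(key, model_id, prompt)
        print(request["url"], json.loads(request["body"])["model"])
```

The sketch only constructs the request description and sends nothing over the network, so it can be adapted to whichever HTTP client a project already uses.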
→ Browse the full AnyCap model catalog
What the Leaks Show (Unverified)
On May 2, a Reddit user shared a screenshot from the Gemini mobile app showing the text: "Start with an idea or try a template. Powered by Omni." The label appeared alongside "Toucan," which is reportedly Google's internal codename for the current Veo 3.1-powered video path. The screenshot has not been independently verified.
The UI description reportedly reads:
"Meet our new video generation model. Remix your videos, edit directly in chat, try a template, and more."
This text comes from a leaked screenshot and has not been confirmed by Google.
Three Possibilities (All Speculative)
Assuming the leaks reflect a real product in development, the AI community has discussed three interpretations, all of them speculative until Google provides official information:
| # | Possibility | Confidence | Notes |
|---|---|---|---|
| 1 | Veo rebrand — Omni is a new product name for the existing Veo pipeline | Unknown | Would be a cosmetic change if true |
| 2 | New video model — A different architecture trained under Gemini | Unknown | Only Google knows the underlying tech |
| 3 | Unified omni-model — Single system for text + image + video + audio | Highly speculative | The name invites this interpretation, but no evidence confirms it |
None of these possibilities has been confirmed.
What the Leaked Clips Show
The demos circulating online are unverified — it is not possible to independently confirm they were generated by Gemini Omni, or that they reflect the model's typical output quality.
A Professor at a Chalkboard (Unverified)
A widely circulated clip shows what appears to be a professor writing mathematical formulas on a chalkboard while explaining the derivation. Observers have noted that the formulas in the clip appear to be correct and the chalk writing is coherent. This clip's provenance has not been independently verified.
Text rendering in AI-generated video has historically been unreliable. If this clip is genuine and representative, it would suggest progress in that area — but without official confirmation or independent testing, no firm conclusion can be drawn.
A Restaurant Scene (Unverified)
Another leaked clip shows two men eating spaghetti at a restaurant. This references the well-known "Will Smith eating spaghetti" benchmark used informally to evaluate AI video quality. The source and authenticity of this clip are unverified.
Editing Features (Unverified)
Leaked screenshots suggest editing capabilities including watermark removal and object replacement through the chat interface. These features have not been confirmed by Google, and it is unclear whether they reflect a finished product or an internal test.
How This Compares to Google's Current Approach
Looking at what Google has actually released (not leaked):
- Nano Banana 2 and Pro: Google's publicly available AI image generation models, integrated into Gemini. These generate and edit images through the chat interface.
- Veo 3.1: Google's publicly available video generation model, accessible through Gemini but operating as a separate pipeline labeled "Powered by Veo 3.1."
The leaks have led some outlets — including 36Kr — to describe Omni as a potential "video version of Nano Banana." This is an analogy, not Google's official positioning, and may or may not reflect the actual product.
The AI Video Landscape (Current, Confirmed)
For context, here are the major AI video models that are publicly available as of May 2026:
| Model | Company | Status |
|---|---|---|
| Seedance 2.0 | ByteDance | Publicly available |
| HappyHorse-1.0 | Alibaba | Publicly available |
| Wan 2.7 | Alibaba | Publicly available |
| Kling V3.0 | Kuaishou | Publicly available |
| Sora 2 | OpenAI | API only (consumer app shut down April 29, 2026) |
| Veo 3.1 | Google | Available through Gemini (region-limited) |
OpenAI confirmed the shutdown of the Sora consumer app on April 29, 2026. Google has publicly stated that video generation remains part of its roadmap.
Gemini Omni does not appear on this list because it has not been officially announced.
AnyCap's Position
AnyCap is an AI capability platform that aggregates generative AI models (image, video, audio, search) under a single API. The video and image models listed at the top of this post (Veo 3.1, Seedance 2.0, Wan 2.7, Kling V3.0, Sora 2, Nano Banana 2, and Nano Banana Pro) are all available today in the AnyCap model catalog.
Regarding Gemini Omni:
- Google has not announced this model or confirmed API access.
- If Google releases Omni with API availability, AnyCap would evaluate the model and aim to integrate it.
- There is no confirmed timeline for this, because Google has not published one.
- AnyCap does not have early or privileged access to unannounced Google products.
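For developers who want client code that keeps working whether or not Omni ever ships, one possible pattern is a preference list with fallback: ask for the newest model first and drop back to one that is available today. The sketch below is purely illustrative. The identifier `gemini-omni` is hypothetical (no such model has been announced), the other identifier strings are assumptions, and the catalog is passed in as a plain list rather than fetched from any real API.

```python
from typing import Iterable, Optional, Sequence

# Hypothetical identifiers; "gemini-omni" is not an announced or available model.
PREFERRED_MODELS = ("gemini-omni", "veo-3.1", "seedance-2.0")


def pick_model(
    available: Iterable[str],
    preferred: Sequence[str] = PREFERRED_MODELS,
) -> Optional[str]:
    """Return the first preferred model present in the catalog, or None."""
    catalog = set(available)
    for model_id in preferred:
        if model_id in catalog:
            return model_id
    return None


if __name__ == "__main__":
    # With the unannounced model absent, selection falls through to the next entry.
    todays_catalog = ["veo-3.1", "kling-v3.0", "sora-2"]
    print(pick_model(todays_catalog))
```

Keeping the preference order in one place means that adopting a newly released model later would be a one-line change rather than a rewrite of calling code.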
What to Watch
Google I/O 2026 runs May 19–20, 2026. Google typically uses this event to announce product updates, but the specific agenda has not been confirmed. Whether or not Omni is discussed, AnyCap will monitor official announcements and assess integration opportunities as they arise.
Summary
Unverified leaks suggest Google may be developing a native video generation experience for Gemini under the name "Omni." The circulating clips are interesting but unconfirmed. Google has not announced this product, described its capabilities, or provided a release timeline.
If Omni does launch with API access, AnyCap intends to evaluate and integrate it, as the platform does with major new models. In the meantime, seven video and image generation models are already available on AnyCap, all accessible through a single API key.
This post will be updated if and when Google provides official information about Gemini Omni.