Veo 3.1 API for AI Agents: Generate 1080p Video in One Command

Use Google's Veo 3.1 model inside any AI agent. Full API reference, CLI examples, model comparison table, and pricing breakdown — ready in 3 minutes.

Veo 3.1: cinematic AI video generation at up to 4K resolution

Yes, AI agents can call Veo 3.1. Google's flagship video model is available through the AnyCap API and CLI, letting Claude Code, Cursor, Codex, and any other agent runtime generate up to 4K video in a single tool call.

What Is Veo 3.1?

Veo 3.1 is Google DeepMind's highest-quality text-to-video model as of 2026. It produces native clips of 4, 6, or 8 seconds (extendable up to 148 seconds via Scene Extension chaining) at up to 4K resolution, with cinematic lighting, coherent motion, and native 48kHz stereo audio. It is the go-to choice when you need a premium first-pass render with minimal post-processing.

Key specs:

Resolution: 720p, 1080p, or 4K (3840×2160 on premium tiers)
Duration: 4, 6, or 8 seconds native; chainable up to 148 seconds via Scene Extension
Audio: native 48kHz stereo AAC, synchronized dialogue and soundscapes
Output: MP4 with optional audio track
Strength: photorealistic first-pass quality, product demos, brand announcements
Credit cost: ~20 credits/second of output
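
The spec list above can be turned into a small pre-flight check that an agent runs before it spends credits. This is a minimal sketch: the function and constant names are hypothetical, and the limits are simply the figures quoted in the list. A wrapper such as AnyCap may accept lengths the raw model does not.

```python
# Limits copied from the "Key specs" list; all names here are hypothetical.
NATIVE_DURATIONS_S = (4, 6, 8)
RESOLUTIONS = ("720p", "1080p", "4K")
MAX_CHAINED_DURATION_S = 148


def check_request(duration_s: int, resolution: str) -> str:
    """Return 'native' or 'needs_extension', or raise ValueError."""
    if resolution not in RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if duration_s in NATIVE_DURATIONS_S:
        return "native"
    if max(NATIVE_DURATIONS_S) < duration_s <= MAX_CHAINED_DURATION_S:
        # Longer than one native clip: reachable only by Scene Extension chaining.
        return "needs_extension"
    raise ValueError(
        f"{duration_s} s is neither a native length {NATIVE_DURATIONS_S} "
        f"nor within the {MAX_CHAINED_DURATION_S} s chaining limit"
    )


if __name__ == "__main__":
    print(check_request(8, "1080p"))   # native
    print(check_request(30, "4K"))     # needs_extension
```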

Why Agents Can't Call Veo 3.1 Directly (and How to Fix It)

Veo 3.1 lives behind Google's Vertex AI platform, which requires OAuth 2.0 service-account auth, project IAM setup, and asynchronous jobs that must be polled until they finish. None of this is something a standard agent tool call can handle out of the box.

AnyCap wraps the entire auth and delivery layer behind a single REST endpoint and CLI binary, so your agent calls one tool, gets back a video URL, and moves on.

What Veo 3.1 Unlocks for Agents

Product demo videos — turn a spec sheet into a 30-second MP4 automatically
Announcement clips — generate a launch video the moment a PR draft is approved
Visual documentation — convert how-to instructions into narrated walkthroughs
A/B creative testing — generate multiple prompt variants at scale and compare

Method 1: Direct Vertex AI API (manual setup)

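If you need raw access without AnyCap, Vertex AI exposes Veo as a long-running prediction call. Replace {MODEL_ID} with the current Veo 3.1 model ID from Google's model list:

POST https://us-central1-aiplatform.googleapis.com/v1/projects/{PROJECT}/locations/us-central1/publishers/google/models/{MODEL_ID}:predictLongRunning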
Authorization: Bearer $(gcloud auth print-access-token)
Content-Type: application/json

{
  "instances": [{
    "prompt": "A golden retriever running on a beach at sunset, slow motion"
  }],
  "parameters": {
    "durationSeconds": 8,
    "resolution": "1080p"
  }
}

Downsides: this route requires a GCP project, a service account JSON key, IAM roles, and polling logic for async jobs. It is not practical inside a sandboxed agent.
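
To give a sense of the polling logic mentioned above, here is a minimal sketch of a poll-with-backoff loop. It is not Google's client code: `fetch_status` is a hypothetical callable you would supply (for example, a function that fetches the operation over HTTP), and the `done` and `error` keys are assumed to follow the usual long-running operation shape.

```python
import time
from typing import Callable


def wait_for_job(
    fetch_status: Callable[[], dict],
    timeout_s: float = 300.0,
    initial_delay_s: float = 2.0,
    max_delay_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status() until it reports done, doubling the wait each time."""
    deadline = time.monotonic() + timeout_s
    delay = initial_delay_s
    while True:
        status = fetch_status()
        if status.get("done"):
            if "error" in status:
                raise RuntimeError(f"job failed: {status['error']}")
            return status
        if time.monotonic() + delay > deadline:
            raise TimeoutError(f"job not finished after {timeout_s} s")
        sleep(delay)
        delay = min(delay * 2, max_delay_s)  # exponential backoff, capped
```

Passing `sleep` as a parameter keeps the loop easy to test without real waiting. A sandboxed agent would still need network access and credentials to make `fetch_status` work, which is the practical obstacle.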

Method 2: MCP Server

AnyCap ships an MCP server that exposes video_generate as a tool. Add it to your mcp.json:

{
  "mcpServers": {
    "anycap": {
      "command": "anycap",
      "args": ["mcp", "serve"],
      "env": { "ANYCAP_API_KEY": "your_key_here" }
    }
  }
}
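
If you already have an mcp.json with other servers, you may prefer to merge the AnyCap entry in rather than overwrite the file. The sketch below assumes the config shape shown above. The helper name is hypothetical, and reading the key from an `ANYCAP_API_KEY` environment variable is one option, not a documented requirement.

```python
import json
import os


def add_anycap_server(config: dict, api_key: str) -> dict:
    """Return a copy of an MCP config with the AnyCap server entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["anycap"] = {
        "command": "anycap",
        "args": ["mcp", "serve"],
        "env": {"ANYCAP_API_KEY": api_key},
    }
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # Example: an existing config with one unrelated (made-up) server.
    existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
    key = os.environ.get("ANYCAP_API_KEY", "your_key_here")
    print(json.dumps(add_anycap_server(existing, key), indent=2))
```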

Once connected, instruct your agent:

Generate a 10-second product demo video of a sleek laptop opening on a white desk.
Use Veo 3.1. Save it to Drive and return the shareable link.

The agent calls video_generate → drive_upload → returns URL. No boilerplate.

Method 3: AnyCap CLI (Recommended for Agent Runtimes)

The CLI is the fastest path for Claude Code, Cursor terminals, and Codex sandbox shells.

Install

curl -fsSL https://anycap.ai/install.sh | sh
anycap login   # paste your API key once

Generate a video with Veo 3.1

anycap video generate \
  --model veo-3-1 \
  --prompt "A drone flyover of a modern city at golden hour, cinematic" \
  --duration 15 \
  --output /workspace/city-flyover.mp4

Output:

✓ Queued    veo-3-1  [job: v3x-7891]
✓ Rendering 15s @ 1080p …
✓ Complete  /workspace/city-flyover.mp4  (284 MB)
  Credits used: 300
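
An agent that wants to track spend can pull the job ID, file path, and credit count out of that output. The parser below is a sketch that assumes the output looks exactly like the sample above. Human-readable CLI output is not a stable interface, so treat the patterns as illustrative.

```python
import re

SAMPLE_OUTPUT = """\
✓ Queued    veo-3-1  [job: v3x-7891]
✓ Rendering 15s @ 1080p …
✓ Complete  /workspace/city-flyover.mp4  (284 MB)
  Credits used: 300
"""


def parse_generate_output(text: str) -> dict:
    """Extract job id, output path, and credits; missing fields become None."""
    job = re.search(r"\[job:\s*([^\]\s]+)\]", text)
    done = re.search(r"Complete\s+(\S+)", text)  # assumes no spaces in the path
    credits = re.search(r"Credits used:\s*(\d+)", text)
    return {
        "job_id": job.group(1) if job else None,
        "path": done.group(1) if done else None,
        "credits": int(credits.group(1)) if credits else None,
    }


if __name__ == "__main__":
    print(parse_generate_output(SAMPLE_OUTPUT))
```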

Upload to Drive and get a shareable link

anycap drive upload /workspace/city-flyover.mp4 --share
# → https://drive.anycap.ai/f/abc123  (public link, no login required)

Image-to-video (reference frame)

anycap video generate \
  --model veo-3-1 \
  --image /workspace/product-shot.png \
  --prompt "Slowly rotate the product 360 degrees, studio lighting" \
  --duration 8 \
  --output /workspace/product-360.mp4
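
When an agent drives the CLI from Python instead of a shell, building the argument list explicitly avoids quoting problems with long prompts. This is a minimal sketch that uses only the flags shown in the commands above. The function names are hypothetical, and `generate` assumes the `anycap` binary is installed and on PATH.

```python
import subprocess
from typing import List, Optional


def build_generate_argv(
    prompt: str,
    duration_s: int,
    output: str,
    model: str = "veo-3-1",
    image: Optional[str] = None,
) -> List[str]:
    """Build the argv for `anycap video generate` without invoking a shell."""
    argv = ["anycap", "video", "generate", "--model", model]
    if image is not None:
        argv += ["--image", image]  # reference frame for image-to-video
    argv += ["--prompt", prompt, "--duration", str(duration_s), "--output", output]
    return argv


def generate(prompt: str, duration_s: int, output: str, **kwargs) -> str:
    """Run the CLI and return its stdout; raises CalledProcessError on failure."""
    argv = build_generate_argv(prompt, duration_s, output, **kwargs)
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return result.stdout
```

Because the prompt is passed as a single list element, quotes and commas inside it reach the CLI unchanged.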

Veo 3.1 vs Other Video Models — Which Should Agents Use?

Model	Best For	Quality	Speed	Credits/sec
Veo 3.1	Premium first-pass, photorealism	★★★★★	Moderate	~20
Veo 3.1 Fast	Quick drafts, iteration	★★★★☆	Fast	~10
Kling 3.0	Cinematic camera motion, drama	★★★★★	Moderate	~18
Kling O1	Consistent style, batch	★★★★☆	Fast	~12
Seedance 2.0	Character consistency, series	★★★★☆	Moderate	~15
Seedance 1.5 Pro	High-volume batch generation	★★★★☆	Fast	~10
Sora 2 Pro	Realistic physics, long duration	★★★★★	Slow	~25
Hailuo 2.3	Fast drafts, stylized	★★★☆☆	Very Fast	~8

Decision guide:

Need the best possible quality on the first try → Veo 3.1
Need dramatic camera sweeps and motion → Kling 3.0
Need consistent characters across multiple clips → Seedance 2.0
Need realistic gravity/fluid physics → Sora 2 Pro
Need a fast cheap draft → Veo 3.1 Fast or Hailuo 2.3
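
The decision guide is simple enough to encode as a lookup table, so an agent can pick a model without re-reading the comparison. This is a sketch: the need keys are hypothetical labels for each row of the guide, and the credit figures are the approximate values from the comparison table.

```python
from typing import Dict, List

# Approximate credits per second, copied from the comparison table.
CREDITS_PER_SECOND: Dict[str, int] = {
    "Veo 3.1": 20, "Veo 3.1 Fast": 10, "Kling 3.0": 18, "Kling O1": 12,
    "Seedance 2.0": 15, "Seedance 1.5 Pro": 10, "Sora 2 Pro": 25, "Hailuo 2.3": 8,
}

# Hypothetical keys, one per line of the decision guide.
DECISION_GUIDE: Dict[str, List[str]] = {
    "first_try_quality": ["Veo 3.1"],
    "camera_motion": ["Kling 3.0"],
    "character_consistency": ["Seedance 2.0"],
    "physics": ["Sora 2 Pro"],
    "cheap_draft": ["Veo 3.1 Fast", "Hailuo 2.3"],
}


def recommend(need: str) -> List[str]:
    """Return the model names the guide suggests for a given need."""
    if need not in DECISION_GUIDE:
        raise ValueError(f"unknown need {need!r}; expected one of {sorted(DECISION_GUIDE)}")
    return DECISION_GUIDE[need]


def cheapest(models: List[str]) -> str:
    """Pick the candidate with the lowest approximate credit rate."""
    return min(models, key=lambda name: CREDITS_PER_SECOND[name])
```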

Pricing

AnyCap credits are shared across all models. New accounts start with 250 free credits.

Clip Length	Veo 3.1 Cost	Veo 3.1 Fast Cost
5 seconds	100 credits	50 credits
10 seconds	200 credits	100 credits
30 seconds	600 credits	300 credits
60 seconds	1,200 credits	600 credits
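
Each row of the table is the clip length multiplied by the per-second rate (about 20 credits for Veo 3.1, about 10 for Veo 3.1 Fast). A short sketch of that arithmetic follows, with hypothetical function names. Because the rates are quoted as approximate, actual billing may differ slightly.

```python
import math

# Approximate per-second rates and the free allowance quoted in this article.
CREDITS_PER_SECOND = {"Veo 3.1": 20, "Veo 3.1 Fast": 10}
FREE_CREDITS = 250


def estimate_credits(duration_s: float, model: str = "Veo 3.1") -> int:
    """Estimated credits for a clip, rounded up to a whole credit."""
    return math.ceil(duration_s * CREDITS_PER_SECOND[model])


def seconds_affordable(credits: int, model: str = "Veo 3.1") -> int:
    """Whole seconds of output a credit balance covers at the quoted rate."""
    return credits // CREDITS_PER_SECOND[model]


if __name__ == "__main__":
    for seconds in (5, 10, 30, 60):
        print(seconds, estimate_credits(seconds), estimate_credits(seconds, "Veo 3.1 Fast"))
    print(seconds_affordable(FREE_CREDITS), seconds_affordable(FREE_CREDITS, "Veo 3.1 Fast"))
```

At these rates the 250 free credits cover roughly 12 seconds of Veo 3.1 output or 25 seconds of Veo 3.1 Fast.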


FAQ

Q: Does Veo 3.1 support audio?
A: Yes. Veo 3.1 generates audio natively alongside the video. Pass --audio to include the audio track, or pass --no-audio for a silent clip and add music separately with anycap music generate.

Q: What's the difference between Veo 3.1 and Veo 3.1 Fast?
A: Veo 3.1 Fast uses a distilled model that renders in roughly half the time at half the credit cost. Quality is slightly lower but acceptable for drafts and iteration. Switch to full Veo 3.1 for final renders.

Q: Can I run Veo 3.1 inside a Claude Code session?
A: Yes. Install the AnyCap CLI in your Claude Code project shell and call it as a bash tool. The command blocks until the render completes and then reports the output file path, so subsequent tool calls can reference the file immediately.

Q: How do I use Veo 3.1 inside Cursor?
A: Open the Cursor terminal, install the CLI, and run anycap video generate commands. Or add the AnyCap MCP server to .cursor/mcp.json and ask Cursor's agent to generate video through natural language.

Q: Is there a resolution limit?
A: Veo 3.1 supports 720p, 1080p, and 4K (3840×2160). 4K is available on premium API tiers via Vertex AI. Aspect ratio defaults to 16:9; pass --aspect 9:16 for vertical/mobile format.

Q: How long does a 10-second Veo 3.1 clip take to render?
A: Typical queue-to-download time is 60–90 seconds for a 10-second clip, and render time scales with duration. Veo 3.1 Fast takes roughly 30–45 seconds for the same length.
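
For agents that need to set a timeout, the figures above can be extrapolated to other clip lengths. The sketch below assumes render time scales linearly with duration, which the answer above does not guarantee. The function names and the 2x safety margin are hypothetical choices.

```python
from typing import Tuple

# (low, high) queue-to-download seconds for a 10-second clip, from the answer above.
BASELINE_CLIP_S = 10
BASELINE_WINDOW_S = {"Veo 3.1": (60, 90), "Veo 3.1 Fast": (30, 45)}


def estimate_render_window(duration_s: float, model: str = "Veo 3.1") -> Tuple[float, float]:
    """Scale the 10-second baseline linearly (an assumption) to another length."""
    low, high = BASELINE_WINDOW_S[model]
    scale = duration_s / BASELINE_CLIP_S
    return (low * scale, high * scale)


def suggested_timeout(duration_s: float, model: str = "Veo 3.1", margin: float = 2.0) -> float:
    """Upper estimate multiplied by a safety margin, in seconds."""
    return estimate_render_window(duration_s, model)[1] * margin
```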

Q: Do my generated videos belong to me?
A: Yes. Videos generated through AnyCap are fully owned by you. AnyCap does not use your prompts or outputs to train models.