How to Generate Images with Codex: 3 Methods (2026 Guide)

OpenAI Codex CLI can't generate images natively. Here's how to add image generation to Codex — via direct API, MCP server, or one CLI command covering Seedream 5, GPT Image 2, FLUX, and more.

by AnyCap

You're building with Codex CLI. It plans the implementation, writes the code, runs tests. Then you ask it to generate a product hero image or a UI mockup.

Codex stops. Image generation isn't in its native toolkit, the same limitation as Claude Code, Cursor, and most other coding agents.

Here's how to add image generation to Codex. Three approaches, from manual integration to a single command.


Why Codex Doesn't Ship With Image Generation

Codex is OpenAI's agentic coding tool. It plans across files, runs terminal commands, executes tasks in sandboxed environments, and handles the full development loop. Image generation comes from separate model families (GPT Image 2, Seedream 5, FLUX.1, DALL-E) that run on different infrastructure, update independently, and require their own API surfaces.

The gap is intentional. Codex stays focused on code; the capability layer is external. The question is how cleanly that capability plugs in.


What Codex + Image Generation Unlocks

When you add image generation to Codex, visuals become part of the build pipeline, not an afterthought:

  • Hero images for landing pages. Codex builds the page, generates the hero image, embeds the URL — same session.
  • UI mockups and design references. Describe a design direction, get a visual reference without leaving the terminal.
  • Launch assets on demand. Social graphics, announcement visuals, OG images — generated by your agent when it's building the thing they promote.
  • Image-to-video pipelines. Generate the still, then animate it. The same CLI handles both steps. See our complete image-to-video pipeline guide.

Method 1: Direct API Integration

Codex can execute shell commands. You can wire it directly to image generation APIs.

Step 1: Choose a provider. GPT Image 2 (OpenAI), Seedream 5 (ByteDance), FLUX.1 Kontext Max (Black Forest Labs), DALL-E 3 (OpenAI). Each has its own API format.

Step 2: Get API credentials. Separate developer console per provider. Separate API keys. Separate billing accounts.

Step 3: Write integration scripts. Codex calls your scripts with prompts. Your scripts handle auth, POST requests, async polling for generation jobs, file downloads, and output handling.

Step 4: Handle format differences. Different providers return different response formats. Base64, URLs, signed CDN links — you handle the normalization.

This works. But you end up maintaining integration code instead of generating images.
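
Step 4 is where most of the glue code ends up. As a minimal sketch, assuming a provider returns either an inline base64 string or a download URL, the normalization could look like the helper below. The function name save_image and its kind/payload arguments are hypothetical and not part of any provider SDK; extracting the payload from each provider's response is left out because it differs per API.

```shell
# Hypothetical helper: write an image to disk whether the provider
# returned base64 data or a download URL.
# Usage: save_image <b64|url> <payload> <output-file>
save_image() {
  local kind="$1" payload="$2" out="$3"
  case "$kind" in
    b64)
      # Inline base64 payload: decode straight into the output file.
      printf '%s' "$payload" | base64 -d > "$out"
      ;;
    url)
      # Hosted or signed link: download it; -f turns HTTP errors into failures.
      curl -fsSL "$payload" -o "$out"
      ;;
    *)
      echo "save_image: unknown response kind: $kind" >&2
      return 1
      ;;
  esac
}
```

Codex would then call save_image with whatever your per-provider script pulled out of the response, so every provider ends in the same place: a file on disk.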


Method 2: MCP Server for Image Generation

MCP servers let Codex invoke external capabilities through a standard protocol:

  • Replicate MCP — access to hundreds of image models
  • FAL.ai MCP — fast inference for Flux models
  • Stability MCP — Stable Diffusion variants

Configure once per server. Codex calls them like any tool. Lighter than direct API wiring.

The limitation: a single-provider MCP server locks you to that provider's model selection. When you want to compare GPT Image 2 output against Seedream 5, you're adding a second server.
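
For reference, Codex CLI reads MCP server definitions from mcp_servers tables in its config.toml. The sketch below appends one such entry. Treat the details as assumptions: add_mcp_server is a hypothetical helper, the server name and launch command in the usage note are placeholders, and each MCP server documents its own package name and required environment variables.

```shell
# Hypothetical helper: append an MCP server entry to a Codex config file.
# Usage: add_mcp_server <config-file> <server-name> <command> <args-as-toml-array>
add_mcp_server() {
  local config="$1" name="$2" cmd="$3" args="$4"
  mkdir -p "$(dirname "$config")"
  # Do nothing if this server already has a table in the file.
  if [ -f "$config" ] && grep -qF "[mcp_servers.$name]" "$config"; then
    echo "MCP server '$name' is already configured" >&2
    return 0
  fi
  # The args value is written as-is, so it must already be a valid TOML array.
  cat >> "$config" <<EOF

[mcp_servers.$name]
command = "$cmd"
args = $args
EOF
}
```

A call such as add_mcp_server "$HOME/.codex/config.toml" image-gen npx '["-y", "example-image-mcp-server"]' would register a server named image-gen, where example-image-mcp-server stands in for the real package. Comparing two providers means two such entries, which is the overhead described above.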


Method 3: One CLI Across Codex, Claude Code, and Cursor

With this approach, your agent calls one command regardless of which image model you want:

anycap image generate \
  --prompt "a modern SaaS dashboard on a MacBook, floating UI elements, soft studio lighting, product photography style" \
  --model seedream-5 \
  -o hero.jpg

Change --model seedream-5 to --model gpt-image-2, --model flux-kontext-max, or --model nano-banana-2 — same command, different model. Codex, Claude Code, and Cursor all call the same CLI.
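
Because only the --model value changes, comparing models is a loop. A small sketch, using only the flags shown above; the compare_models function and the hero-<model>.jpg naming scheme are illustrative choices, not part of the AnyCap CLI:

```shell
# Run one prompt through several models so the outputs can be compared.
# Usage: compare_models "<prompt>" <model> [<model> ...]
compare_models() {
  local prompt="$1" model
  shift
  for model in "$@"; do
    # Same command each time; only the model and the output name change.
    anycap image generate \
      --prompt "$prompt" \
      --model "$model" \
      -o "hero-$model.jpg" || return 1
  done
}
```

For example, compare_models "a modern SaaS dashboard on a MacBook" seedream-5 gpt-image-2 would write hero-seedream-5.jpg and hero-gpt-image-2.jpg side by side, and stop at the first model that fails.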

Install for Codex:

npx -y skills add anycap-ai/anycap -a codex -y
anycap login && anycap status

After install, Codex recognizes anycap image generate as an available command in its shell environment.

Install AnyCap free — 250 credits for new users


Image Models Available Through AnyCap

| Model | Provider | Best for |
| --- | --- | --- |
| Seedream 5 | ByteDance | Highest quality first-pass. Product photography, hero images, detailed scenes. |
| GPT Image 2 | OpenAI | Native OpenAI ecosystem fit. Strong for UI screenshots and clean product shots. |
| FLUX.1 Kontext Max | Black Forest Labs | Design-heavy work, typography, graphic elements. |
| Nano Banana Pro | Google | Revision loops: generates quickly and holds edits well. |
| Nano Banana 2 | Google | Fast exploration. Use for volume and direction-testing before committing a final model. |

Text-to-Image in Codex: Generate from a Prompt

The simplest case — describe what you need, get the image back:

anycap image generate \
  --prompt "a developer dashboard interface, dark theme, neon blue accent color, floating data cards, clean modern UI, product screenshot style" \
  --model seedream-5 \
  -o dashboard-hero.jpg

Model picker for Codex users:

| Your Codex task | Best model | Why |
| --- | --- | --- |
| Product screenshot, hero image | Seedream 5 | Best first-pass quality: Codex coded it, and the image should match the quality |
| UI mockup, design reference | Nano Banana Pro | Fast generation for iteration before committing the final visual |
| Social graphic, announcement | GPT Image 2 | OpenAI ecosystem fit: Codex + GPT Image 2 stays end-to-end in the OpenAI stack |
| Design-heavy, typographic | FLUX.1 Kontext Max | Handles graphic design elements better than photography-tuned models |
| Volume, fast exploration | Nano Banana 2 | When you need 5 directions fast before picking one |

Image Editing in Codex: Modify an Existing Image

Use edit mode when you have an approved product screenshot or design asset and need to modify it (change the background, update text, adjust colors) without regenerating from scratch:

anycap image generate \
  --prompt "replace the background with a clean white studio background, keep the product interface exactly as-is" \
  --model nano-banana-pro \
  --mode edit \
  --param images=./dashboard-screenshot.jpg \
  -o dashboard-clean.jpg

When editing beats regeneration:

  • You have an approved product screenshot but need different backgrounds for different markets
  • You want to update text or labels in an existing graphic
  • You need multiple color variants of a finalized asset

The Full Codex Pipeline: Code → Image → Video → Publish

Codex chains shell commands naturally. AnyCap's CLI fits that pattern:

# 1. Codex builds the landing page
# ... (Codex's own work)

# 2. Generate the hero image (OpenAI-native: GPT Image 2)
anycap image generate \
  --prompt "product hero shot for a developer tool, dark background, code editor interface, neon accents" \
  --model gpt-image-2 \
  -o hero.jpg

# 3. Animate the hero into a motion teaser (OpenAI-native: Sora 2 Pro)
anycap video generate \
  --prompt "slow camera push-in, code highlights animate, subtle parallax background" \
  --model sora-2-pro \
  --mode image-to-video \
  --param images=./hero.jpg \
  -o teaser.mp4

# 4. Store and share
anycap drive upload hero.jpg teaser.mp4

Codex generated the hero image, animated it, and stored both files. The pipeline can stay entirely OpenAI-native, or you can mix providers by changing the --model flag.
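
In an unattended run, each step should stop the chain if the previous one failed. A hedged sketch of steps 2 through 4 wrapped in one function: the wrapper name build_launch_assets and the file check are additions for illustration, while the anycap flags are the ones used in the pipeline above.

```shell
# Hypothetical wrapper around steps 2-4: stop at the first failed step and
# make sure the still image exists before trying to animate it.
# Usage: build_launch_assets "<image prompt>" "<video prompt>"
build_launch_assets() {
  local image_prompt="$1" video_prompt="$2"

  anycap image generate \
    --prompt "$image_prompt" \
    --model gpt-image-2 \
    -o hero.jpg || return 1

  # -s is true only when the file exists and is not empty.
  if [ ! -s hero.jpg ]; then
    echo "hero.jpg is missing or empty; skipping the video step" >&2
    return 1
  fi

  anycap video generate \
    --prompt "$video_prompt" \
    --model sora-2-pro \
    --mode image-to-video \
    --param images=./hero.jpg \
    -o teaser.mp4 || return 1

  anycap drive upload hero.jpg teaser.mp4
}
```

The function returns non-zero as soon as any step fails, so Codex can chain it with && like any other command and nothing is uploaded from a half-finished run.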


Why Codex + AnyCap Is a Natural Fit

Three things make the AnyCap integration especially clean for Codex workflows:

1. CLI-native design. Codex executes shell commands. anycap image generate is just another shell command. No new paradigm. No API client to initialize. Codex chains it with && the same way it chains npm test or git push.

2. OpenAI ecosystem alignment. If your team is already OpenAI-first — Codex for code, GPT Image 2 for images, Sora 2 Pro for video — AnyCap routes all three through one CLI. But you can also mix: --model seedream-5 or --model flux-kontext-max when you want different output without adding a new API key.

3. Same command across agents. The install target changes (~/.codex/skills/ vs ~/.claude/skills/), but the command is identical:

anycap image generate --prompt "..." --model seedream-5 -o output.jpg

Same CLI. Same auth. Same models. Switch between Codex, Claude Code, and Cursor without reconfiguring.


Cross-Agent: Same Command, Different Agents

| Agent | Skill directory | Unique advantage for image gen |
| --- | --- | --- |
| Codex | ~/.codex/skills/ | CLI-native, OpenAI ecosystem alignment, seamless shell chaining |
| Claude Code | ~/.claude/skills/ | Subagent parallelism: compare multiple models simultaneously |
| Cursor | ~/.cursor/skills/ | In-IDE: generate, embed, and view images in one agent action |

FAQ

Does Codex support image generation natively?

No. Codex is an agentic coding tool from OpenAI — it plans, implements, and ships code. Image generation requires external models. AnyCap bundles GPT Image 2, Seedream 5, FLUX.1, and Nano Banana behind one CLI.

Which image model should Codex users start with?

Seedream 5 for the highest quality first-pass on product images. GPT Image 2 if you want to stay fully in the OpenAI ecosystem (Codex → GPT Image 2 → Sora 2 Pro is a clean OpenAI-native pipeline). Nano Banana 2 for fast exploration when you need volume over perfection.

Can I use the same AnyCap install for image and video generation?

Yes. The same CLI handles both. anycap image generate and anycap video generate share the same auth, same credits, same output handling. The image-to-video pipeline is one workflow, not two separate tool setups.

Do I need separate API keys for different image models?

Not with AnyCap. One key covers GPT Image 2 (OpenAI), Seedream 5 (ByteDance), FLUX.1 (Black Forest Labs), and Nano Banana (Google). The runtime manages provider credentials internally.

Can Codex chain image generation with other shell commands?

Yes. Codex is built for this. For example: npm run build && anycap image generate --prompt "..." -o hero.jpg && git add . && git commit -m "add hero". Codex works in chained shell commands, and image generation is just another step in the chain.

Can I use image generation in a Codex automation or CI pipeline?

Yes. AnyCap is headless — no UI required. Set your ANYCAP_API_KEY environment variable and call anycap image generate in any shell context where Codex runs automated tasks.
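
In CI it helps to fail early when the key is missing rather than partway through a job. A minimal guard, assuming the ANYCAP_API_KEY variable named above; the require_anycap_key function is a hypothetical helper, not an AnyCap command:

```shell
# Hypothetical guard: return non-zero when the API key is not available.
require_anycap_key() {
  if [ -z "${ANYCAP_API_KEY:-}" ]; then
    echo "ANYCAP_API_KEY is not set; skipping image generation" >&2
    return 1
  fi
}
```

A pipeline step can then read require_anycap_key && anycap image generate --prompt "..." -o hero.jpg, so a missing secret shows up as one clear message in the job log.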


The Bottom Line

Codex plans features, writes code, runs tests, and ships. It can't make images — and that's by design.

The question is how you connect Codex to image models: a separate API key per provider and an integration script per model, or one CLI command that chains naturally into your existing Codex shell workflow.


Written by the AnyCap team. We build the capability runtime that gives Codex image generation through one CLI — so your agent doesn't stop at "I can't create visuals."