DeepSeek V4 Release Date: Everything We Know Right Now

The full DeepSeek V4 release is expected in late April 2026, and V4 Lite is already live. Here is the complete timeline, the architecture specs, and what developers should prepare before the weights drop.

by AnyCap


DeepSeek V4 has been the most anticipated open-source model of 2026 — and after months of missed windows, a surprise V4 Lite appearance, and a steady stream of leaks, the full release is now looking imminent. Here is the current state of everything confirmed, everything rumored, and what developers should prepare for before the weights drop.


Current Status: V4 Lite Is Live, Full V4 Is Close

As of April 2026, here is where things stand:

| Milestone | Status | Date |
|---|---|---|
| V4 Lite appears on DeepSeek website | ✅ Confirmed | March 9, 2026 |
| First leaked benchmarks (HumanEval 90%, SWE-bench 80%+) | Unverified | Late February 2026 |
| Architecture papers published (Engram, MoE specs) | ✅ Confirmed | Dec 2025 – Jan 2026 |
| Full DeepSeek V4 official launch | 🔜 Imminent | Late April 2026 (expected) |
| Apache 2.0 weights release | Planned | With or shortly after launch |

The appearance of "V4 Lite" — a ~200B parameter variant — on DeepSeek's website on March 9 is the clearest signal that the V4 family is real and in staged rollout. Multiple credible reports point to a late April 2026 window for the full 1-trillion-parameter model. As of today (April 24), we are in that window.


Why It Has Taken This Long

The original speculation placed DeepSeek V4 in late January or early February 2026. Three consecutive windows passed without a release. Here is what likely caused the delays:

Training on non-Nvidia hardware. DeepSeek V4 was trained on Huawei Ascend 910B and Cambricon MLU chips rather than Nvidia H800s. This is partly by necessity — US export restrictions limit Chinese lab access to the most powerful Nvidia GPUs — but it is also a proof of concept for domestic Chinese AI silicon. Training a trillion-parameter model on hardware without the mature ecosystem of CUDA and decades of Nvidia tooling introduces novel engineering challenges.

Benchmark targets not met on first runs. The gap between V3's ~49% SWE-bench score and V4's claimed 80%+ is extraordinary for a single model generation. If early training runs did not hit internal performance targets, DeepSeek would have iterated rather than released.

Regulatory and geopolitical timing. Major AI releases from Chinese labs attract scrutiny. The timing of a trillion-parameter open-source model with capabilities rivaling GPT-5.4 and Claude Opus 4.6 is a decision that goes beyond pure technical readiness.


What We Know About the Architecture

The architecture is the most thoroughly documented aspect of DeepSeek V4, thanks to three papers published between December 2025 and January 2026.

Scale: ~1 trillion total parameters with approximately 37 billion active per token (Mixture-of-Experts). This is the same active parameter count as DeepSeek V3, which means inference costs remain manageable at trillion-parameter scale.

Engram Memory: DeepSeek's conditional memory system, designed to solve the retrieval degradation problem in long-context models. Standard attention at million-token scale achieves around 84% Needle-in-a-Haystack accuracy. Engram claims 97% — meaning the model reliably finds specific information buried in million-token contexts rather than just technically accepting long inputs.

Context Window: 1 million tokens. Practical only if retrieval quality holds (see Engram above).

Native Multimodal: Text, image, and video generation integrated during pre-training, not added as adapters after the fact.

Hardware: Huawei Ascend 910B + Cambricon MLU. No Nvidia.

License: Apache 2.0 — commercial use permitted, no copyleft, patent grant included.
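Those MoE numbers are the whole economic story: per token, only a small slice of the network runs. A back-of-envelope sketch in Python (the figures are the published and rumored specs above; the actual routing details are not public):

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Share of weights a Mixture-of-Experts model exercises per token."""
    return active_params_b / total_params_b

# Reported V4 specs: ~1T total parameters, ~37B active per token.
frac = active_fraction(1000.0, 37.0)
print(f"Active fraction per token: {frac:.1%}")  # 3.7%

# Forward-pass compute scales with ACTIVE parameters (~2 FLOPs per weight),
# which is why V4's per-token inference cost should match V3's: both models
# activate the same ~37B parameters per token.
print(f"~{2 * 37e9:.2e} FLOPs per generated token")
```

This is the arithmetic behind "inference costs remain manageable at trillion-parameter scale": compute tracks the active count, not the total.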


The Release Timeline (What Has and Has Not Happened)

| Date | Event |
|---|---|
| Late January 2026 | First V4 rumors on Chinese tech forums |
| Mid-February 2026 | First speculated release window passes |
| Late February 2026 | Lunar New Year window passes; API outage sparks speculation |
| Early March 2026 | Financial Times reports V4 release "imminent" |
| March 9, 2026 | V4 Lite (~200B params) appears on DeepSeek's website |
| Late April 2026 | Current expected window for full V4 |

The V4 Lite appearance is significant beyond confirming the architecture exists. It follows a pattern DeepSeek used with V3: release a smaller, more accessible variant first, then follow with the full model. If V4 Lite is the ~200B equivalent, the full 1T model is likely in final preparation.


What to Expect on Launch Day

Weights on Hugging Face. DeepSeek has released all previous models on Hugging Face. V4 is expected to follow the same pattern under the Apache 2.0 license.

API access. DeepSeek's API typically launches with or shortly after weight release. Based on current DeepSeek pricing trajectories, expect around $0.30/MTok for standard inference — a fraction of what GPT-5.5 ($5/$30) or Claude Sonnet 4.6 ($3/$15) cost.
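That pricing gap compounds quickly at volume. A quick cost sketch (the workload numbers are hypothetical; V4's flat $0.30/MTok is the rumored rate, and the GPT-5.5 and Sonnet figures are the input/output prices quoted above):

```python
def monthly_cost(mtok_in: float, mtok_out: float,
                 price_in: float, price_out: float) -> float:
    """Monthly API spend for a workload measured in millions of tokens."""
    return mtok_in * price_in + mtok_out * price_out

# Hypothetical workload: 500 MTok input, 100 MTok output per month.
traffic = (500, 100)
prices = {
    "DeepSeek V4 (rumored)": (0.30, 0.30),
    "GPT-5.5": (5.0, 30.0),
    "Claude Sonnet 4.6": (3.0, 15.0),
}
for name, (p_in, p_out) in prices.items():
    print(f"{name}: ${monthly_cost(*traffic, p_in, p_out):,.0f}/mo")
```

At this workload the rumored V4 rate comes out to $180/month against $5,500 for GPT-5.5, which is the order-of-magnitude difference driving self-hosting interest.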

Benchmark flood. Within hours of weight release, the community will run independent evaluations. The leaked internal benchmarks (90% HumanEval, 80%+ SWE-bench) will face scrutiny. Historical pattern: independent evaluations typically confirm 70–90% of internal claims.

Quantized builds. The community will produce INT8 and INT4 quantizations within days. The realistic consumer targets (2× RTX 4090 for INT8, 1× RTX 5090 for INT4) assume MoE expert offloading: the GPU holds only the ~37B active parameters while the remaining experts stream from CPU RAM, since the full weight set is far too large for any consumer card.
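A note of caution on those consumer-hardware targets: the raw weight math is unforgiving. A back-of-envelope sketch (parameter counts are the reported and rumored figures):

```python
def weight_footprint_gb(params_b: float, bits: int) -> float:
    """Approximate weight storage: parameter count in billions at a given precision."""
    return params_b * 1e9 * bits / 8 / 1e9

for name, params_b in [("V4 Lite (~200B)", 200), ("Full V4 (~1T)", 1000)]:
    for bits in (8, 4):
        print(f"{name} @ INT{bits}: ~{weight_footprint_gb(params_b, bits):,.0f} GB")

# Even INT4 weights for the full model run ~500 GB, far beyond any single card.
# What can fit on consumer GPUs is the ACTIVE set: ~37B params at INT4 is
# under 20 GB, with the remaining experts streamed from CPU RAM or NVMe.
print(f"Active set @ INT4: ~{weight_footprint_gb(37, 4):.1f} GB")
```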


Why This Release Window Matters More Than Usual

The timing of DeepSeek V4's release lands in the middle of the most competitive frontier model period ever. GPT-5.5 just launched (April 23). Claude Mythos (93.9% SWE-bench) was announced in April. Gemini 3.1 Pro is leading on GPQA Diamond at 94.3%.

An open-source model with 80%+ SWE-bench and Apache 2.0 weights changes the structure of this competition. It doesn't compete with Claude Mythos at the very top of the capability curve — but it offers a self-hostable option that costs hardware rather than API fees, with no data-sharing requirements and full fine-tuning freedom.

For developers at startups and mid-sized companies, that trade-off is often worth more than a few percentage points on GPQA Diamond.


DeepSeek V4 and AnyCap: Three Ways to Use It

When DeepSeek V4 launches, you will have two primary ways to access it:

Direct API (deepseek.com): Lowest cost, single-model access, straightforward for DeepSeek-specific deployments.

Self-hosted: Maximum control, no per-token costs after hardware, best for sensitive data or high-volume workloads. Requires GPU infrastructure.

Through AnyCap: When V4 is integrated into the AnyCap platform, you get a third option — a unified API that can route between DeepSeek V4, GPT-5.5, Claude, and Gemini based on task type, cost, and latency requirements. AnyCap also adds media generation capabilities (image, video) that DeepSeek V4's API won't surface directly even with its native multimodal architecture.

| Factor | DeepSeek V4 API | Self-Hosted | AnyCap |
|---|---|---|---|
| Cost at scale | ~$0.30/MTok | Hardware only | Depends on routing |
| Model variety | DeepSeek only | DeepSeek only | Multi-model |
| Image/video generation | API-dependent | Via separate pipeline | ✅ Built-in |
| Data privacy | Sent to DeepSeek API | ✅ On-premise | Configurable |
| Infrastructure overhead | None | High | None |
| Fine-tuning | Not via API | ✅ Full access | Via self-hosted |
```bash
# When DeepSeek V4 is available on AnyCap
anycap run \
  --model deepseek-v4 \
  --task "Analyze this codebase and generate a refactoring plan"

# Or pair reasoning with media generation in one workflow
anycap image generate \
  --prompt "Architecture diagram for the refactoring plan" \
  --model nano-banana-2 \
  -o refactor-diagram.png
```

The combination is especially relevant because DeepSeek V4's native video and image generation capabilities will likely take time to surface through official APIs — the architecture supports multimodal, but API access follows more slowly than raw model capability.


How to Prepare Right Now

Set up your eval suite. Don't wait for launch day to think about what you're testing. Define the tasks that matter for your application now, so you can evaluate V4 weights within hours of release.
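A minimal harness is enough to start with. The shape below works with any client that maps a prompt to a reply, including OpenAI-compatible clients pointed at DeepSeek's API; the model callable and test cases here are stand-ins you would replace with your own:

```python
def run_suite(model_fn, cases):
    """Run {prompt, check} cases against any callable mapping prompt -> reply.

    `model_fn` is a stand-in for your real client call; `check` is a
    predicate on the raw reply text.
    """
    results = []
    for case in cases:
        reply = model_fn(case["prompt"])
        results.append({"prompt": case["prompt"], "passed": case["check"](reply)})
    print(f'{sum(r["passed"] for r in results)}/{len(results)} passed')
    return results

# Toy cases -- replace with the tasks that actually matter for your application.
cases = [
    {"prompt": "Reply with exactly: PONG", "check": lambda r: "PONG" in r},
    {"prompt": "What is 17 * 19?", "check": lambda r: "323" in r},
]
# Hypothetical stand-in model so the harness runs end-to-end today.
fake_model = lambda p: "PONG" if "PONG" in p else "17 * 19 = 323"
results = run_suite(fake_model, cases)
```

Swap `fake_model` for a real API call on launch day and the same suite gives you a pass rate within minutes of the weights dropping.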

Assess self-hosting feasibility. If you have GPU infrastructure (or can spin up 2× H100s in a cloud environment), V4's Apache 2.0 license makes self-hosting legally clean. Map out the compute before you need it.

Watch DeepSeek's official channels. Previous releases have had limited advance warning. The announcement pattern is typically short-notice — Hugging Face, the DeepSeek website, and their official accounts tend to post simultaneously.

Don't reorganize your stack preemptively. The benchmark claims are unverified until independent evaluation. The model may be extraordinary or it may fall short on specific tasks. Evaluate first, migrate second.


What Happens After DeepSeek V4

If V4 delivers on its specs, the next phase is fine-tuning. Apache 2.0 means the developer community will immediately start producing specialized variants: V4-code, V4-instruct, V4-math. This downstream ecosystem is part of what makes an open-source release different from a closed-source API — the model continues improving through community iteration.

For developers, this means the value of integrating DeepSeek V4 extends beyond the base release. A purpose-fine-tuned V4-code variant trained on your domain may outperform the base model on your specific tasks in ways that closed-source alternatives cannot match.


Related reading: DeepSeek V4 Complete Developer Guide · Compare AI Inference Platforms · Image Generation via AnyCap