DeepSeek V4 Pro Gets a Permanent 75% Price Cut

May 22, 2026 — DeepSeek has permanently slashed the API pricing for its flagship V4 Pro model to a quarter of the original price. Cache-hit input pricing across the entire lineup also dropped to one-tenth.

New Pricing

Per the official DeepSeek API docs, here's the updated V4 Pro pricing (per million tokens):

Item	Old Price	New Price (USD)	New Price (CNY)
Input (cache hit)	$0.0145	$0.003625	≈¥0.025
Input (cache miss)	$1.74	$0.435	≈¥2.96
Output	$3.48	$0.87	≈¥5.92

For comparison, the lighter V4 Flash model costs: cache-hit input $0.0028 (≈¥0.02), cache-miss input $0.14 (≈¥0.95), output $0.28 (≈¥1.91).

What's Behind the Cut

DeepSeek V4 Pro launched in April 2026. It's a Mixture-of-Experts model with 1.6 trillion total parameters and 49 billion active per task — currently the largest open-weight model available, surpassing Moonshot AI's Kimi K2.6 and MiniMax's M1.

The price cut isn't a temporary promotion. The official pricing page now shows the 75%-off rates as the standard price, with the old prices crossed out for reference.

DeepSeek also reduced cache-hit input pricing to one-tenth across all models. Cache hits kick in when you send requests similar to previous ones — the system returns cached results at a fraction of the compute cost. For production developers hitting the API frequently, this is the most direct cost benefit.

Market Context

This is an aggressive move in an already heated AI price war. GPT-5.5, Claude Opus 4.7, and Gemini 3.1 Pro are all competing in the several-dollars-per-million-tokens range. V4 Pro goes under a dollar.

The Hacker News thread was active. One developer called V4 Pro "really really competent" for everyday coding work — not quite at Opus level, but close enough that the price difference makes it a compelling choice. Another user crunched the numbers: at V4 Pro's new pricing, you'd need to push roughly 1 billion output tokens per day to justify self-hosting on rented GPUs. Most people won't come close.

Ecosystem

V4 Pro works natively with Claude Code, OpenCode, and OpenClaw. Switch base_url in your OpenAI-compatible client and you're good to go. Anthropic API format is also supported.

Specs: 1M token context window, 384K max output tokens, tool calling, JSON output, and thinking/reasoning mode.

Why Now?

The timing is worth noting. Days earlier, White House OSTP Director Michael Kratsios accused foreign entities (pointing at China) of running "industrial-scale" distillation campaigns against US AI models. DeepSeek had previously been called out by both Anthropic and OpenAI for the same thing.

DeepSeek's answer? Cut prices. More direct than a press release — and arguably more effective.

As one HN commenter put it: "A few days ago we were hearing about how the 'free lunch is over,' now we're seeing discounts and increased usage limits."

Bottom Line

For developers, it means this: a model that competes with frontier models is now available at less than a third of the cost of its closest rivals. If you've been weighing the options, the math just got a lot simpler.