Cursor Composer 2.5: Built on Kimi K2.5, Performance Rivals Frontier Models

Cursor Composer 2.5: Challenging Frontier Models at 1/10 the Cost

On May 18, Cursor released Composer 2.5, the next-generation model powering its AI coding assistant. The upgrade arrives soon after Composer 2 and brings significant improvements.

Cursor partnered with SpaceXAI (xAI's new brand) to train on Colossus 2, a cluster with a million H100-equivalents. Composer 2.5 itself, however, is based on Moonshot's open-source Kimi K2.5 checkpoint rather than trained from scratch.

Model performance comparison

Cursor claims Composer 2.5 delivers near-frontier performance at 1/10 the cost. Here's how it stacks up (more stars mean higher cost in the cost column and stronger performance in the other two):

Benchmark comparison

Model	Relative Cost	Coding	Long Tasks
Composer 2.5	★	★★★★	★★★★
Claude Opus 4.7	★★★★★	★★★★★	★★★★★
GPT-4o	★★★	★★★★	★★★
Kimi K2.5 (base)	★★	★★★★	★★★

But community feedback tells a different story. Hacker News (HN) users pointed out that Composer 2 came with similar benchmark claims, and that its real-world performance fell short.

Pricing

Composer 2.5: Included in Cursor Pro ($20/month)
Claude Opus 4.7: ~$15/$75 per 1M tokens (API)
Gemini 3.5 Flash: $0.75/$4.50 per 1M tokens
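To make the gap between per-token pricing and a flat subscription concrete, here is a small sketch of the arithmetic. It assumes the two figures in each price are input and output rates per 1M tokens (the list above does not label them), and the monthly workload is a made-up number chosen only for illustration. It also ignores any usage limits on the subscription, which are not described here.

```python
# Prices copied from the list above, read as (input, output) USD per 1M tokens.
# That reading is an assumption; the two figures are not labeled in the list.
PRICES_PER_MILLION = {
    "Claude Opus 4.7": (15.00, 75.00),
    "Gemini 3.5 Flash": (0.75, 4.50),
}

CURSOR_PRO_MONTHLY_USD = 20.00  # flat subscription price from the list above


def api_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload under per-token API pricing."""
    in_price, out_price = PRICES_PER_MILLION[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Hypothetical monthly workload, for illustration only.
MONTHLY_INPUT_TOKENS = 20_000_000
MONTHLY_OUTPUT_TOKENS = 2_000_000

if __name__ == "__main__":
    for model in PRICES_PER_MILLION:
        cost = api_cost(model, MONTHLY_INPUT_TOKENS, MONTHLY_OUTPUT_TOKENS)
        print(f"{model}: ${cost:,.2f} per month")
    print(f"Cursor Pro (flat): ${CURSOR_PRO_MONTHLY_USD:,.2f} per month")
```

Under these assumed volumes the per-token bill scales with usage while the subscription stays fixed, which is why the comparison depends heavily on how many tokens a developer actually consumes.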

Technical highlights

Targeted RL with textual feedback. Rather than scoring entire rollouts, Composer 2.5's training inserts hints at the exact point where the model went wrong. This gives a localized training signal for specific behaviors.
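The idea of attaching feedback to a specific point in a rollout can be sketched in a few lines. This is a toy illustration, not Cursor's implementation: the `Step` structure, the `critic` callback, and the choice to place the hint right after the flagged step and drop the later steps are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    """One action in a rollout (hypothetical structure)."""
    text: str


def insert_hint(
    rollout: List[Step],
    critic: Callable[[Step], Optional[str]],
) -> List[Step]:
    """Return the rollout up to the first flagged step, followed by a hint.

    `critic` stands in for whatever produces textual feedback. It returns a
    hint string for a bad step, or None when the step is fine.
    """
    for index, step in enumerate(rollout):
        hint = critic(step)
        if hint is not None:
            # Keep everything up to and including the mistake, then attach
            # the hint so the signal is tied to that exact position.
            return rollout[: index + 1] + [Step(f"[hint] {hint}")]
    return list(rollout)  # nothing flagged, rollout returned unchanged


def toy_critic(step: Step) -> Optional[str]:
    """Toy rule for illustration: flag destructive shell commands."""
    if "rm -rf" in step.text:
        return "Avoid destructive shell commands; inspect the directory first."
    return None


if __name__ == "__main__":
    rollout = [Step("read README.md"), Step("run: rm -rf build/"), Step("run tests")]
    for step in insert_hint(rollout, toy_critic):
        print(step.text)
```

Compared with a single score for the whole rollout, the hint here is tied to one step, so the steps before it are not penalized for a mistake they did not cause.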

25x more synthetic tasks. As the model improves, existing training problems become too easy. Cursor auto-generates harder tasks through feature deletion, code refactoring, and other techniques.
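Feature deletion is easy to picture with a small example: remove a working function from a codebase and ask the model to restore it so the rest of the code runs again. The sketch below does this for a single Python source string using the standard `ast` module. The function name `delete_function` and the sample source are hypothetical, and nothing here reflects how Cursor's actual pipeline works.

```python
import ast
from typing import Tuple


def delete_function(source: str, name: str) -> Tuple[str, str]:
    """Split `source` into (code without function `name`, the removed function).

    The first string is the task's starting point and the second is a
    reference solution. Only top-level functions are considered.
    Requires Python 3.9+ for ast.unparse.
    """
    tree = ast.parse(source)
    kept, removed = [], None
    for node in tree.body:
        if isinstance(node, ast.FunctionDef) and node.name == name:
            removed = node
        else:
            kept.append(node)
    if removed is None:
        raise ValueError(f"no top-level function named {name!r}")
    tree.body = kept
    return ast.unparse(tree), ast.unparse(removed)


# Hypothetical sample module: `mean` depends on `total`.
SOURCE = '''
def total(values):
    return sum(values)

def mean(values):
    return total(values) / len(values)
'''

if __name__ == "__main__":
    task, reference = delete_function(SOURCE, "total")
    print("Task starting point:")
    print(task)
    print("Reference solution:")
    print(reference)
```

A task built this way comes with a natural check: the original tests for `mean` fail until `total` is re-implemented, so the outcome can be verified automatically.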

Behavioral optimization. Beyond intelligence, Cursor improved communication style and effort calibration, dimensions that benchmarks do not capture but that matter for daily use.

Next step: training from scratch

Cursor revealed that it is training a significantly larger model from scratch on Colossus 2 with 10x more compute. This signals that the company is not content with fine-tuning open-source checkpoints and wants its own flagship model.

Reception

On HN the announcement drew 277 points and 39 comments. Feedback was split:

Supporters see it as the best value option for daily high-frequency use
Skeptics were burned by Composer 2's "great benchmarks, mediocre reality" and remain cautious
Some questioned the comparison with Opus 4.7 and asked why Sonnet was not used instead

Overall, Composer 2.5 strikes a solid balance between cost and capability. For Cursor users, this upgrade is worth trying. But don't expect it to fully replace Claude Opus or GPT-4o for core development; that may have to wait for Cursor's from-scratch model.
