OraCore Editors · 8 min read

Cursor’s new coding model rides on Kimi K2.5

Cursor’s Composer 2 is cheaper than Claude Opus 4.6, but it was built on Kimi K2.5 and that changes the story.

Cursor says its new coding model, Composer 2, costs $0.50 per million input tokens and $2.50 per million output tokens. That is far below Anthropic’s Claude Opus 4.6 at $5.00 and $25.00, and below OpenAI’s GPT-5.4 at $2.50 and $15.00. The price gap is real, but the bigger story is where Cursor started: on top of Moonshot AI’s open-source Kimi K2.5.

That detail matters because Cursor initially presented Composer 2 as a homegrown model built from scratch. Later, Cursor employee Lee Robinson said roughly a quarter of the pretraining came from the base model, with Cursor handling the rest through fine-tuning and continued training. In other words, Composer 2 is a specialized derivative, not a clean-sheet model.

Cursor’s own numbers make the model look strong. On its internal CursorBench, Composer 2 scores 61.3, up from 44.2 for Composer 1.5. That puts it ahead of Claude Opus 4.6 at 58.2 and just behind GPT-5.4 Thinking at 63.9. For a company whose product lives and dies on code quality, that is a meaningful jump.

What Cursor actually shipped

Composer 2 is a code-only model. Cursor co-founder Aman Sanger told Bloomberg the model was trained exclusively on code data, which explains why it is aimed squarely at software work instead of general chat. That narrow scope also helps keep the model smaller and cheaper to run.

The model ships in Cursor and in the early alpha of the new Glass interface. Cursor also offers a faster variant, Composer 2 Fast, which the company says has the same intelligence but lower latency. The standard version is priced at $0.50 input and $2.50 output per million tokens, while Composer 2 Fast costs $1.50 and $7.50.

For a coding assistant, those numbers are the whole business. Cursor is not trying to win a general-purpose chatbot race. It is trying to make agentic coding cheap enough that customers will use it often, and often enough that the product can hold onto margin.

  • Composer 2: $0.50 input / $2.50 output per million tokens
  • Composer 2 Fast: $1.50 input / $7.50 output per million tokens
  • Claude Opus 4.6: $5.00 input / $25.00 output per million tokens
  • GPT-5.4: $2.50 input / $15.00 output per million tokens
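
To make that gap concrete, here is a minimal sketch of what the list prices imply for a single agentic coding session. The token volumes (400K input, 60K output) are assumptions chosen only for illustration; Cursor has not published typical usage figures.

  # Per-session cost under the list prices above, per million tokens (input, output).
  # The session token volumes are illustrative assumptions, not published usage data.

  PRICES = {
      "Composer 2":      (0.50, 2.50),
      "Composer 2 Fast": (1.50, 7.50),
      "Claude Opus 4.6": (5.00, 25.00),
      "GPT-5.4":         (2.50, 15.00),
  }

  # Hypothetical agentic session: heavy repo context in, moderate code out.
  INPUT_TOKENS = 400_000
  OUTPUT_TOKENS = 60_000

  for model, (in_price, out_price) in PRICES.items():
      cost = (INPUT_TOKENS * in_price + OUTPUT_TOKENS * out_price) / 1_000_000
      print(f"{model:<16} ${cost:.2f} per session")

Under those assumed volumes, a session that costs roughly $3.50 on Claude Opus 4.6 comes in around $0.35 on Composer 2. The exact numbers matter less than the order-of-magnitude gap Cursor is pricing against.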

The Kimi K2.5 part Cursor did not lead with

The new wrinkle arrived after Cursor’s launch. Lee Robinson said the model was built on Kimi K2.5, with about a quarter of the pretraining coming from that base. Cursor then added fine-tuning and continued training, which changed the benchmark results enough that Composer 2 no longer matches Kimi K2.5 directly.

Cursor did not disclose that base model in its original announcement, and that omission drew criticism. Cursor co-founder Aman Sanger later admitted the miss, saying: “It was a miss to not mention the Kimi base in our blog from the start. We'll fix that for the next model.”

That quote matters because it reframes the launch. The issue is not that Cursor used an open model. Fine-tuning an open base model is normal and often smart. The issue is that Cursor’s marketing made the model sound more original than it was. In AI, that distinction affects trust fast.

There is also a practical angle here. Cursor’s model runs through Fireworks, which handles inference under a commercial license. So the company is not just using open-source weights; it is also relying on a partner for the serving layer. That is a perfectly valid stack, but it is a stack, not a secret lab breakthrough.

Why Cursor needs its own model

Cursor is in a strange spot. It sells an AI coding editor while depending on the same model vendors it competes against. That means Cursor has to buy from Anthropic and OpenAI even as those companies sell directly to the same developers Cursor wants as customers.

Bloomberg reported that Cursor has more than one million daily users and around 50,000 enterprise customers. It is also reportedly talking about a new funding round at a valuation near $50 billion. That is a lot of growth, but it does not erase the structural problem: if your product depends on someone else’s model, your pricing power is limited.

Anthropic has been especially aggressive in coding with Claude Code. Cursor reportedly estimates that a single $200-per-month Claude Code subscription can generate around $5,000 in compute costs. That is the kind of math that makes platform dependence painful.

  • CursorBench: Composer 2 scores 61.3, up from 44.2 for Composer 1.5
  • Terminal Bench 2.0: Composer 2 scores 61.7, compared with 47.9 for Composer 1.5
  • SWE-bench Multilingual: Composer 2 scores 73.7, up from 65.9 for Composer 1.5
  • Claude Opus 4.6: 58.0 on Claude Code’s Terminal Bench setup, 65.4 in an optimized Anthropic result, 77.8 on SWE-bench Multilingual

Those benchmark numbers are useful, but they need context. Cursor notes that Terminal Bench results depend on the agent, harness, and settings, so direct comparisons can get messy. Even so, the trend is clear: Composer 2 is good enough to matter, and cheap enough to put pressure on rivals.

What this says about AI coding right now

Cursor’s move is a reminder that the most practical AI products may come from companies that are excellent at product design, workflow, and distribution rather than giant pretraining runs. If a team can take an open base model, train it hard on code, and ship a model that gets close to the best proprietary systems, that says something about where value is actually created.

It also says something uncomfortable about model branding. If a startup can get close to frontier performance by starting from a strong open model, then the gap between “our model” and “our fine-tuned fork” gets thinner than the marketing usually suggests. That does not make the work less impressive. It makes the honesty more important.

Cursor would probably have been better off owning the open-source angle from day one. A clear message like “we built a specialized coding model on top of Kimi K2.5 and tuned it for agentic programming” would have been easier to defend than a delayed correction. It would also have put pressure on the big labs in a smarter way: if a smaller company can get this far with targeted training, how much extra value are buyers really getting from a proprietary base model?

For now, Composer 2 looks like a strong product move and a messy communications story. The product answer is easy: cheaper coding models win developers. The business answer is harder: Cursor still needs to prove it can keep control of its own stack without depending on the same companies it is trying to outcompete.

Cursor’s next test is honesty, not just benchmarks

The interesting question is not whether Cursor can ship another good coding model. It already proved it can improve benchmark scores with targeted training and a cheaper token bill. The real test is whether it can be explicit about what its models are built on, because developers care about provenance almost as much as performance.

If Cursor’s next release clearly labels the base model, the training recipe, and the serving partner, it will look more credible than it did this time. If it keeps the details vague, the company risks turning a technical win into a trust problem. In AI coding, that is a costly trade.

My bet: the next major Cursor model will be marketed as an open-base, code-specialized system from the start. If that happens, this episode becomes a footnote. If it does not, every benchmark win will come with the same question attached: what exactly did Cursor build, and what did it inherit?