Tag

SWE-Bench Pro

4 articles

MiniMax M3 Proves Open-Weight Can Still Win on Coding

MiniMax M3 makes a strong case that open-weight models can still lead on coding, context, and price.

Kimi K2.6 is the open-weight coding model that matches GPT-5.5 on SWE-Bench Pro at far lower cost.

BenchLM’s coding leaderboard says LiveCodeBench and SWE-Bench Pro are the only signals that still matter.

Marginlab’s daily tracker watches Claude Code Opus 4.6 on 50 SWE-Bench Pro tasks and flags statistically significant drops.