Tag

BenchLM

3 articles

Why coding benchmarks are finally telling the truth

Research/May 13

BenchLM’s coding leaderboard says LiveCodeBench and SWE-bench Pro are the only signals that still matter.

Kimi K2.6 Scores: BenchLM’s 2026 Breakdown

Model Releases/May 4

Kimi K2.6 ranks #12 overall on BenchLM, with strong coding and agentic scores, plus a 256K context window and open weights.

GPT-5.4 Scores 97.6 in Knowledge Benchmarks

Model Releases/Apr 13

GPT-5.4 tops knowledge benchmarks with 97.6, ranks #2 overall on BenchLM, and offers a 1.05M-token context window.