GPT-5.5 scores 62.5 on Every’s engineer test
Every says GPT-5.5 beat Opus 4.7 on its Senior Engineer Benchmark, scoring 62.5 on its best run, and positions it as OpenAI’s model for professional work.

Every says GPT-5.5 is OpenAI’s fastest new work model and tops its Senior Engineer Benchmark.
OpenAI released GPT-5.5 on April 23, 2026, and Every says the model hit 62.5 on its best run on the publication’s Senior Engineer Benchmark. That put it well ahead of Opus 4.7 in the low 30s, though still below human senior engineers, who score in the high 80s and low 90s.
| Item | Value |
|---|---|
| Release date | April 23, 2026 |
| Best Senior Engineer Benchmark score | 62.5 |
| Opus 4.7 comparison score | Low 30s |
| Human senior engineer range | High 80s to low 90s |
| Context window | 1 million tokens |
| Input pricing | $5 per 1M tokens |
| Output pricing | $30 per 1M tokens |
| GPT-5.5 Pro output pricing | $180 per 1M tokens |
What changed
Every’s review frames GPT-5.5 as a new pre-train, not just a better wrapper around the same base model. The result, according to the piece, is a model that feels faster, steadier, and easier to work with than Anthropic’s Opus 4.7 for many professional tasks.

The article says GPT-5.5 launches first in ChatGPT and Codex, with API access coming later after more safety and security checks. It also keeps a 1 million-token context window, supports prompt caching, and defaults to medium reasoning instead of none.
- Best benchmark run: 62.5 on Every’s Senior Engineer Benchmark
- Opus 4.7: low 30s at a similar reasoning level
- Human senior engineers: high 80s to low 90s
- API pricing: $5 in, $30 out per 1 million tokens
- GPT-5.5 Pro pricing: $30 in, $180 out per 1 million tokens
- Launch surface: ChatGPT and Codex first, API later
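To make the per-token prices above concrete, here is a minimal sketch of the arithmetic for a single request. The prices come from the figures reported in this article; the dictionary keys are illustrative labels rather than confirmed API model names, the token counts are a made-up workload, and prompt-caching discounts are ignored because the article gives no rates for them.

```python
# USD per 1 million tokens, as reported in the article.
# The keys are illustrative labels, not confirmed API model identifiers.
PRICES = {
    "gpt-5.5": {"input": 5.00, "output": 30.00},
    "gpt-5.5-pro": {"input": 30.00, "output": 180.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request, ignoring prompt caching."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: a 200,000-token prompt and a 10,000-token reply.
standard = request_cost("gpt-5.5", 200_000, 10_000)
pro = request_cost("gpt-5.5-pro", 200_000, 10_000)

print(f"GPT-5.5:     ${standard:.2f}")
print(f"GPT-5.5 Pro: ${pro:.2f}")
```

Under these assumptions the same request costs six times as much on the Pro tier, since both its input and output prices are six times higher.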
Every also says GPT-5.5 is better at sustained engineering, writing, dashboards, curricula, run-of-show docs, and transcript-based work. But it still trails Opus 4.7 on some product and design tasks, plus Ruby, PowerPoint, and spatial composition.
Why it matters
The practical shift is less about a single benchmark win and more about where OpenAI wants to compete. Every says GPT-5.5 is OpenAI’s clearest bid to reclaim coding and professional work, areas where Anthropic has been the default for many teams.

For developers, the pitch is simple: fewer retries, more planning, and a model you can keep in the loop on long tasks. If that holds up in production, GPT-5.5 could become the cheaper-to-finish option even when its token price is higher than GPT-5.4.
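The "cheaper-to-finish" argument can be sketched with a little arithmetic. If each attempt succeeds independently with some probability, the expected number of attempts is one over that probability, so the expected cost of a finished task is the per-attempt cost divided by the success rate. Every number below is hypothetical and chosen only to illustrate the reasoning; the article reports no retry rates and no GPT-5.4 prices.

```python
def expected_cost_to_finish(cost_per_attempt: float, success_rate: float) -> float:
    """Expected spend until the first success, assuming independent attempts.

    With success probability p per attempt, the expected number of attempts
    is 1 / p (geometric distribution), so expected cost is cost / p.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in the interval (0, 1]")
    return cost_per_attempt / success_rate


# Hypothetical figures, not taken from the article:
# a pricier model that usually finishes on the first try...
pricier_model = expected_cost_to_finish(cost_per_attempt=1.30, success_rate=0.80)
# ...versus a cheaper model that needs two attempts on average.
cheaper_model = expected_cost_to_finish(cost_per_attempt=1.00, success_rate=0.50)

print(f"Pricier per attempt: ${pricier_model:.3f} per finished task")
print(f"Cheaper per attempt: ${cheaper_model:.3f} per finished task")
```

In this made-up case the model with the higher per-attempt price is still cheaper per finished task, which is the trade-off the pitch depends on. Whether real workloads behave this way depends on success rates that only production use can measure.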
The bigger question is whether speed and reliability will outweigh Opus 4.7’s edge in planning, product taste, and presentation work. For now, Every’s take is that GPT-5.5 is the safer daily driver for code and knowledge work, while Opus still has the sharper creative finish.
The takeaway: GPT-5.5 looks like OpenAI’s strongest move yet to make ChatGPT a tool for professional work, but the real test is whether teams trust it on unfinished, messy jobs.