[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-gpt-5-5-senior-engineer-benchmark-every-en":3,"article-related-gpt-5-5-senior-engineer-benchmark-every-en":30,"series-model-release-d1a3f7e9-4415-4158-afbc-1327e7148fb3":81},{"id":4,"slug":5,"title":6,"content":7,"summary":8,"source":9,"source_url":10,"author":11,"image_url":12,"cover_image":12,"category":13,"language":14,"translated_content":11,"related_article_id":15,"keywords":16,"key_takeaways":22,"views":26,"created_at":27,"published_at":28,"topic_cluster_id":29},"d1a3f7e9-4415-4158-afbc-1327e7148fb3","gpt-5-5-senior-engineer-benchmark-every-en","GPT-5.5 scores 62.5 on Every’s engineer test","\u003Cp data-speakable=\"summary\">Every says GPT-5.5 is \u003Ca href=\"\u002Ftag\u002Fopenai\">OpenAI\u003C\u002Fa>’s fastest new work model and tops its Senior Engineer Benchmark.\u003C\u002Fp>\u003Cp>OpenAI released GPT-5.5 on April 23, 2026, and Every says the model hit 62.5 on its best run on the publication’s Senior Engineer Benchmark. 
That put it well ahead of \u003Ca href=\"\u002Ftag\u002Fopus-47\">Opus 4.7\u003C\u002Fa> in the low 30s, though still below human senior engineers, who score in the high 80s and low 90s.\u003C\u002Fp>\u003Ctable>\u003Cthead>\u003Ctr>\u003Cth>Item\u003C\u002Fth>\u003Cth>Value\u003C\u002Fth>\u003C\u002Ftr>\u003C\u002Fthead>\u003Ctbody>\u003Ctr>\u003Ctd>Release date\u003C\u002Ftd>\u003Ctd>April 23, 2026\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Best Senior Engineer Benchmark score\u003C\u002Ftd>\u003Ctd>62.5\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Opus 4.7 comparison score\u003C\u002Ftd>\u003Ctd>Low 30s\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Human senior engineer range\u003C\u002Ftd>\u003Ctd>High 80s to low 90s\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Context window\u003C\u002Ftd>\u003Ctd>1 million tokens\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Input pricing\u003C\u002Ftd>\u003Ctd>$5 per 1M tokens\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Output pricing\u003C\u002Ftd>\u003Ctd>$30 per 1M tokens\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>GPT-5.5 Pro output pricing\u003C\u002Ftd>\u003Ctd>$180 per 1M tokens\u003C\u002Ftd>\u003C\u002Ftr>\u003C\u002Ftbody>\u003C\u002Ftable>\u003Ch2>What changed\u003C\u002Fh2>\u003Cp>Every’s review frames GPT-5.5 as a new pre-train, not just a better wrapper around the same base model. 
The result, according to the piece, is a model that feels faster, steadier, and easier to work with than \u003Ca href=\"https:\u002F\u002Fwww.anthropic.com\u002F\" target=\"_blank\" rel=\"noopener\">Anthropic\u003C\u002Fa>’s Opus 4.7 for many professional tasks.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779538549724-ut61.png\" alt=\"GPT-5.5 scores 62.5 on Every’s engineer test\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>The article says GPT-5.5 launches first in \u003Ca href=\"https:\u002F\u002Fopenai.com\u002Fchatgpt\u002F\" target=\"_blank\" rel=\"noopener\">ChatGPT\u003C\u002Fa> and \u003Ca href=\"\u002Ftag\u002Fcodex\">Codex\u003C\u002Fa>, with API access coming later after more safety and security checks. It also keeps a 1 million-token context window, supports prompt caching, and defaults to medium reasoning instead of none.\u003C\u002Fp>\u003Cul>\u003Cli>Best benchmark run: 62.5 on Every’s Senior Engineer Benchmark\u003C\u002Fli>\u003Cli>Opus 4.7: low 30s at a similar reasoning level\u003C\u002Fli>\u003Cli>Human senior engineers: high 80s to low 90s\u003C\u002Fli>\u003Cli>API pricing: $5 in, $30 out per 1 million tokens\u003C\u002Fli>\u003Cli>GPT-5.5 Pro pricing: $30 in, $180 out per 1 million tokens\u003C\u002Fli>\u003Cli>Launch surface: ChatGPT and Codex first, API later\u003C\u002Fli>\u003C\u002Ful>\u003Cp>Every also says GPT-5.5 is better at sustained engineering, writing, dashboards, curricula, run-of-show docs, and transcript-based work. But it still trails Opus 4.7 on some product and design tasks, plus Ruby, PowerPoint, and spatial composition.\u003C\u002Fp>\u003Ch2>Why it matters\u003C\u002Fh2>\u003Cp>The practical shift is less about a single benchmark win and more about where OpenAI wants to compete. 
Every says GPT-5.5 is OpenAI’s clearest bid to reclaim coding and professional work, areas where \u003Ca href=\"\u002Ftag\u002Fanthropic\">Anthropic\u003C\u002Fa> has been the default for many teams.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779538548449-e06u.png\" alt=\"GPT-5.5 scores 62.5 on Every’s engineer test\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>For developers, the pitch is simple: fewer retries, more planning, and a model you can keep in the loop on long tasks. If that holds up in production, GPT-5.5 could become the cheaper-to-finish option even though its token price is higher than GPT-5.4’s.\u003C\u002Fp>\u003Cp>The bigger question is whether speed and reliability will outweigh Opus 4.7’s edge in planning, product taste, and presentation work. For now, Every’s take is that GPT-5.5 is the safer daily driver for code and knowledge work, while Opus still has the sharper creative finish.\u003C\u002Fp>\u003Cp>The takeaway: GPT-5.5 looks like OpenAI’s strongest move yet to turn \u003Ca href=\"\u002Ftag\u002Fchatgpt\">ChatGPT\u003C\u002Fa> into a work model, but the real test is whether teams trust it on unfinished, messy jobs.\u003C\u002Fp>","Every says GPT-5.5 beat Opus 4.7 on its Senior Engineer Benchmark, scoring 62.5 on its best run and landing as OpenAI’s work model.","every.to","https:\u002F\u002Fevery.to\u002Fvibe-check\u002Fgpt-5-5",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779538549724-ut61.png","model-release","en","e461ae3e-ed3f-4109-910c-8ebac13936bd",[17,18,19,20,21],"GPT-5.5","OpenAI","benchmark","coding","Anthropic",[23,24,25],"GPT-5.5 scored 62.5 on Every’s Senior Engineer Benchmark, ahead of Opus 4.7’s low-30s result.","OpenAI is positioning GPT-5.5 as a work model for coding and 
knowledge tasks, with ChatGPT and Codex first.","The model keeps a 1 million-token context window and launches with pricing above GPT-5.4.",1,"2026-05-23T12:15:27.068447+00:00","2026-05-23T12:15:27.059+00:00","1bae1133-d241-4581-9332-fbf39690c319",{"tags":31,"relatedLang":40,"relatedPosts":44},[32,33,35,37,38],{"name":20,"slug":20},{"name":18,"slug":34},"openai",{"name":17,"slug":36},"gpt-55",{"name":19,"slug":19},{"name":21,"slug":39},"anthropic",{"id":15,"slug":41,"title":42,"language":43},"gpt-5-5-senior-engineer-benchmark-every-en-zh","GPT-5.5 在工程測試拿 62.5 分","zh",[45,51,57,63,69,75],{"id":46,"slug":47,"title":48,"cover_image":49,"image_url":49,"created_at":50,"category":13},"58aa41ca-2c5f-44c6-ab07-2002473e95b1","gemini-1-5-pro-002-flash-002-2-0-flash-update-en","Gemini 1.5 Pro-002, Flash-002 and 2.0 Flash update Google AI","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780999383257-jccn.png","2026-06-09T10:02:28.362637+00:00",{"id":52,"slug":53,"title":54,"cover_image":55,"image_url":55,"created_at":56,"category":13},"435fc551-a461-444a-bf95-dbf5685cfac0","minimax-m3-open-weight-coding-win-en","MiniMax M3 Proves Open-Weight Can Still Win on Coding","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780968781159-odhi.png","2026-06-09T01:32:31.256895+00:00",{"id":58,"slug":59,"title":60,"cover_image":61,"image_url":61,"created_at":62,"category":13},"12af5a0d-1bbf-4a50-a391-b53f8003f234","gemini-35-flash-pricing-benchmarks-en","Gemini 3.5 Flash Pricing, Context, 
Benchmarks","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780840981235-e7hm.png","2026-06-07T14:02:30.280485+00:00",{"id":64,"slug":65,"title":66,"cover_image":67,"image_url":67,"created_at":68,"category":13},"0e767e9d-5d17-4cd0-b6ee-0328f89eb49b","gemma-4-12b-specs-benchmarks-run-locally-en","Gemma 4 12B: Specs, Benchmarks & How to Run It Locally","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780777984661-5ymr.png","2026-06-06T20:32:25.294996+00:00",{"id":70,"slug":71,"title":72,"cover_image":73,"image_url":73,"created_at":74,"category":13},"9d15f962-739d-44f8-a7f9-11bca64d38e0","best-kimi-models-2026-k2-5-vs-k2-thinking-en","Best Kimi Models in 2026: K2.5 vs K2 Thinking","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780770786284-shy0.png","2026-06-06T18:32:39.779504+00:00",{"id":76,"slug":77,"title":78,"cover_image":79,"image_url":79,"created_at":80,"category":13},"34547376-5d6b-4453-8d80-8072d8ac36ed","kimi-k2-6-open-source-coding-agent-swarm-en","Kimi K2.6 adds open-source coding and agent swarm","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780761781526-wop4.png","2026-06-06T16:02:22.26883+00:00",[82,87,92,97,102,107,112,117,122,127],{"id":83,"slug":84,"title":85,"created_at":86},"d4cffde7-9b50-4cc7-bb68-8bc9e3b15477","nvidia-rubin-ai-supercomputer-en","NVIDIA Unveils Rubin: A Leap in AI Supercomputing","2026-03-25T16:24:35.155565+00:00",{"id":88,"slug":89,"title":90,"created_at":91},"eab919b9-fbac-4048-89fc-afad6749ccef","google-gemini-ai-innovations-2026-en","Google's AI Leap with Gemini Innovations in 
2026","2026-03-25T16:27:18.841838+00:00",{"id":93,"slug":94,"title":95,"created_at":96},"5f5cfc67-3384-4816-a8f6-19e44d90113d","gap-google-gemini-ai-checkout-en","Gap Teams Up with Google Gemini for AI-Driven Checkout","2026-03-25T16:27:46.483272+00:00",{"id":98,"slug":99,"title":100,"created_at":101},"f6d04567-47f6-49ec-804c-52e61ab91225","ai-model-release-wave-march-2026-en","Navigating the AI Model Release Wave of March 2026","2026-03-25T16:28:45.409716+00:00",{"id":103,"slug":104,"title":105,"created_at":106},"895c150c-569e-4fdf-939d-dade785c990e","small-language-models-transform-ai-en","Small Language Models: Llama 3.2 and Phi-3 Transform AI","2026-03-25T16:30:26.688313+00:00",{"id":108,"slug":109,"title":110,"created_at":111},"38eb1d26-d961-4fd3-ae12-9c4089680f5f","midjourney-v8-alpha-features-pricing-en","Midjourney V8 Alpha: A Deep Dive into Its Features and Pricing","2026-03-26T01:25:36.387587+00:00",{"id":113,"slug":114,"title":115,"created_at":116},"bf36bb9e-3444-4fb8-ab19-0df6bc9d8271","rag-2026-indispensable-ai-bridge-en","RAG in 2026: The Indispensable AI Bridge","2026-03-26T01:28:34.472046+00:00",{"id":118,"slug":119,"title":120,"created_at":121},"60881d6d-2310-44ef-b1fb-7f98e9dd2f0e","xiaomi-mimo-trio-agents-robots-voice-en","Xiaomi’s MiMo trio targets agents, robots, and voice","2026-03-28T03:05:08.899895+00:00",{"id":123,"slug":124,"title":125,"created_at":126},"f063d8d1-41d1-4de4-8ebc-6c40511b9369","xiaomi-mimo-v2-pro-1t-moe-agents-en","Xiaomi MiMo-V2-Pro: 1T MoE Model for Agents","2026-03-28T03:06:19.238032+00:00",{"id":128,"slug":129,"title":130,"created_at":131},"a1379e9a-6785-4ff5-9b0a-8cff55f8264f","cursor-composer-2-started-from-kimi-en","Cursor’s Composer 2 started from Kimi","2026-03-28T03:11:59.132398+00:00"]