[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-kimi-k27-code-highspeed-mode-skips-benchmarks-en":3,"article-related-kimi-k27-code-highspeed-mode-skips-benchmarks-en":30,"series-model-release-d18e6176-7ba9-4460-8230-425e3aeaeb86":77},{"id":4,"slug":5,"title":6,"content":7,"summary":8,"source":9,"source_url":10,"author":11,"image_url":12,"cover_image":12,"category":13,"language":14,"translated_content":11,"related_article_id":15,"keywords":16,"key_takeaways":22,"views":26,"created_at":27,"published_at":28,"topic_cluster_id":29},"d18e6176-7ba9-4460-8230-425e3aeaeb86","kimi-k27-code-highspeed-mode-skips-benchmarks-en","Kimi K2.7-Code Adds HighSpeed Mode, Skips Benchmarks","\u003Cp data-speakable=\"summary\">Moonshot’s Kimi K2.7-Code adds a faster mode and lower token use, but only Moonshot’s own benchmarks back the claims.\u003C\u002Fp>\u003Cp>\u003Ca href=\"\u002Ftag\u002Fmoonshot-ai\">Moonshot AI\u003C\u002Fa> released \u003Ca href=\"https:\u002F\u002Fkimi.com\" target=\"_blank\" rel=\"noopener\">Kimi K2.7-Code\u003C\u002Fa> on June 12, 2026, then pushed a \u003Ca href=\"https:\u002F\u002Fhuggingface.co\" target=\"_blank\" rel=\"noopener\">Hugging Face\u003C\u002Fa> rollout with a HighSpeed Mode on June 15. The pitch is simple: up to 6x faster throughput, about 30% fewer reasoning tokens, and a coding model that costs far less than the usual premium agent stack.\u003C\u002Fp>\u003Cp>What makes the release interesting is the gap between the marketing and the evidence. 
Moonshot has not submitted K2.7-Code to independent coding benchmarks, so developers are left with vendor-run numbers, a new pricing sheet, and a model that is already being compared with \u003Ca href=\"https:\u002F\u002Fopenai.com\" target=\"_blank\" rel=\"noopener\">OpenAI\u003C\u002Fa> and \u003Ca href=\"https:\u002F\u002Fwww.anthropic.com\" target=\"_blank\" rel=\"noopener\">Anthropic\u003C\u002Fa>.\u003C\u002Fp>\u003Ctable>\u003Cthead>\u003Ctr>\u003Cth>Metric\u003C\u002Fth>\u003Cth>Kimi K2.7-Code\u003C\u002Fth>\u003Cth>Why it matters\u003C\u002Fth>\u003C\u002Ftr>\u003C\u002Fthead>\u003Ctbody>\u003Ctr>\u003Ctd>Release date\u003C\u002Ftd>\u003Ctd>June 12, 2026\u003C\u002Ftd>\u003Ctd>Fresh model, fresh claims\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>HighSpeed Mode\u003C\u002Ftd>\u003Ctd>Up to 6x faster\u003C\u002Ftd>\u003Ctd>Useful for agentic coding loops\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Input pricing\u003C\u002Ftd>\u003Ctd>$0.95 per million tokens\u003C\u002Ftd>\u003Ctd>Low-cost API access\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Output pricing\u003C\u002Ftd>\u003Ctd>$4.00 per million tokens\u003C\u002Ftd>\u003Ctd>Still cheaper than many premium tools\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Context window\u003C\u002Ftd>\u003Ctd>262,144 tokens\u003C\u002Ftd>\u003Ctd>Fits long codebases and long sessions\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Active parameters\u003C\u002Ftd>\u003Ctd>About 32 billion per token\u003C\u002Ftd>\u003Ctd>Shows why MoE keeps inference cheaper\u003C\u002Ftd>\u003C\u002Ftr>\u003C\u002Ftbody>\u003C\u002Ftable>\u003Ch2>HighSpeed Mode changes the economics, not the story\u003C\u002Fh2>\u003Cp>Moonshot’s HighSpeed Mode is the most concrete part of the launch. The company says the faster variant reaches around 180 tokens per second on median coding inputs and up to 260 tokens per second on shorter-context tasks. 
That is a real improvement for teams running automated coding agents, where latency can decide whether a workflow feels usable or painfully slow.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781795890377-d0e8.png\" alt=\"Kimi K2.7-Code Adds HighSpeed Mode, Skips Benchmarks\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>The speed boost matters because K2.7-Code is built for a very specific kind of work: long, repetitive, tool-heavy coding sessions. If a model can answer faster while keeping costs low, it becomes easier to use in batch repair jobs, repo-wide refactors, and command-line agents that need to keep moving.\u003C\u002Fp>\u003Cp>Moonshot also says the model can reduce reasoning token usage by roughly 30% compared with K2.6. That is a useful claim, but it is also the kind of claim that needs outside testing. Lower token use can mean cleaner reasoning, or it can mean the model stops thinking too early on harder tasks.\u003C\u002Fp>\u003Cblockquote>“The problem with generative AI is that it doesn’t know when to stop generating.” — \u003Ca href=\"https:\u002F\u002Fnews.mit.edu\u002F2024\u002Fwhy-large-language-models-overthink-1126\" target=\"_blank\" rel=\"noopener\">Dario Amodei\u003C\u002Fa>\u003C\u002Fblockquote>\u003Cp>That quote from \u003Ca href=\"\u002Ftag\u002Fanthropic\">Anthropic\u003C\u002Fa>’s co-founder fits this release well. Moonshot is betting that K2.7-Code thinks less wastefully than its predecessor, and that is exactly the kind of claim that looks good in a product post and needs a third-party benchmark to mean much in production.\u003C\u002Fp>\u003Ch2>Moonshot’s own numbers are doing all the work\u003C\u002Fh2>\u003Cp>Moonshot published five proprietary evaluations: Kimi Code Bench v2, Program Bench, MLS Bench Lite, MCP Atlas, and MCP Mark Verified. 
Those are useful as internal signals, but they are not the same thing as a public benchmark with an outside audit trail. As of June 15, no results had appeared on \u003Ca href=\"\u002Ftag\u002Fswe-bench-verified\">SWE-bench Verified\u003C\u002Fa>, DeepSWE, LiveCodeBench, or GPQA Diamond.\u003C\u002Fp>\u003Cp>That matters because benchmark inflation is a real problem in model selection. A vendor can tune a suite to its own strengths, then publish a score that looks impressive without proving much about day-to-day use. Independent tests are slower and less flattering, but they are the only way to know whether a model is good in the places teams actually care about.\u003C\u002Fp>\u003Cul>\u003Cli>K2.7-Code scored 62.0 on Kimi Code Bench v2.\u003C\u002Fli>\u003Cli>Moonshot compared that with GPT-5.5 at 69.0 and Claude Opus 4.8 at 67.4.\u003C\u002Fli>\u003Cli>Those competitor runs used different compute modes, including Codex xhigh and Claude Code xhigh.\u003C\u002Fli>\u003Cli>K2.6 previously topped OpenRouter’s weekly leaderboard in April 2026.\u003C\u002Fli>\u003C\u002Ful>\u003Cp>The comparison is useful, but it is not clean. If one model runs in a higher compute setting than another, the score is only partly about the model itself. That is why OpenRouter traffic data, which reflects real developer usage, often tells a more honest story than a lab score posted by the vendor.\u003C\u002Fp>\u003Ch2>The architecture explains the low price\u003C\u002Fh2>\u003Cp>K2.7-Code uses a Mixture-of-Experts design, or MoE, with a trillion total parameters split across 384 specialist experts. For each token, the router activates the top eight experts plus one shared expert. The rest stay idle. 
That is how Moonshot can price a very large model like a cheaper one at \u003Ca href=\"\u002Ftag\u002Finference\">inference\u003C\u002Fa> time.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781795889165-nv1c.png\" alt=\"Kimi K2.7-Code Adds HighSpeed Mode, Skips Benchmarks\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>The attention system uses Multi-head Latent Attention, or MLA, which compresses the key-value cache and helps the model handle long contexts without blowing up memory use. That is the engineering reason the model can support a 256K context window and still be practical for API serving.\u003C\u002Fp>\u003Cul>\u003Cli>384 experts divide the model into specialized subnetworks.\u003C\u002Fli>\u003Cli>About 32 billion parameters are active per token.\u003C\u002Fli>\u003Cli>256K context is large enough for long repos, long chats, and multi-file refactors.\u003C\u002Fli>\u003Cli>The model always runs in thinking mode, with no instant-response path.\u003C\u002Fli>\u003C\u002Ful>\u003Cp>That last point matters more than it sounds. There is no quick, non-reasoning mode to fall back on, so every query pays the cost of deliberate inference. If the model truly uses fewer reasoning tokens on code tasks, that efficiency can offset the always-thinking design. If it does not, the price advantage may shrink fast under real workloads.\u003C\u002Fp>\u003Ch2>Moonshot is selling a product stack, not just weights\u003C\u002Fh2>\u003Cp>K2.7-Code is also the engine behind \u003Ca href=\"https:\u002F\u002Fkimi.com\" target=\"_blank\" rel=\"noopener\">Kimi Code\u003C\u002Fa>, Moonshot’s terminal-first coding agent. 
The subscription starts at $19 per month, roughly the same tier that \u003Ca href=\"https:\u002F\u002Fwww.anthropic.com\u002Fclaude-code\" target=\"_blank\" rel=\"noopener\">Claude Code\u003C\u002Fa> occupies for serious developer tooling. Moonshot is clearly aiming at the same buyer: teams that want a model, a CLI, and a predictable monthly bill.\u003C\u002Fp>\u003Cp>That strategy is smart because model quality alone rarely wins developer adoption. The winning package is usually a mix of latency, pricing, workflow fit, and enough trust to let the tool touch real code. Moonshot is trying to bundle all of that into one offer while keeping the underlying weights open enough for self-hosting.\u003C\u002Fp>\u003Cp>The company’s pace is also worth noting. K2 arrived in July 2025, K2 Thinking followed in November 2025, K2.5 landed in January 2026, K2.6 came in April 2026, and K2.7-Code arrived in \u003Ca href=\"\u002Fnews\u002Fdevin-pricing-june-2026-plans-limits-en\">June 2026\u003C\u002Fa>. That cadence is fast even by Chinese AI lab standards, and it helps explain why Moonshot keeps showing up in developer conversations.\u003C\u002Fp>\u003Cp>There is another layer here: trust and jurisdiction. Moonshot is based in Beijing, backed by investors including Alibaba, Tencent, China Mobile, and Meituan, and its API business runs through a Singapore entity. For enterprise buyers, that still leaves open the question of how data access and legal obligations work in practice.\u003C\u002Fp>\u003Cp>Moonshot has also had a public data incident before. The \u003Ca href=\"https:\u002F\u002Fincidentdatabase.ai\" target=\"_blank\" rel=\"noopener\">OECD AI Incident Database\u003C\u002Fa> recorded a case in April 2026 where Kimi disclosed one user’s private resume details to another user during a routine task. 
That does not prove a pattern, but it is the kind of event that security teams remember when they review a vendor.\u003C\u002Fp>\u003Ch2>What developers should do next\u003C\u002Fh2>\u003Cp>If you are evaluating K2.7-Code, the right question is not whether the launch sounds impressive. It is whether the model improves your own coding workflow enough to justify putting code, prompts, and context through Moonshot’s stack. For some teams, the answer may be yes, especially if they can self-host the open weights.\u003C\u002Fp>\u003Cp>For others, the safer path is to treat K2.7-Code as a candidate, not a decision. Run it on your own repos, compare it with \u003Ca href=\"https:\u002F\u002Fopenrouter.ai\" target=\"_blank\" rel=\"noopener\">OpenRouter\u003C\u002Fa> usage data where possible, and check whether the speed gains survive real tasks instead of demo prompts.\u003C\u002Fp>\u003Cp>The most likely near-term outcome is simple: K2.7-Code will attract attention because it is cheap, fast, and open enough to test, but the lack of independent benchmark submission will keep the model in the “interesting, not proven” bucket until outside results arrive. The next question is whether Moonshot wants to compete on trust as hard as it competes on throughput.\u003C\u002Fp>\u003Cp>If the company submits K2.7-Code to \u003Ca href=\"\u002Ftag\u002Fswe-bench\">SWE-bench\u003C\u002Fa> Verified or another widely respected suite, the conversation changes quickly. 
If it does not, developers will keep doing what they always do with vendor-only claims: they will compare notes, run their own tests, and wait for evidence that survives contact with production.\u003C\u002Fp>","Moonshot’s Kimi K2.7-Code adds a faster mode and lower token use, but only Moonshot’s own benchmarks back the claims.","www.techtimes.com","https:\u002F\u002Fwww.techtimes.com\u002Farticles\u002F318414\u002F20260615\u002Fkimi-k27-code-adds-highspeed-mode-skips-independent-benchmark-submission.htm",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781795890377-d0e8.png","model-release","en","a419fc45-bd6c-4ce2-a2ef-2a0467f6c02d",[17,18,19,20,21],"Kimi K2.7-Code","Moonshot AI","coding agent","benchmarking","Mixture-of-Experts",[23,24,25],"HighSpeed Mode promises up to 6x faster throughput and about 30% fewer reasoning tokens.","Moonshot has not submitted K2.7-Code to major independent coding benchmarks.","The model is priced aggressively and wrapped inside Kimi Code, a $19-per-month CLI agent.",0,"2026-06-18T15:17:41.403224+00:00","2026-06-18T15:17:41.391+00:00","1bae1133-d241-4581-9332-fbf39690c319",{"tags":31,"relatedLang":36,"relatedPosts":40},[32,34],{"name":19,"slug":33},"coding-agent",{"name":18,"slug":35},"moonshot-ai",{"id":15,"slug":37,"title":38,"language":39},"kimi-k27-code-highspeed-mode-skips-benchmarks-zh","Kimi K2.7-Code 主打快，但證據還不夠","zh",[41,47,53,59,65,71],{"id":42,"slug":43,"title":44,"cover_image":45,"image_url":45,"created_at":46,"category":13},"35368396-f604-46b7-9aa3-35ea227c99da","google-gemini-35-live-translate-audio-model-en","Google launches Gemini 3.5 Live Translate audio 
model","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781807575345-04yn.png","2026-06-18T18:32:29.328812+00:00",{"id":48,"slug":49,"title":50,"cover_image":51,"image_url":51,"created_at":52,"category":13},"952ab890-dacd-429b-93d2-3821a5dc00bc","kimi-k27-whats-new-and-how-to-run-it-en","Kimi K2.7: What Changed and How to Run It","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781791374284-re3e.png","2026-06-18T14:02:25.334347+00:00",{"id":54,"slug":55,"title":56,"cover_image":57,"image_url":57,"created_at":58,"category":13},"1b2d4bc4-b90d-4a10-a9e9-99f5e56a4719","linux-kernel-7-1-fred-ntfs-amd-fixes-en","Linux Kernel 7.1 adds FRED, NTFS, and AMD fixes","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781787775484-40k3.png","2026-06-18T13:02:25.315681+00:00",{"id":60,"slug":61,"title":62,"cover_image":63,"image_url":63,"created_at":64,"category":13},"4515e89e-6fbd-4dd8-a5fa-bbd2bcf6425a","fable-5-drew-rare-praise-ai-voices-en","Fable 5 drew rare praise from top AI voices","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781748174800-46t2.png","2026-06-18T02:02:31.145023+00:00",{"id":66,"slug":67,"title":68,"cover_image":69,"image_url":69,"created_at":70,"category":13},"a6017fa4-a339-4a83-b086-16a69dbde34d","devin-pricing-june-2026-plans-limits-en","Devin pricing in June 2026: plans, limits, tradeoffs","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781735574969-l960.png","2026-06-17T22:32:28.222997+00:00",{"id":72,"slug":73,"title":74,"cover_image":75,"image_url":75,"created_at":76,"category":13},"ccc46975-50d1-4ece-8fd3-c082bf4858ae","self-host-minimax-m3-gpu-cloud-en","Self-host MiniMax M3 on GPU 
cloud","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781716680837-ikof.png","2026-06-17T17:17:35.800599+00:00",[78,83,88,93,98,103,108,113,118,123],{"id":79,"slug":80,"title":81,"created_at":82},"d4cffde7-9b50-4cc7-bb68-8bc9e3b15477","nvidia-rubin-ai-supercomputer-en","NVIDIA Unveils Rubin: A Leap in AI Supercomputing","2026-03-25T16:24:35.155565+00:00",{"id":84,"slug":85,"title":86,"created_at":87},"eab919b9-fbac-4048-89fc-afad6749ccef","google-gemini-ai-innovations-2026-en","Google's AI Leap with Gemini Innovations in 2026","2026-03-25T16:27:18.841838+00:00",{"id":89,"slug":90,"title":91,"created_at":92},"5f5cfc67-3384-4816-a8f6-19e44d90113d","gap-google-gemini-ai-checkout-en","Gap Teams Up with Google Gemini for AI-Driven Checkout","2026-03-25T16:27:46.483272+00:00",{"id":94,"slug":95,"title":96,"created_at":97},"f6d04567-47f6-49ec-804c-52e61ab91225","ai-model-release-wave-march-2026-en","Navigating the AI Model Release Wave of March 2026","2026-03-25T16:28:45.409716+00:00",{"id":99,"slug":100,"title":101,"created_at":102},"895c150c-569e-4fdf-939d-dade785c990e","small-language-models-transform-ai-en","Small Language Models: Llama 3.2 and Phi-3 Transform AI","2026-03-25T16:30:26.688313+00:00",{"id":104,"slug":105,"title":106,"created_at":107},"38eb1d26-d961-4fd3-ae12-9c4089680f5f","midjourney-v8-alpha-features-pricing-en","Midjourney V8 Alpha: A Deep Dive into Its Features and Pricing","2026-03-26T01:25:36.387587+00:00",{"id":109,"slug":110,"title":111,"created_at":112},"bf36bb9e-3444-4fb8-ab19-0df6bc9d8271","rag-2026-indispensable-ai-bridge-en","RAG in 2026: The Indispensable AI Bridge","2026-03-26T01:28:34.472046+00:00",{"id":114,"slug":115,"title":116,"created_at":117},"60881d6d-2310-44ef-b1fb-7f98e9dd2f0e","xiaomi-mimo-trio-agents-robots-voice-en","Xiaomi’s MiMo trio targets agents, robots, and 
voice","2026-03-28T03:05:08.899895+00:00",{"id":119,"slug":120,"title":121,"created_at":122},"f063d8d1-41d1-4de4-8ebc-6c40511b9369","xiaomi-mimo-v2-pro-1t-moe-agents-en","Xiaomi MiMo-V2-Pro: 1T MoE Model for Agents","2026-03-28T03:06:19.238032+00:00",{"id":124,"slug":125,"title":126,"created_at":127},"a1379e9a-6785-4ff5-9b0a-8cff55f8264f","cursor-composer-2-started-from-kimi-en","Cursor’s Composer 2 started from Kimi","2026-03-28T03:11:59.132398+00:00"]