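[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-cursor-composer-2-agentic-coding-model-zh":3,"tags-cursor-composer-2-agentic-coding-model-zh":32,"related-lang-cursor-composer-2-agentic-coding-model-zh":50,"related-posts-cursor-composer-2-agentic-coding-model-zh":54,"series-model-release-c4b6186f-bd84-4598-997e-c6e31d543c0d":91},{"id":4,"title":5,"content":6,"summary":7,"source":8,"source_url":9,"author":10,"image_url":10,"keywords":11,"language":20,"translated_content":10,"views":21,"is_premium":22,"created_at":23,"updated_at":23,"cover_image":24,"published_at":23,"rewrite_status":25,"rewrite_error":10,"rewritten_from_id":26,"slug":27,"category":28,"related_article_id":29,"status":30,"google_indexed_at":31,"x_posted_at":10,"tweet_text":10,"title_rewritten_at":10,"title_original":10,"key_takeaways":10,"topic_cluster_id":10,"embedding":10,"is_canonical_seed":22},"c4b6186f-bd84-4598-997e-c6e31d543c0d","Cursor Composer 2 Moves Toward Agentic Coding","\u003Cp>\u003Ca href=\"https:\u002F\u002Fcursor.com\" target=\"_blank\" rel=\"noopener\">Cursor\u003C\u002Fa> has released \u003Ca href=\"https:\u002F\u002Fcursor.com\u002Fblog\u002Fcomposer-2\" target=\"_blank\" rel=\"noopener\">Composer 2\u003C\u002Fa>. It scores 61.3 on CursorBench and 61.7 on Terminal-Bench 2.0. This is not a chat toy. It is built to live in the IDE, edit files, run tests, and keep going.\u003C\u002Fp>\u003Cp>Put plainly, AI coding tools are shifting gears. Autocomplete used to be the selling point. Now it is agents. What matters is whether a tool can break down a task on its own, touch multiple files, and push a PR to a mergeable state. For an engineering team, a few fewer window switches are often worth more than a few extra sentences of chatter.\u003C\u002Fp>\u003Cp>I think this wave is very pragmatic. The people paying do not care whether a model can write poetry. They care how many PRs they can close each week and whether the token bill will blow up. Composer 2 is aimed squarely at that need.\u003C\u002Fp>\u003Ch2>What Cursor actually shipped\u003C\u002Fh2>\u003Cp>\u003Ca href=\"https:\u002F\u002Fcursor.com\u002Fblog\u002Fcomposer-2\" target=\"_blank\" rel=\"noopener\">Cursor\u003C\u002Fa> announced Composer 2 on March 19, 2026. The parent company is \u003Ca href=\"https:\u002F\u002Fanysphere.co\" target=\"_blank\" rel=\"noopener\">Anysphere\u003C\u002Fa>. The positioning is direct: this is a model for development workflows, not a general-purpose chatbot.\u003C\u002Fp>\u003Cp>It can read code, edit multiple files, call tools, and keep working through long tasks. That matters, because real projects rarely end once a single function is written. You still have to add tests, fix lint errors, read the CI
log, and then fix things again.\u003C\u002Fp>\u003Cp>The headline numbers Cursor published are clear. It also matters that Cursor tests Composer 2 inside its own editor. Cursor does not just sell an API. It owns the workflow directly, so it can see where the model fails and where it holds up in real development scenarios.\u003C\u002Fp>\u003Cul>\u003Cli>CursorBench: 61.3\u003C\u002Fli>\u003Cli>Terminal-Bench 2.0: 61.7\u003C\u002Fli>\u003Cli>SWE-bench Multilingual: 73.7\u003C\u002Fli>\u003Cli>Standard pricing: $0.50 per 1,000 input tokens\u003C\u002Fli>\u003Cli>Standard pricing: $2.50 per 1,000 output tokens\u003C\u002Fli>\u003Cli>Fast variant: higher throughput, at 5x the price\u003C\u002Fli>\u003C\u002Ful>\u003Cp>What these numbers mean is not hard to read. Composer 2 is not trying to be an all-purpose assistant. It is trying to be a coding worker that gets things done. It can run inside a repo, investigate in a terminal, and stay on track through multi-step tasks. That is its selling point.\u003C\u002Fp>\u003Ch2>Why architecture affects how it feels\u003C\u002Fh2>\u003Cp>Cursor says Composer 2 continues to use a \u003Ca href=\"https:\u002F\u002Fen.wikipedia.org\u002Fwiki\u002FMixture_of_experts\" target=\"_blank\" rel=\"noopener\">mixture-of-experts\u003C\u002Fa> architecture. That means not all parameters are activated every time. The model routes work to a small number of experts. This uses compute more efficiently and can make responses faster.\u003C\u002Fp>\u003Cp>This matters a lot for agentic coding, because an agent does not just emit a single answer. It has to read files, reason, write a patch, read logs, and retry. If every step is slow, developers start cursing. If every step is fast enough, it feels like having a junior engineer sitting next to you.\u003C\u002Fp>\u003Cp>Cursor also says it used reinforcement learning in sandboxed coding environments. In short, the model is dropped into tasks that resemble real development work and trained on how to use tools, change files, and deal with failing tests. That is far more practical than training on web text alone.\u003C\u002Fp>\u003Cul>\u003Cli>MoE means each token does not have to use all parameters\u003C\u002Fli>\u003Cli>Sandbox training strengthens tool use\u003C\u002Fli>\u003Cli>Long tasks require the model to keep track of earlier context\u003C\u002Fli>\u003Cli>IDE integration lets the model work directly with the terminal and worktree\u003C\u002Fli>\u003C\u002Ful>\u003Cp>That is also why I treat Composer 2 separately from general chat models. Coding agents tend to fail in very boring places. They edit the wrong file, forget a constraint stated earlier, or stop halfway. A model trained inside repo workflows at least has a better sense of these pitfalls.\u003C\u002Fp>\u003Ch2>Scores, pricing, and the comparison everyone cares about\u003C\u002Fh2>\u003Cp>Cursor's claim is plain. Composer 2 scores 38% higher than Composer 1.5 on CursorBench, and it reaches 61.7 on Terminal-Bench 2.0. For teams that routinely run multiple rounds of bug fixing, scores like these are not decoration. They directly affect whether you renew.\u003C\u002Fp>\u003Cp>The pricing is interesting too. The standard tier costs $0.50 per 1,000 input tokens and $2.50 per 1,000 output tokens. That puts it below many frontier models. For high-volume teams the gap will be noticeable, because coding agents burn through tokens easily, especially on large repos.\u003C\u002Fp>\u003Cp>Cursor also offers a Fast variant. It has higher throughput, but costs 5x as much. It is essentially trading money for time. If you are racing toward a release, you may want to turn it on. If you are doing an ordinary
refactor, the standard tier probably makes more sense.\u003C\u002Fp>\u003Cul>\u003Cli>Composer 2 standard: $0.50 \u002F 1,000 input tokens\u003C\u002Fli>\u003Cli>Composer 2 standard: $2.50 \u002F 1,000 output tokens\u003C\u002Fli>\u003Cli>Composer 2 Fast: higher throughput, 5x the price\u003C\u002Fli>\u003Cli>Composer 1.5: lower scores, weaker on long tasks\u003C\u002Fli>\u003Cli>GPT-5 and Claude Opus class models: usually stronger at general reasoning, but also more expensive\u003C\u002Fli>\u003C\u002Ful>\u003Cp>But I do have a complaint. A good-looking benchmark does not mean a win in real development. Cursor has not published the seed, hardware, and full procedure for every benchmark run. That does not make the scores useless. It just means you cannot treat them as the final answer.\u003C\u002Fp>\u003Cblockquote>“The model is only as good as the workflow around it.” - Andrej Karpathy, X post, 2023\u003C\u002Fblockquote>\u003Cp>That line from Karpathy is a good lens for Composer 2. The model itself matters. But what really determines the experience also includes the editor, terminal, permissions, and review process. Cursor's advantage is that it ties all of these together.\u003C\u002Fp>\u003Ch2>Why enterprises are watching it\u003C\u002Fh2>\u003Cp>Cursor is not just playing in a small niche. It is already inside many engineering teams. That means Composer 2 cannot merely talk well in a demo. It has to deliver inside real companies.\u003C\u002Fp>\u003Cp>Tom’s Hardware reports that \u003Ca href=\"https:\u002F\u002Fwww.nvidia.com\" target=\"_blank\" rel=\"noopener\">NVIDIA\u003C\u002Fa> has more than 30,000 Cursor seats internally. The company has also said that code output is now 3x its pre-AI baseline. Procurement and management both pay attention to numbers like these.\u003C\u002Fp>\u003Cp>What enterprises care about is practical. Audit logs, sandboxed terminals, isolated worktrees, and commit signing are not flashy extras. They are the entry ticket that lets an agent into company workflows. Without them, many compliance teams simply will not sign off.\u003C\u002Fp>\u003Cul>\u003Cli>NVIDIA has more than 30,000 Cursor seats internally\u003C\u002Fli>\u003Cli>The company claims code output has reached 3x\u003C\u002Fli>\u003Cli>Audit logs make it easy to trace changes\u003C\u002Fli>\u003Cli>Sandboxed execution limits the blast radius of dangerous operations\u003C\u002Fli>\u003C\u002Ful>\u003Cp>But whether enterprises buy still comes down to results. The hard part is not writing a patch. The hard part is dealing with flaky CI, half-finished migrations, and legacy codebases that survive on side effects. That kind of environment is the real test of an agent.\u003C\u002Fp>\u003Ch2>The industry context is not that romantic\u003C\u002Fh2>\u003Cp>AI coding tools have become crowded over the past two years. \u003Ca href=\"https:\u002F\u002Fopenai.com\" target=\"_blank\" rel=\"noopener\">OpenAI\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fwww.anthropic.com\" target=\"_blank\" rel=\"noopener\">Anthropic\u003C\u002Fa>, and \u003Ca href=\"https:\u002F\u002Fdeepmind.google\" target=\"_blank\" rel=\"noopener\">Google DeepMind\u003C\u002Fa>
are all pushing into this space. Everyone knows that chatting alone is not enough. A tool has to do hands-on work to earn a place in the workflow.\u003C\u002Fp>\u003Cp>That is why Cursor's strategy is smart. It does not just sell a model. It sells the whole coding interface: model, editor, terminal, worktree, and review, bundled together. That integration makes it easier for Cursor to observe real usage than it is for a pure API provider.\u003C\u002Fp>\u003Cp>Still, this path depends heavily on validation. Outside teams will want independent tests, especially on who has the higher completion rate, lower token spend, and lower latency on the same tasks. Those are the numbers engineering managers use when they do the math.\u003C\u002Fp>\u003Cp>My read is straightforward. Composer 2 is not here to out-talk chatbots. It is here to claim the role of “finishing the task for you.” That role is valuable and hard to defend, because once the model slips up in a real repo, trust drops fast.\u003C\u002Fp>\u003Ch2>What to watch next\u003C\u002Fh2>\u003Cp>I would watch two things first. One, whether third parties can reproduce scores close to these. Two, whether token costs really stay under control when real teams use it. Both matter far more than the buzz of a launch thread.\u003C\u002Fp>\u003Cp>If you are on an engineering team in Taiwan, I suggest trying it on a non-core repo first. Pick a task that involves multi-file changes, tests, and some simple refactoring. See whether it can get through on its own. Then look at its failure rate, retry count, and cost per merged change. That is far more accurate than reading slides.\u003C\u002Fp>\u003Cp>My prediction is simple. Over the next 6 to 12 months, coding agents will go from “completing your text” to “finishing the job for you.” Whoever makes that finishing step reliable has the better chance of staying in the IDE. If you are evaluating tools right now, do not look only at model scores. Run a small pilot, and the answer will be more honest.\u003C\u002Fp>","Cursor has released Composer 2, scoring 61.3 on CursorBench and 61.7 on Terminal-Bench 2.0, with a focus on agentic coding and cost efficiency for high-volume teams.","www.aicerts.ai","https:\u002F\u002Fwww.aicerts.ai\u002Fnews\u002Fcursor-composer-2-frontier-agentic-coding-model-debuts\u002F",null,[12,13,14,15,16,17,18,19],"Cursor","Composer 2","agentic coding","AI coding","Terminal-Bench 2.0","CursorBench","LLM","IDE","zh",1,false,"2026-03-28T03:13:06.422716+00:00","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1774497597106-o12v.png","done","6093c232-81d6-4784-9a20-7787b92b433e","cursor-composer-2-agentic-coding-model-zh","model-release","d23cd5f6-f875-49f5-b53b-1c5416d13d99","published","2026-04-09T09:00:59.067+00:00",[33,35,37,39,41,43,46,48],{"name":16,"slug":34},"terminal-bench-2-0",{"name":12,"slug":36},"cursor",{"name":17,"slug":38},"cursorbench",{"name":18,"slug":40},"llm",{"name":19,"slug":42},"ide",{"name":44,"slug":45},"Terminal Bench 2.0","terminal-bench-20",{"name":14,"slug":47},"agentic-coding",{"name":15,"slug":49},"ai寫碼",{"id":29,"slug":51,"title":52,"language":53},"cursor-composer-2-agentic-coding-model-en","Cursor Composer 2 Bets on Agentic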
Coding","en",[55,61,67,73,79,85],{"id":56,"slug":57,"title":58,"cover_image":59,"image_url":59,"created_at":60,"category":28},"5b5fa24f-5259-4e9e-8270-b08b6805f281","minimax-m1-open-hybrid-attention-reasoning-model-zh","MiniMax-M1: An Open-Source 1M-Token Reasoning Model","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778797859209-ea1g.png","2026-05-14T22:30:38.636592+00:00",{"id":62,"slug":63,"title":64,"cover_image":65,"image_url":65,"created_at":66,"category":28},"b1da56ac-8019-4c6b-a8dc-22e6e22b1cb5","gemini-omni-video-review-text-rendering-zh","What Happened to the Gemini Omni Video Model","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778779280109-lrrk.png","2026-05-14T17:20:42.608312+00:00",{"id":68,"slug":69,"title":70,"cover_image":71,"image_url":71,"created_at":72,"category":28},"d63e9d93-e613-4bbf-8135-9599fde11d08","why-xiaomi-mimo-v25-pro-changes-coding-agents-zh","Why Xiaomi's MiMo-V2.5-Pro Changes Coding …","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778689858139-v38e.png","2026-05-13T16:30:27.893951+00:00",{"id":74,"slug":75,"title":76,"cover_image":77,"image_url":77,"created_at":78,"category":28},"8f0c9185-52f9-46f2-82c6-5baec126ba2e","openai-realtime-audio-models-live-voice-zh","OpenAI's Realtime Audio Models Target Voice Interaction","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778451657895-2iu7.png","2026-05-10T22:20:32.443798+00:00",{"id":80,"slug":81,"title":82,"cover_image":83,"image_url":83,"created_at":84,"category":28},"52106dc2-4eba-4ca0-8318-fa646064de97","anthropic-10-finance-ai-agents-zh","Anthropic Launches 10 Finance AI
Agents","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778389843399-vclb.png","2026-05-10T05:10:22.778762+00:00",{"id":86,"slug":87,"title":88,"cover_image":89,"image_url":89,"created_at":90,"category":28},"6ee6ed2a-35c6-4be3-ba2c-43847e592179","why-claudes-infinite-context-window-wont-autonomous-zh","Why Claude's “Infinite” Context Window Still Won't Make AI Autonomous","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778350250836-d5d5.png","2026-05-09T18:10:27.004984+00:00",[92,97,102,107,112,117,122,127,128,133],{"id":93,"slug":94,"title":95,"created_at":96},"58b64033-7eb6-49b9-9aab-01cf8ae1b2f2","nvidia-rubin-six-chips-one-ai-supercomputer-zh","NVIDIA Rubin Packs Six Chips into an AI Rack","2026-03-26T07:18:45.861277+00:00",{"id":98,"slug":99,"title":100,"created_at":101},"0dcc2c61-c2a6-480d-adb8-dd225fc68914","march-2026-ai-model-news-what-mattered-zh","March 2026 AI Model News Highlights","2026-03-26T07:32:08.386348+00:00",{"id":103,"slug":104,"title":105,"created_at":106},"214ab08b-5ce5-4b5c-8b72-47619d8675dd","why-small-models-are-winning-on-device-ai-zh","Why Small Models Are Winning On-Device AI","2026-03-26T07:36:30.488966+00:00",{"id":108,"slug":109,"title":110,"created_at":111},"785624b2-0355-4b82-adc3-de5e45eecd88","midjourney-v8-faster-images-higher-costs-zh","Midjourney V8 Got Faster, and More Expensive","2026-03-26T07:52:03.562971+00:00",{"id":113,"slug":114,"title":115,"created_at":116},"cda76b92-d209-4134-86c1-a60f5bc7b128","xiaomi-mimo-trio-agents-robots-voice-zh","Xiaomi's MiMo Trio Targets Agents, Robots, and Voice","2026-03-28T03:05:08.779489+00:00",{"id":118,"slug":119,"title":120,"created_at":121},"9e1044b4-946d-47fe-9e2a-c2ee032e1164","xiaomi-mimo-v2-pro-1t-moe-agents-zh","Xiaomi MiMo-V2-Pro Arrives: A 1T MoE Model","2026-03-28T03:06:19.002353+00:00",{"id":123,"slug":124,"title":125,"created_at":126},"d68e59a2-55eb-4a8f-95d6-edc8fcbff581","cursor-composer-2-started-from-kimi-zh","Cursor Composer 2 Actually Started from Kimi","2026-03-28T03:11:58.893796+00:00",{"id":4,"slug":27,"title":5,"created_at":23},{"id":129,"slug":130,"title":131,"created_at":132},"45812c46-99fc-4b1f-aae1-56f64f5c9024","openai-shuts-down-sora-video-app-api-zh","OpenAI Shuts Down the Sora App and API","2026-03-29T04:47:48.974108+00:00",{"id":134,"slug":135,"title":136,"created_at":137},"e112e76f-ec3b-408f-810e-e93ae21a888a","apple-siri-gemini-distilled-models-zh","The Truth Behind Apple Siri Teaming Up with Gemini","2026-03-29T04:52:57.886544+00:00"]