[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"tag-swe-bench":3},{"tag":4,"articles":11},{"id":5,"name":6,"slug":7,"article_count":8,"description_zh":9,"description_en":10},"7f2f1f94-2fd6-4985-9136-b9715dbf8f06","SWE-Bench","swe-bench",11,"SWE-bench 是用真實 GitHub issue 評估程式修復能力的基準，常分成 Verified、Lite 等版本。它反映模型與 agent 是否能讀懂程式庫、定位 bug、修改測試並維持可重現性，也常被用來比較 coding agent 的成本與效率。","SWE-bench is a benchmark for measuring whether models and coding agents can fix real GitHub issues end to end. Its variants, including Verified and Lite, are used to compare bug localization, test-driven edits, and the cost of agentic repair workflows.",[12,21,29,37],{"id":13,"slug":14,"title":15,"summary":16,"category":17,"image_url":18,"cover_image":18,"language":19,"created_at":20},"21805270-d3b7-4155-8e3f-2c650cef3315","tested-devin-10-tasks-finished-3-zh","我測了 Devin 10 個任務，只做完 3 個","Devin 在 SWE-bench 只拿 13.86%，實測 10 個真實任務也只完成 3 個。這篇拆解它在哪些工作能用、哪些地方會亂掉。","ai-agent","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1775167981590-g2tr.png","zh","2026-04-02T22:12:37.165364+00:00",{"id":22,"slug":23,"title":24,"summary":25,"category":26,"image_url":27,"cover_image":27,"language":19,"created_at":28},"c679b51f-194a-463b-87fc-7695256ff752","mimo-v2-pro-vs-omni-vs-flash-2026-zh","MiMo V2 Pro、Omni、Flash 怎麼選","MiMo 2026 三款模型分工很清楚：Flash 主打開源與 coding，Pro 提供 1M context，Omni 則處理圖像、音訊與影片。這篇直接比 benchmark、價格與適用場景。","model-release","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1775092816164-mhzf.png","2026-04-02T01:18:43.576128+00:00",{"id":30,"slug":31,"title":32,"summary":33,"category":26,"image_url":34,"cover_image":35,"language":19,"created_at":36},"9e1044b4-946d-47fe-9e2a-c2ee032e1164","xiaomi-mimo-v2-pro-1t-moe-agents-zh","小米 MiMo-V2-Pro 登場：1T MoE 模型","小米推出 MiMo-V2-Pro，總參數超過 1T、每 token 啟用 42B，還有 1M context。SWE-bench 成績逼近 Claude Sonnet 4.6，價格卻低很多。",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1774597397882-x4i9.png","2026-03-28T03:06:19.002353+00:00",{"id":38,"slug":39,"title":40,"summary":41,"category":26,"image_url":34,"cover_image":42,"language":19,"created_at":43},"cda76b92-d209-4134-86c1-a60f5bc7b128","xiaomi-mimo-trio-agents-robots-voice-zh","小米 MiMo 三模型瞄準代理、機器人與語音","小米一次推出三款 MiMo AI 模型，涵蓋代理、多模態與語音。MiMo-V2-Pro 以超過 1 兆參數、100 萬 token 上下文，逼近 Claude Opus 4.6 的表現。","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1774498800835-3s4y.png","2026-03-28T03:05:08.779489+00:00"]