[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-code-becomes-the-agent-harness-zh":3,"article-related-code-becomes-the-agent-harness-zh":37,"series-research-adfa9b15-68b6-44cc-b34d-ebcb02c31210":87},{"id":4,"title":5,"content":6,"summary":7,"source":8,"source_url":9,"author":10,"image_url":11,"keywords":12,"language":19,"translated_content":10,"views":20,"is_premium":21,"created_at":22,"updated_at":22,"cover_image":11,"published_at":23,"rewrite_status":24,"rewrite_error":10,"rewritten_from_id":25,"slug":26,"category":27,"related_article_id":28,"status":29,"google_indexed_at":30,"x_posted_at":10,"tweet_text":10,"title_rewritten_at":10,"title_original":10,"key_takeaways":31,"topic_cluster_id":35,"embedding":36,"is_canonical_seed":21},"adfa9b15-68b6-44cc-b34d-ebcb02c31210","程式碼成了代理引擎","\u003Cp data-speakable=\"summary\">這篇綜述把程式碼定位成代理系統的運行層，串起推理、動作、記憶與驗證。\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong>研究機構\u003C\u002Fstrong>：arXiv 摘要未明確標註\u003C\u002Fli>\u003Cli>\u003Cstrong>核心數據\u003C\u002Fstrong>：摘要無公開 benchmark 數字\u003C\u002Fli>\u003Cli>\u003Cstrong>突破點\u003C\u002Fstrong>：把程式碼當代理底座\u003C\u002Fli>\u003C\u002Ful>\u003Cp>大型語言模型會寫程式，這件事大家已經不陌生。但這篇綜述要講的，不是模型又多會寫幾題，而是程式碼在 agentic 系統裡，開始變成「運行層」本身。它不只是輸出結果，而是把推理、行動、環境建模、執行驗證接起來的那層骨架。\u003C\u002Fp>\u003Cp>這個角度很實際。因為一個代理系統好不好，不再只是看模型下一個 \u003Ca href=\"\u002Ftag\u002Ftoken\">token\u003C\u002Fa> 準不準。真正影響體驗的，還有外面那圈 harness：怎麼規劃步驟、怎麼存狀態、怎麼呼叫工具、怎麼檢查結果、怎麼跨步驟或跨代理協作。\u003C\u002Fp>\u003Ch2>這篇論文想解什麼痛點\u003C\u002Fh2>\u003Cp>作者先從一個很簡單的觀察出發：現代 \u003Ca href=\"\u002Ftag\u002Fllm\">LLM\u003C\u002Fa> 已經能在很多程式任務上表現不錯，從競賽程式到 repository-level 軟體工程都涵蓋在內。但當這些模型被拿來做 \u003Ca href=\"\u002Ftag\u002Fagent\">agent\u003C\u002Fa> 時，程式碼就不再只是被產出的物件，而是讓系統真的能運作的底層材料。\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779173040130-zcyg.png\" alt=\"程式碼成了代理引擎\" class=\"rounded-xl w-full\" loading=\"lazy\" 
\u002F>\u003C\u002Ffigure>\n\u003Cp>問題來了。當程式碼同時扮演「產物」和「基礎設施」兩種角色時，工程師很容易在概念上切得太散。規劃是規劃，記憶是記憶，工具是工具，驗證是驗證，看起來彼此獨立，但實作上其實都被同一層 code infrastructure 綁在一起。\u003C\u002Fp>\u003Cp>這篇綜述就是要補這個空缺。它提出一個「agent harness」的框架，幫大家用更清楚的方式看待以程式碼為核心的 agent 系統。白話講，就是不要再把 agent 的周邊能力當零碎外掛，而是把它們看成同一個運行框架裡的不同模組。\u003C\u002Fp>\u003Ch2>它的方法到底怎麼運作\u003C\u002Fh2>\u003Cp>這不是新模型，也不是\u003Ca href=\"\u002Fnews\u002Frrfp-readiness-driven-pipeline-training-zh\">訓練\u003C\u002Fa> recipe，所以沒有傳統論文那種 architecture 圖和 loss function。它的貢獻是整理領域，提出一個結構化的思考方式，把 code-as-harness 系統拆成三層。\u003C\u002Fp>\u003Cp>第一層是 harness interface。這一層處理程式碼怎麼連到 agent 的推理、動作與環境建模。實作上，這會影響 agent 怎麼表達步驟、怎麼呼叫操作、怎麼表示它正在互動的世界狀態。\u003C\u002Fp>\u003Cp>第二層是 harness mechanisms。作者把重點放在長程執行所需的規劃、記憶、工具使用，以及回饋驅動的控制與最佳化。這層的目標不是把 agent 做得花俏，而是讓它在多步驟任務裡維持穩定，不要一遇到偏差就整個崩掉。\u003C\u002Fp>\u003Cp>第三層是從單代理擴展到多代理系統。到了這個層級，共享的程式碼物件可以拿來做協調、審查和驗證。這對需要多個 worker 一起合作的系統很重要，因為大家不只要各自會做事，還要能對齊狀態、檢查彼此輸出、分工處理不同責任。\u003C\u002Fp>\u003Cp>合在一起看，這三層其實在講同一件事：程式碼不是 agent 行為的副產品，而是讓行為能被執行、被檢查、被回復的操作面。這也是這篇綜述最核心的觀點。\u003C\u002Fp>\u003Ch2>這篇實際證明了什麼\u003C\u002Fh2>\u003Cp>先講清楚，這是綜述，不是實驗論文。摘要沒有公開新的 benchmark、沒有模型釋出、也沒有對照實驗數字可引用。若你想找的是某個指標提升多少，這篇摘要沒有提供完整 benchmark 細節。\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779173034419-dsfs.png\" alt=\"程式碼成了代理引擎\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>但它不是空談。作者整理了一批代表性方法與應用，範圍涵蓋 coding assistant、GUI 與作業系統自動化、embodied agents、科學發現、個人化與推薦、DevOps，以及企業工作流程。這個範圍很廣，表示「程式碼作為 harness」不是只適用於單一 coding benchmark，而是能延伸到多種 agent 場景。\u003C\u002Fp>\u003Cp>更重要的是，作者也把目前還卡住的地方講得很直接。像是：評估不能只看最後任務有沒有成功；如果回饋不完整，驗證就會變難；harness 的改進要避免引入回歸；多代理共享狀態要一致；安全敏感操作需要人類監督；多模態環境也還需要支援。\u003C\u002Fp>\u003Cp>這些限制其實很有價值，因為它們指出現在 agent 系統真正脆弱的地方。模型也許能吐出看起來合理的步驟，但外面的 code layer 
還要處理狀態、錯誤回復、安全性，這些都不是單一終點指標能完整描述的。\u003C\u002Fp>\u003Ch2>對開發者有什麼影響\u003C\u002Fh2>\u003Cp>如果你正在做 agentic software，這篇的價值在於它逼你換一種工程師視角看問題。不要只想 prompt 怎麼寫，也要想 harness 怎麼設計。因為真正讓 agent 能跑、能重試、能驗證的，通常就是這層程式碼。\u003C\u002Fp>\u003Cp>對 production 來說，這個框架很有用。code-centric harness 可以更容易承載長流程工作、保留跨步驟狀態，還能插入明確的驗證節點。當 agent 出錯時，也比較容易 debug，因為它的行動是透過程式碼介面被中介，不是完全藏在自由輸出的文字流裡。\u003C\u002Fp>\u003Cp>但這篇也沒有把問題講得太樂觀。多代理共享狀態依然難搞。安全敏感操作還是需要人類監督。只看最終任務成功與否，也不足以判斷 harness 是否真的穩健。這些都意味著，agent 系統的品質不只在模型本身，而在模型外面那一整圈可執行、可檢查、可恢復的設計。\u003C\u002Fp>\u003Ch2>實作上該怎麼理解這個框架\u003C\u002Fh2>\u003Cp>最實際的 takeaway 是：把程式碼當成 agent 的基礎設施，而不是模型剛好會說的一種語言。這會改變你設計 agent stack 的方式。你可能會更重視明確的介面、更細的 state management、更多驗證鉤子，以及多代理之間如何共享 artifact。\u003C\u002Fp>\u003Cp>這篇綜述沒有宣稱這套方法能直接解決可靠性問題。它比較像是在幫下一波 agent 工程建立共同語言。當大家都在做 coding assistant、自動化系統或多代理 workflow 時，有一個「harness」視角，會比把所有東西拆成孤立模組更好討論，也更好落地。\u003C\u002Fp>\u003Cp>如果要用一句話總結，這篇不是在推一個新模型，而是在推一個設計模式：在 agent 時代，程式碼本身就是運行代理的框架。模型負責想，harness 負責讓它真的做得出來、查得到、接得上下一步。\u003C\u002Fp>\u003Cul>\u003Cli>程式碼被定義成代理的運行層，而不只是輸出結果。\u003C\u002Fli>\u003Cli>綜述把系統拆成介面、機制、以及多代理擴展三層。\u003C\u002Fli>\u003Cli>它的重點是提供一個可執行、可驗證的 agent 設計視角。\u003C\u002Fli>\u003C\u002Ful>\u003Cp>對\u003Ca href=\"\u002Ftag\u002F台灣開發者\">台灣開發者\u003C\u002Fa>來說，這種框架特別像在提醒一件事：做 agent 不只是接 \u003Ca href=\"\u002Ftag\u002Fapi\">API\u003C\u002Fa>、拼 prompt，而是要把狀態、工具、驗證、協作一起設計進同一個程式骨架裡。這篇論文講的，就是那個骨架。\u003C\u002Fp>","這篇綜述把程式碼定位成代理系統的運行層，串起推理、動作、記憶與驗證，重點在架構視角而非新模型。","arxiv.org","https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.18747",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779173040130-zcyg.png",[13,14,15,16,17,18],"agent harness","code-as-infrastructure","planning","memory","tool use","multi-agent 
systems","zh",2,false,"2026-05-19T06:43:29.625994+00:00","2026-05-19T06:43:29.523+00:00","done","d5a453c5-2a6e-4a7f-8afe-bb274dd0cf1c","code-becomes-the-agent-harness-zh","research","576ffe2e-a54b-4030-84ea-8cc6eeb4f76f","published","2026-05-19T09:00:32.971+00:00",[32,33,34],"程式碼在 agent 系統裡不再只是輸出，而是運行底座。","這篇綜述的價值在整理介面、機制與多代理擴展三層框架。","摘要沒有 benchmark 數字，重點是概念整理與工程視角。","0c35a120-52fc-41fc-afa3-d404eb934158","[-0.010248019,-0.003011778,0.019590288,-0.081978165,-0.012430783,-0.008127422,-0.035372186,0.0051639555,0.034964018,0.013709923,-0.022115113,0.014511705,-0.012415802,0.0049216724,0.12866752,0.033336636,0.010554454,0.023580449,0.01488687,-0.009860631,0.028266728,0.013303175,0.00298492,-0.00207453,-0.0009992013,-0.006111376,0.026917292,0.009507293,0.047873013,0.005525154,0.01321998,0.01742581,-0.030748503,0.0070700333,0.029102726,-0.0011720349,0.009716173,-0.035194844,0.008274845,0.0020173995,-0.015657982,-0.020420969,0.011973888,0.009032435,6.9106965e-05,0.028908454,0.0047106342,-0.04470119,-0.028248632,0.0016999518,-0.002195874,0.015798064,0.008280642,-0.15033951,-0.01883516,-0.02154476,-0.029781397,0.01678329,0.003070653,0.018184213,0.0027622115,0.0004966727,-0.031182075,-0.018613888,0.004117528,-0.016519118,0.028468698,0.02106257,-0.021192327,-0.012322146,-0.033968884,0.01575313,0.016776692,-0.01589905,0.0213396,-0.013915307,-0.014557275,0.009315413,0.006866402,0.028832296,0.0010428878,-0.0091268765,0.009002377,-0.0021868562,0.0030206807,-0.023293987,0.021204608,-0.0044370894,0.011119438,0.0131596625,0.0048010135,0.0031379063,0.036954887,0.015162221,-0.012100315,-0.024962494,-0.03983031,0.0027893977,0.015344443,0.022974864,0.008235674,-0.01706634,0.005179521,-0.010873312,0.011215188,-0.01915727,-0.008314097,0.007157406,0.0022449736,0.018802997,-0.005912739,-0.021680081,-0.017725395,0.008794433,-0.0078488365,-0.12197263,0.011035175,-0.0039408407,-0.006744268,0.011917046,-0.036828287,0.023102233,-0.005551087,0.002629218,0.01410025,0.005794123,0.00897546,0.01109491
4,-0.021252133,0.0075521413,-0.03161487,0.008838426,-0.011322382,0.009125649,-0.006514195,0.017130446,0.00404078,-0.006738511,-0.03820376,-0.022608988,-0.020734202,0.037773933,-0.010753576,-0.017105415,-0.025050834,0.00050056993,-0.041496277,0.008714995,-0.012856582,-0.03754275,0.026377197,-0.018927595,-0.002032554,-0.012483015,0.026146678,-0.024353078,0.012491177,-0.0008914066,0.018523298,-0.0029719996,0.0027131832,-0.026024451,0.007579042,-0.0032741642,-0.008379903,0.0075398874,-0.003801548,-0.0011135427,-0.0025576046,0.038699735,0.009749876,-0.0022407267,-0.019099362,-0.007890494,0.012789955,-0.0156715,0.009496818,-0.020933105,0.024986975,-0.02601522,0.0012570253,-0.0041013025,-0.018189967,0.023477433,-6.3616295e-07,0.0060480093,-0.0075040665,0.0028833859,0.013327787,-0.011891609,-0.032357212,-0.0056204763,0.0125869075,-0.031466804,0.0072476026,-0.029741004,0.006723889,-0.001037349,-0.0040665665,0.031159543,0.021273613,0.014959775,0.027607532,-0.05945458,0.010564758,-0.0031977443,-0.0031535595,-0.039633624,-0.0155665185,0.030071748,-0.021089658,0.022417177,0.0321029,-0.0064767324,0.014792853,-0.012374091,-0.005278917,-0.0031086903,0.0035874688,0.013565863,0.020879587,-0.0018508232,0.0051080077,0.013613312,-0.012252322,-0.0063353973,0.0069592963,0.004854852,-0.026088407,0.015564776,-0.0053359047,0.02426974,0.0018820778,0.019753648,0.019073602,-0.0077226376,0.014976784,0.006769677,0.011154076,0.010831688,-0.016218342,0.021899022,0.007644759,0.018859606,0.022704657,0.007309491,-0.028602943,0.03734328,-0.0017567154,0.0111004375,-0.014578174,0.021543356,-0.0018818805,-0.018292576,-0.018719174,0.011198665,0.0075243614,-0.0024516552,-0.005497753,-0.0017311766,-0.005625939,0.0079153525,-0.017533485,0.0115976,-0.007273123,0.020769995,0.018533079,0.029123103,-0.0085620275,-0.008897846,-0.0065996414,0.038535316,-0.0010699284,0.018550504,0.009185752,0.0148167005,-0.065970995,0.04117375,0.017932924,-0.006740238,0.0052785133,0.011145346,-0.006852827,0.0072505735,-0.01586488,0.
0020429858,-0.012869673,-0.0038925176,-0.0074370494,0.007924502,0.0031181718,0.006893411,-0.021581553,0.0035429965,0.005909618,-0.029081656,-0.013783084,0.018009579,0.015493306,-0.006016151,0.02803863,-0.027976288,0.0020931615,0.02296673,-0.012697725,-0.018107943,0.003263364,0.015383865,-0.006979793,-0.011316799,-0.009612215,-0.0077258,0.012046599,-0.019814612,0.0121756615,0.008595957,-0.014919109,-0.025212564,-0.0044157016,0.0035255991,0.020969002,-0.0038147783,-0.010917946,-0.00036134457,-0.019915037,0.003512814,-0.003574719,0.0011645389,-0.0013210735,-0.00953764,0.007553925,0.015035466,0.03733376,-0.046858896,0.027391296,0.0046522683,-0.014657159,-0.014985206,-0.01558508,0.015744353,0.010197973,-0.029846711,0.009588032,0.005264695,-0.015381283,0.023088718,0.006858221,-0.0024138326,-0.013971526,-0.019236038,0.015075915,0.0026441275,0.018667048,-0.010060451,-0.042143308,0.010075455,-0.006314252,0.00782937,0.01329066,0.011155238,0.028172184,0.009596158,-0.021815022,-0.017157255,0.016011354,-0.031562738,0.03532167,0.018717462,-0.015152909,-0.0033266821,-0.0027715398,0.012821972,-0.009215101,-0.0023916266,0.0049866526,-0.0077726613,-0.025077963,0.0018708911,-0.005405414,0.008056314,-0.0064309305,0.03407905,-0.029446745,0.023378305,-0.039765805,0.019190114,0.022658326,-0.008359643,-0.02464265,0.03061812,0.00480145,-0.00970563,-0.023228735,-0.01688076,-0.007459392,-0.014662865,-0.0075925696,0.014170682,-0.027188191,-0.0089292275,0.006357992,0.00025823066,0.010851049,0.004782414,-0.005160164,0.007600534,-0.0037261657,0.0020476938,0.027985929,-0.0028359217,-0.0028555964,0.009274989,0.03781375,0.029988969,0.0119831385,-0.013671292,-0.028408213,-0.01075587,-0.007825273,-0.018994931,-0.009503633,0.0046321168,-0.008626058,0.016221229,-0.0134870745,-0.0011843846,-0.030380815,-0.007009,-0.007566904,-0.012400993,0.0036714105,0.005193789,-0.002920992,-0.025813483,-0.008960893,0.0030695782,-0.023595266,0.0069060996,-0.02590964,-0.032432973,0.01912315,0.0041478253,-0.0012298335,0.0
059870533,0.016368024,0.008717842,-0.007700769,0.0023660837,-0.031028137,0.009784328,0.016352518,0.014999779,-0.010149129,0.010412707,0.01653475,-0.0064822007,-0.0035619033,0.024083626,-0.0008658091,-0.01975019,0.0066715875,0.007865459,0.00026852565,-0.021456385,0.01178677,-0.016175907,-0.019378513,-0.001662924,-0.008972447,0.009938433,-0.023558704,0.0074904477,0.027528375,-0.0038289702,-0.019969033,0.009297581,-0.01777281,0.009031721,-0.0006123229,-0.013706432,0.017427443,-0.00012005552,0.03675129,-0.030215854,0.007058162,0.04356153,-0.011244757,0.019363312,-0.015883418,-0.0047195973,0.011647433,0.019579947,0.010123324,-0.027235394,-0.035010945,-0.0083740605,-0.00582762,-0.0041066553,0.014167931,-0.011197611,0.007594132,-0.026076429,-0.008974919,-0.008416755,0.025106685,0.006470974,-0.032185372,0.011361958,0.010712592,-0.024481514,-0.0077919792,0.0019608007,0.03461154,0.00074822165,-0.009946302,-0.009823651,0.022790281,0.005130006,-0.008531872,0.0109841805,0.011238446,-0.028049868,-0.000117127565,0.033701092,-0.041990947,0.0036313927,0.015276113,0.032476768,0.021919638,0.035676975,-0.0062625753,-0.00076922873,0.0039002215,-0.014578499,-0.00383391,-0.0146966325,0.015855806,-0.005081234,-0.035686314,0.011755727,-0.034230772,-0.032964543,-0.01515821,0.0019693882,0.03239415,-0.11800031,-0.024124643,0.016813582,-0.014828832,0.00635355,-0.0108712055,0.00690117,-0.015107771,-0.012849858,-0.010965349,-0.00046073683,-0.02441431,0.022585096,0.0047946083,-0.021188235,-0.010614389,-0.025334537,-0.024870891,0.0124457115,0.0043343464,0.011443291,-0.00257577,-0.006173739,0.0072474694,-0.013591244,-0.004967703,0.0063363938,0.01283418,0.032199223,0.0007452353,-0.033943985,0.005328799,-0.008810009,0.000114858856,-0.008728143,0.0068743927,0.021493163,0.0038486347,0.0044171545,0.0075499257,0.0023645516,0.00087482505,-0.0126543315,-0.018051658,0.0036002,-0.0031630395,0.016498974,-0.021612251,0.009773551,0.028986983,-0.05833233,-0.012142872,0.0033164707,-0.037609804,-0.014652533,0.00582
5488,-0.0042568576,-0.00058175065,0.00061462604,0.010550735,0.015139512,0.0015068761,-0.016204342,0.021847295,-0.0036133896,0.004072983,-0.011693842,0.02530632,0.013486179,-0.013216341,0.02609513,-5.4846143e-05,0.012021573,0.023283446,0.010798082,0.020514542,-0.00024072044,0.023844704,0.0052129934,0.013424096,-0.030176066,0.0038277584,-0.085156724,-0.029735737,0.010477688,-0.018585773,0.0106285345,-0.016441459,0.014375724,-0.0026091412,-0.011294099,-0.0021713476,-0.0030581043,-0.008640591,0.004312104,-0.04236018,-0.0018141628,0.012586672,-0.012528521,0.013980047,0.017485809,-0.027238866,-0.007866502,0.013249182,0.028538298,-0.031654563,-0.003943447,-0.0036126142,-0.030974878,-0.012745581,-0.01608016,0.0067054504,-0.0065558753,-0.1250789,-0.013362214,0.004770733,-0.0012017402,0.0070760646,0.0034912312,0.0021935494,-0.0041454444,0.031244695,-0.0058239726,0.0019539574,-0.0019059327,-0.0022988336,-0.032360196,0.001793308,0.106708966,0.0069313156,-0.014717754,-0.027737187,-0.0021841098,-0.007027827,-0.023471631,-0.004597112,0.022485651,-0.013684643,-0.012898399,0.033593968,-0.019982051,0.0047473763,0.0015973418,0.030181129,0.020907717,-0.02856442,-0.026568482,0.00016199211,-0.00497352,-0.0018015011,-0.037401263,0.010163842,0.0074507846,0.0135456035,0.0016856415,0.0072100223,-0.009363213,-0.011795086,0.004551177,-0.0059664445,-0.017804302,-0.002172802,0.010811482,-0.01556377,-0.057717953,-0.010289687,-0.012038883,-0.0045593316,0.019029897,-0.00040286576,0.015046976,0.0020837223,0.0022757063,0.005708395,-0.020104952,-0.00061991974,0.022745192,-0.038844306,0.012010081,0.018247658,0.023225393,-0.01293422,0.042276997,0.0017002288,-0.013634368,0.0030885418,-0.022815915,-0.014506556,-0.027643597,0.012729709,0.014487473,0.0135787865,-0.010917436,0.010808797,-0.018644646,0.005722494,0.009471809,0.010991094,0.00958499,0.01440609,0.0015961315,0.012208261,-0.008387486,-0.013805206,0.008086124,0.003175789,0.029614529,0.013691907,0.045759443,-0.010794041,0.0061811404,0.0015201933,-0.0
10419709,-0.0035660844,-0.0057946085,0.029776296,-0.007910441,-0.0153718805,0.03912843,-0.0127206985,0.017509706,0.024978168,0.013908322]",{"tags":38,"relatedLang":46,"relatedPosts":50},[39,40,42,43,45],{"name":14,"slug":14},{"name":13,"slug":41},"agent-harness",{"name":16,"slug":16},{"name":17,"slug":44},"tool-use",{"name":15,"slug":15},{"id":28,"slug":47,"title":48,"language":49},"code-becomes-the-agent-harness-en","Code Becomes the Agent Harness","en",[51,57,63,69,75,81],{"id":52,"slug":53,"title":54,"cover_image":55,"image_url":55,"created_at":56,"category":27},"d1c6850c-f832-471b-8beb-c0ebc809667d","peft-bench-fine-tuning-methods-benchmark-zh","PEFT-Bench 讓微調比較更公平","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779179048497-jm5y.png","2026-05-19T08:23:36.803043+00:00",{"id":58,"slug":59,"title":60,"cover_image":61,"image_url":61,"created_at":62,"category":27},"e24e6e7a-6181-476b-8583-339d854cec68","confident-ai-llm-evaluation-metrics-guide-zh","Confident AI 的 LLM 評估指標指南","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779178456675-x5m6.png","2026-05-19T08:13:46.193772+00:00",{"id":64,"slug":65,"title":66,"cover_image":67,"image_url":67,"created_at":68,"category":27},"eda7a80a-b234-4ada-90d1-a37b144251dc","rrfp-readiness-driven-pipeline-training-zh","RRFP 讓管線訓練跟著就緒跑","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779172442474-n21q.png","2026-05-19T06:33:31.287772+00:00",{"id":70,"slug":71,"title":72,"cover_image":73,"image_url":73,"created_at":74,"category":27},"475844e6-3e2c-49a6-aea0-86a94945d2c2","dashattention-differentiable-adaptive-sparse-attention-zh","DashAttention 
讓稀疏長上下文可微","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779171840613-dq1r.png","2026-05-19T06:23:32.886786+00:00",{"id":76,"slug":77,"title":78,"cover_image":79,"image_url":79,"created_at":80,"category":27},"23a3d4c7-5cb7-40ae-a05b-1542364e786f","ibm-prompt-guide-turns-ai-guesses-into-outputs-zh","IBM 提示指南把猜答案變輸出","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779132863293-etob.png","2026-05-18T19:33:55.711767+00:00",{"id":82,"slug":83,"title":84,"cover_image":85,"image_url":85,"created_at":86,"category":27},"7c89c3bd-48cb-4b4e-942d-bbf0409fc392","cattle-trade-llm-bluffing-bargaining-benchmark-zh","Cattle Trade 要測 LLM 談判 bluffing","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779085437419-b0zw.png","2026-05-18T06:23:27.885037+00:00",[88,93,98,103,108,113,118,123,128,133],{"id":89,"slug":90,"title":91,"created_at":92},"f18dbadb-8c59-4723-84a4-6ad22746c77a","deepmind-bets-on-continuous-learning-ai-2026-zh","DeepMind 押注 2026 連續學習 AI","2026-03-26T08:16:02.367355+00:00",{"id":94,"slug":95,"title":96,"created_at":97},"f4a106cb-02a6-4508-8f39-9720a0a93cee","ml-papers-of-the-week-github-research-desk-zh","每週 ML 論文清單，為何紅到 GitHub","2026-03-27T01:11:39.284175+00:00",{"id":99,"slug":100,"title":101,"created_at":102},"c4f807ca-4e5f-47f1-a48c-961cf3fc44dc","ai-ml-conferences-to-watch-in-2026-zh","2026 AI 研討會投稿時程整理","2026-03-27T01:51:53.874432+00:00",{"id":104,"slug":105,"title":106,"created_at":107},"9f50561b-aebd-46ba-94a8-363198aa7091","openclaw-agents-manipulated-self-sabotage-zh","OpenClaw Agent 
會自己搞砸自己","2026-03-28T03:03:18.786425+00:00",{"id":109,"slug":110,"title":111,"created_at":112},"11f22e92-7066-4978-a544-31f5f2156ec6","vega-learning-to-drive-with-natural-language-instructions-zh","Vega：使用自然語言指示進行自駕車控制","2026-03-28T14:54:04.847912+00:00",{"id":114,"slug":115,"title":116,"created_at":117},"a4c7cfec-8d0e-4fec-93cf-1b9699a530b8","drive-my-way-en-zh","Drive My Way：個性化自駕車風格的實現","2026-03-28T14:54:26.207495+00:00",{"id":119,"slug":120,"title":121,"created_at":122},"dec02f89-fd39-41ba-8e4d-11ede93a536d","training-knowledge-bases-with-writeback-rag-zh","用 WriteBack-RAG 強化知識庫提升檢索效能","2026-03-28T14:54:45.775606+00:00",{"id":124,"slug":125,"title":126,"created_at":127},"3886be5c-a137-40cc-b9e2-0bf18430c002","packforcing-efficient-long-video-generation-method-zh","PackForcing：短影片訓練也能生成長影片","2026-03-28T14:55:02.688141+00:00",{"id":129,"slug":130,"title":131,"created_at":132},"72b90667-d930-4cc9-8ced-aaa0f8968d44","pixelsmile-toward-fine-grained-facial-expression-editing-zh","PixelSmile：提升精細臉部表情編輯的新方法","2026-03-28T14:55:20.678181+00:00",{"id":134,"slug":135,"title":136,"created_at":137},"cf046742-efb2-4753-aef9-caed5da5e32e","adaptive-block-scaled-data-types-zh","IF4：神經網路量化的聰明選擇","2026-03-31T06:00:36.990273+00:00"]