[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-how-to-build-harness-for-ai-agents-zh":3,"article-related-how-to-build-harness-for-ai-agents-zh":31,"series-ai-agent-97bb6252-5422-45e5-ad39-8e541ce6a4ae":81},{"id":4,"slug":5,"title":6,"content":7,"summary":8,"source":9,"source_url":10,"author":11,"image_url":12,"cover_image":12,"category":13,"language":14,"translated_content":11,"related_article_id":15,"keywords":16,"key_takeaways":23,"views":27,"created_at":28,"published_at":29,"topic_cluster_id":30},"97bb6252-5422-45e5-ad39-8e541ce6a4ae","how-to-build-harness-for-ai-agents-zh","如何打造 AI Agent Harness","\u003Cp data-speakable=\"summary\">這篇教你把 \u003Ca href=\"\u002Ftag\u002Fai-agent\">AI Agent\u003C\u002Fa> 的模型、工具、驗證和狀態包進一個可控的 harness，做出可測試、可重試、可擴充的代理流程。\u003C\u002Fp>\u003Cp>這篇給想超越 prompt tuning 的開發者看。照著做完，你會得到一個可運作的 \u003Ca href=\"\u002Ftag\u002Fagent\">agent\u003C\u002Fa> harness 藍圖，知道模型負責什麼、控制層負責什麼，還能把每一步驗收清楚。\u003C\u002Fp>\u003Cp>核心觀念很單純：Agent 不是只有模型，而是模型外面再包一層 harness，去決定它看得到什麼、能做什麼、結果怎麼驗證。這種切法會讓除錯更直接，也更適合放進正式環境。\u003C\u002Fp>\u003Ch2>開始之前\u003C\u002Fh2>\u003Cul>\u003Cli>OpenAI 或 Anthropic 帳號，並準備好有效 API key。\u003C\u002Fli>\u003Cli>Node 20+ 或 Python 3.11+。\u003C\u002Fli>\u003Cli>Git 2.40+。\u003C\u002Fli>\u003Cli>終端機與程式編輯器。\u003C\u002Fli>\u003Cli>熟悉 JSON、HTTP request、function calling。\u003C\u002Fli>\u003Cli>可選：\u003Ca href=\"https:\u002F\u002Fplatform.openai.com\u002Fdocs\">OpenAI 文件\u003C\u002Fa>與\u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fopenai\u002Fopenai-openapi\">OpenAI GitHub repo\u003C\u002Fa>，或你選用模型供應商的對應文件與 SDK。\u003C\u002Fli>\u003C\u002Ful>\u003Ch2>Step 1: 定義 Agent 邊界\u003C\u002Fh2>\u003Cp>目的：先把模型和 harness 分開，之後才有辦法獨立調整行為、除錯與權限。\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg 
src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779844559339-17vc.png\" alt=\"如何打造 AI Agent Harness\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>請先寫出三個區塊：環境輸入、harness 規則、模型輸出。這一步的具名產出是「邊界圖」。\u003C\u002Fp>\u003Cpre>\u003Ccode>Environment -> Harness -> Model -> Harness -> Tools\u002FChecks -> Final Output\u003C\u002Fcode>\u003C\u002Fpre>\u003Cp>驗收：你應該能用一句話說出哪個元件可以呼叫 \u003Ca href=\"\u002Ftag\u002Fapi\">API\u003C\u002Fa>、哪個元件負責記憶、哪個元件決定結果能不能放行。如果說不清楚，邊界還不夠明確。\u003C\u002Fp>\u003Ch2>Step 2: 設計 Observation Schema\u003C\u002Fh2>\u003Cp>目的：控制模型每一輪能看見什麼，避免把一大坨原始對話直接丟給它。\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779844558418-a54y.png\" alt=\"如何打造 AI Agent Harness\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>請把輸入整理成固定 JSON。這一步的具名產出是「Observation Schema」。\u003C\u002Fp>\u003Cpre>\u003Ccode>{\n  \"goal\": \"summarize invoice\",\n  \"messages\": [\"...\"],\n  \"tool_results\": [],\n  \"constraints\": [\"no PII\", \"return JSON\"]\n}\u003C\u002Fcode>\u003C\u002Fpre>\u003Cp>驗收：你應該看到每一輪都用同一組 key，即使 value 改變也不會亂掉。這代表 harness 已經在控制觀測，而不是讓 prompt 自己長大。\u003C\u002Fp>\u003Ch2>Step 3: 註冊 Allowed Actions\u003C\u002Fh2>\u003Cp>目的：限制 agent 能做的事，只留下可審核、可映射到真實 API 的動作。\u003C\u002Fp>\u003Cp>請建立工具登錄表，列出名稱、輸入格式、權限規則。這一步的具名產出是「Tool Registry」。\u003C\u002Fp>\u003Cpre>\u003Ccode>tools:\n  - name: search_docs\n    input: { query: string }\n  - name: fetch_record\n    input: { id: string }\n  - name: submit_answer\n    input: { text: string }\u003C\u002Fcode>\u003C\u002Fpre>\u003Cp>驗收：你應該能拒絕任何不在登錄表中的 action。若模型要求不存在的工具，harness 要回傳受控錯誤，而不是直接執行。\u003C\u002Fp>\u003Ch2>Step 4: 加入 Validation 與 
Retry\u003C\u002Fh2>\u003Cp>目的：在結果被接受前先檢查結構、政策與任務規則，避免第一個答案就直接進入系統。\u003C\u002Fp>\u003Cp>請實作驗證流程，包含 JSON 格式、禁用內容、任務條件，然後把失敗原因回灌給模型重試。這一步的具名產出是「Validation Pipeline」。\u003C\u002Fp>\u003Cpre>\u003Ccode>if not valid_json(output) or not passes_policy(output):\n    retry_with_error_context()\u003C\u002Fcode>\u003C\u002Fpre>\u003Cp>驗收：你應該看到 malformed response 變少，且失敗不再默默吞掉。最好同時記錄 accepted 與 rejected，方便之後做回歸測試。\u003C\u002Fp>\u003Ch2>Step 5: 建立 State 與 Memory\u003C\u002Fh2>\u003Cp>目的：只保留對任務有用的狀態，避免把整段聊天史都當成記憶。\u003C\u002Fp>\u003Cp>請把持久狀態和暫時上下文分開，並且只在驗證通過後更新。這一步的具名產出是「Session State」。\u003C\u002Fp>\u003Cp>驗收：你應該看到 agent 在多輪互動下行為更穩定，因為狀態由 harness 接手，不再靠模型自己重建背景。到這裡，你已經有一個基本公式：模型負責推理，harness 負責控制。\u003C\u002Fp>\u003Ch2>常見錯誤\u003C\u002Fh2>\u003Cul>\u003Cli>\u003Cstrong>把所有工作都丟給 prompt。\u003C\u002Fstrong> 修法：把工具選擇、schema 檢查、重試流程移到 harness，prompt 只保留推理所需資訊。\u003C\u002Fli>\u003Cli>\u003Cstrong>給模型太多上下文。\u003C\u002Fstrong> 修法：改用精簡的 observation schema，並在每輪前摘要舊狀態。\u003C\u002Fli>\u003Cli>\u003Cstrong>直接相信第一次輸出。\u003C\u002Fstrong> 修法：先驗證結構與政策，再決定接受、重試或升級處理。\u003C\u002Fli>\u003C\u002Ful>\u003Ch2>接下來可以看什麼\u003C\u002Fh2>\u003Cp>如果這個 harness 已經跑起來，下一步可以加 evaluation scripts、tracing、sandboxed tool execution，讓你能持續量測可靠度，並把 agent 推進到 production 等級。\u003C\u002Fp>","這篇教你把 AI Agent 的模型、工具、驗證和狀態包進一個可控的 harness，做出可測試、可重試、可擴充的代理流程。","zhuanlan.zhihu.com","https:\u002F\u002Fzhuanlan.zhihu.com\u002Fp\u002F2036738130649330427",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779844559339-17vc.png","ai-agent","zh","171dca3d-20d5-4205-a20d-406cd426fc6d",[17,18,19,20,21,22],"AI agent","harness","function calling","JSON schema","validation","state management",[24,25,26],"先切清楚模型與控制層的邊界，再談 agent 行為。","用固定 observation schema 和 tool registry，讓輸入與動作都可控。","加上 validation、retry、state，才能把 agent 
變成可維護系統。",7,"2026-05-27T01:15:28.081189+00:00","2026-05-27T01:15:28.064+00:00","e3b68196-9e64-4c18-a3b6-a73e73bfb367",{"tags":32,"relatedLang":40,"relatedPosts":44},[33,34,35,36,38],{"name":21,"slug":21},{"name":18,"slug":18},{"name":17,"slug":13},{"name":20,"slug":37},"json-schema",{"name":19,"slug":39},"function-calling",{"id":15,"slug":41,"title":42,"language":43},"how-to-build-harness-for-ai-agents-en","How to Build a Harness for AI Agents","en",[45,51,57,63,69,75],{"id":46,"slug":47,"title":48,"cover_image":49,"image_url":49,"created_at":50,"category":13},"ef96a410-24bd-4e35-8536-439f21f820e6","claude-code-dynamic-workflow-ai-harness-zh","Claude Code 動態工作流：AI 自寫 Harness","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781035378200-qkm9.png","2026-06-09T20:02:21.942031+00:00",{"id":52,"slug":53,"title":54,"cover_image":55,"image_url":55,"created_at":56,"category":13},"9fb91fbe-64cd-4732-aba7-5b20daacf962","agent-orchestration-enterprise-ai-layer-zh","企業 AI 缺的是編排層","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780984981291-rodj.png","2026-06-09T06:02:30.929215+00:00",{"id":58,"slug":59,"title":60,"cover_image":61,"image_url":61,"created_at":62,"category":13},"2e389faa-a4ab-4f7a-b6da-c2ba69d5f14b","ai-agents-use-blockchain-trust-layer-zh","AI 代理用區塊鏈當信任層","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780980509390-6s0i.png","2026-06-09T04:48:01.259033+00:00",{"id":64,"slug":65,"title":66,"cover_image":67,"image_url":67,"created_at":68,"category":13},"1c433948-634b-47e4-a119-dd567203a712","8-rag-patterns-demos-into-prod-zh","8 種 RAG 模式把 Demo 
變上線","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780971552397-h12o.png","2026-06-09T02:18:36.130013+00:00",{"id":70,"slug":71,"title":72,"cover_image":73,"image_url":73,"created_at":74,"category":13},"7d860405-aca6-486b-8de0-1c5193a3b06d","fine-tuning-beats-rag-style-not-facts-zh","當目標是文風不是事實時，微調比 RAG 更有效","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780924689232-5elu.png","2026-06-08T13:17:25.235242+00:00",{"id":76,"slug":77,"title":78,"cover_image":79,"image_url":79,"created_at":80,"category":13},"3d1e5ef7-8f31-4e57-b286-306825d7f38e","openclaw-small-business-ai-staff-zh","OpenClaw把AI變成夜班員工","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780904888882-6w0v.png","2026-06-08T07:47:27.229503+00:00",[82,87,92,97,102,107,112,117,122,127],{"id":83,"slug":84,"title":85,"created_at":86},"4ae1e197-1d3d-4233-8733-eafe9cb6438b","claude-now-uses-your-pc-to-finish-tasks-zh","Claude 開始幫你操作電腦","2026-03-26T07:20:48.457387+00:00",{"id":88,"slug":89,"title":90,"created_at":91},"5bede67f-e21c-413d-9ab8-54a3c3d26227","googles-2026-ai-agent-report-decoded-zh","Google 2026 AI Agent 報告解讀","2026-03-26T11:15:22.651956+00:00",{"id":93,"slug":94,"title":95,"created_at":96},"2987d097-563f-46c7-b76f-b558d8ef7c2b","kimi-k25-review-stronger-still-not-legend-zh","Kimi K2.5 評測：更強，但還不是神作","2026-03-27T07:15:55.277513+00:00",{"id":98,"slug":99,"title":100,"created_at":101},"95c9053b-e3f4-4cb5-aace-5c54f4c9e044","claude-code-controls-mac-desktop-zh","Claude Code 也能操控 Mac 了","2026-03-28T03:01:58.58121+00:00",{"id":103,"slug":104,"title":105,"created_at":106},"dc58e153-e3a8-4c06-9b96-1aa64eabbf5f","cloudflare-100x-faster-ai-agent-sandbox-zh","Cloudflare 的 AI 
沙箱跑超快","2026-03-28T03:09:44.142236+00:00",{"id":108,"slug":109,"title":110,"created_at":111},"1c8afc56-253f-47a2-979f-1065ff072f2a","openai-backs-isara-agent-swarm-bet-zh","OpenAI 挺 Isara 的 agent swarm …","2026-03-28T03:15:27.513155+00:00",{"id":113,"slug":114,"title":115,"created_at":116},"7379b422-576e-45df-ad5a-d57a0d9dd467","openai-plan-automated-ai-researcher-zh","OpenAI 想做自動化 AI 研究員","2026-03-28T03:17:42.090548+00:00",{"id":118,"slug":119,"title":120,"created_at":121},"48c9889e-86df-450b-a356-e4a4b7c83c5b","harness-engineering-ai-agent-reliability-2026-zh","駕馭工程：從「馬具」到「作業系統」，AI Agent 可靠性的終極密碼","2026-03-31T06:42:53.556721+00:00",{"id":123,"slug":124,"title":125,"created_at":126},"96d8e8c8-1edd-475d-9145-b1e7a1b02b65","mcp-explained-from-prompts-to-production-zh","MCP 怎麼把提示詞變工作流","2026-04-01T09:24:39.321274+00:00",{"id":128,"slug":129,"title":130,"created_at":131},"f2ca7720-b471-4ce5-9336-2a9ac2a876fd","amazon-bedrock-agents-multi-agent-workflows-zh","Amazon Bedrock Agents 進入多代理工作流","2026-04-01T09:30:29.945429+00:00"]