[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-8-steps-build-production-rag-with-langchain-zh":3,"article-related-8-steps-build-production-rag-with-langchain-zh":31,"series-ai-agent-37a5e429-4235-439c-9b05-bb377085462c":83},{"id":4,"slug":5,"title":6,"content":7,"summary":8,"source":9,"source_url":10,"author":11,"image_url":12,"cover_image":12,"category":13,"language":14,"translated_content":11,"related_article_id":15,"keywords":16,"key_takeaways":23,"views":27,"created_at":28,"published_at":29,"topic_cluster_id":30},"37a5e429-4235-439c-9b05-bb377085462c","8-steps-build-production-rag-with-langchain-zh","Build a Production-Ready LangChain RAG in 8 Steps","\u003Cp data-speakable=\"summary\">This guide shows how to use \u003Ca href=\"\u002Ftag\u002Flangchain\">LangChain\u003C\u002Fa>, a vector database, LangSmith, and FastAPI to build a production-grade \u003Ca href=\"\u002Ftag\u002Frag\">RAG\u003C\u002Fa> system that is deployable, traceable, and maintainable, starting from document ingestion.\u003C\u002Fp>\u003Cp>It is written for developers who have already built a basic retrieval-augmented generation prototype. By the end you will have a working pipeline that covers document ingestion, chunking, vector indexing, retrieval, observability, \u003Ca href=\"\u002Ftag\u002Fapi\">API\u003C\u002Fa> authentication, and deployment.\u003C\u002Fp>\u003Cp>The tools used in this guide are linked to their official documentation or project pages so you can check them as you go: \u003Ca href=\"https:\u002F\u002Fpython.langchain.com\u002Fdocs\u002F\">LangChain documentation\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Flangchain-ai\u002Flangchain\">LangChain GitHub repository\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fwww.trychroma.com\u002F\">Chroma\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Fchroma-core\u002Fchroma\">Chroma GitHub repository\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fsupabase.com\u002Fdocs\u002Fguides\u002Fdatabase\u002Fextensions\u002Fpgvector\">Supabase pgvector documentation\u003C\u002Fa>, \u003Ca href=\"https:\u002F\u002Fdocs.langchain.com\u002Flangsmith\u002F\">LangSmith documentation\u003C\u002Fa>, and \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Flangchain-ai\u002Flanggraph\">LangGraph GitHub repository\u003C\u002Fa>.\u003C\u002Fp>\u003Ch2>Before you start\u003C\u002Fh2>\u003Cul>\u003Cli>Python 3.11+\u003C\u002Fli>\u003Cli>Node 20+, if you also want to run a frontend or testing tools\u003C\u002Fli>\u003Cli>Docker 24+\u003C\u002Fli>\u003Cli>A LangSmith account and API key\u003C\u002Fli>\u003Cli>A Supabase account and project, if you want managed Postgres with pgvector\u003C\u002Fli>\u003Cli>An OpenAI API key, or a key from another embedding 
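and chat model provider\u003C\u002Fli>\u003Cli>Git 2.40+\u003C\u002Fli>\u003Cli>Basic familiarity with RAG, embeddings, and vector search\u003C\u002Fli>\u003C\u002Ful>\u003Ch2>Step 1: Set up the project workspace\u003C\u002Fh2>\u003Cp>Split the project into three areas from the start (ingestion, indexing, and serving) so it does not turn into a hard-to-maintain notebook prototype. When you debug later, you will be able to tell whether the problem lies in chunking, indexing, or the API.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780178597493-4hz7.png\" alt=\"Build a Production-Ready LangChain RAG in 8 Steps\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cpre>\u003Ccode>mkdir production-rag && cd production-rag\npython -m venv .venv\nsource .venv\u002Fbin\u002Factivate\n# Quote the extras so shells such as zsh do not expand the brackets\npip install langchain langchain-community langchain-openai langchain-text-splitters chromadb fastapi uvicorn langgraph langsmith supabase \"psycopg[binary]\" python-dotenv\u003C\u002Fcode>\u003C\u002Fpre>\u003Cp>The virtual environment should activate and the packages should install without errors. Then run \u003Ccode>python -c \"import langchain, chromadb, fastapi\"\u003C\u002Fcode>; if it prints no error, the workspace is ready.\u003C\u002Fp>\u003Ch2>Step 2: Load and split documents\u003C\u002Fh2>\u003Cp>The goal here is to turn the raw documents into text chunks suited to retrieval while keeping their source information. When you later need to answer \"where did this answer come from?\", that metadata is what you will use.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780178597614-8tz7.png\" alt=\"Build a Production-Ready LangChain RAG in 8 Steps\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cpre>\u003Ccode>from langchain_community.document_loaders import DirectoryLoader, TextLoader\nfrom langchain_text_splitters import RecursiveCharacterTextSplitter\n\n# TextLoader reads Markdown as plain text, so the extra \"unstructured\" dependency is not needed\nloader = DirectoryLoader(\".\u002Fdocs\", glob=\"**\u002F*.md\", loader_cls=TextLoader)\ndocs = loader.load()\n\n# Overlapping chunks keep context that would otherwise be cut at chunk boundaries\nsplitter = RecursiveCharacterTextSplitter(\n    chunk_size=800,\n    chunk_overlap=120,\n)\nchunks = splitter.split_documents(docs)\nprint(len(docs), len(chunks))\u003C\u002Fcode>\u003C\u002Fpre>\u003Cp>If your documents are longer than 800 characters, the chunk count should be larger than the document count. That means the splitter has broken long documents into units better suited to retrieval, rather than indexing each document whole.\u003C\u002Fp>\u003Ch2>Step 3: Build the vector index\u003C\u002Fh2>\u003Cp>This step embeds the documents and stores them in a vector database so the system can run semantic search. For local development, start with 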
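Chroma; for production you can switch to Supabase with pgvector, which makes deployment and access control easier to manage.\u003C\u002Fp>\u003Cpre>\u003Ccode>from langchain_openai import OpenAIEmbeddings\nfrom langchain_community.vectorstores import Chroma\n\nembeddings = OpenAIEmbeddings(model=\"text-embedding-3-small\")\n# With chromadb 0.4+, setting persist_directory writes the index to disk automatically,\n# so the deprecated vectorstore.persist() call is not needed\nvectorstore = Chroma.from_documents(\n    documents=chunks,\n    embedding=embeddings,\n    persist_directory=\".\u002Fchroma_db\",\n)\u003C\u002Fcode>\u003C\u002Fpre>\u003Cp>You should see the local \u003Ccode>chroma_db\u003C\u002Fcode> folder created, or rows written to the pgvector table if you use Supabase. Then run one similarity search; if it returns semantically related passages rather than random text, the index is working.\u003C\u002Fp>\u003Ch2>Step 4: Connect retrieval and answer generation\u003C\u002Fh2>\u003Cp>Build the smallest usable RAG flow first and confirm that the core behavior, retrieve first and then answer, actually holds. Later optimizations only make sense once this backbone is correct.\u003C\u002Fp>\u003Cpre>\u003Ccode># Return the 4 most similar chunks for each query\nretriever = vectorstore.as_retriever(search_kwargs={\"k\": 4})\nquery = \"What is hybrid search in RAG?\"\nresults = retriever.invoke(query)\nprint(results[0].page_content[:200])\u003C\u002Fcode>\u003C\u002Fpre>\u003Cp>The most relevant chunk should be the first result. If its content clearly matches the topic of the query, the embedding, chunking, and indexing choices are roughly aligned and you can move on to observability and deployment.\u003C\u002Fp>\u003Ch2>Step 5: Turn on observability and tracing\u003C\u002Fh2>\u003Cp>In production, a retrieval failure often raises no error; it just returns a wrong answer. The output of this step is a complete trace in LangSmith where you can inspect the prompt, the retrieved results, latency, and the final response.\u003C\u002Fp>\u003Cpre>\u003Ccode>import os\nos.environ[\"LANGSMITH_TRACING\"] = \"true\"\nos.environ[\"LANGSMITH_API_KEY\"] = \"your-key\"  # placeholder, do not commit a real key\nos.environ[\"LANGSMITH_PROJECT\"] = \"production-rag\"\n\n# Run the chain once, then inspect the trace in LangSmith.\u003C\u002Fcode>\u003C\u002Fpre>\u003Cp>A new run should appear in the LangSmith dashboard, including the retriever output and the model response. This finds the failing step faster than reading console logs alone.\u003C\u002Fp>\u003Ch2>Step 6: Add authentication and expose an API\u003C\u002Fh2>\u003Cp>The goal is to wrap the RAG pipeline as a deployable service and reject unauthorized requests up front. For a production launch, authentication, input validation, and environment variable management are not optional.\u003C\u002Fp>\u003Cpre>\u003Ccode>import os\n\nfrom fastapi import FastAPI, Header, HTTPException\n\napp = FastAPI()\nAPI_TOKEN = os.getenv(\"RAG_API_TOKEN\")\n\n@app.get(\"\u002Fanswer\")\ndef answer(q: str, authorization: str = 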
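Header(default=\"\")):\n    # Also reject when RAG_API_TOKEN is unset; otherwise the literal \"Bearer None\" would pass\n    if not API_TOKEN or authorization != f\"Bearer {API_TOKEN}\":\n        raise HTTPException(status_code=401, detail=\"Unauthorized\")\n    return {\"query\": q, \"status\": \"ok\"}\u003C\u002Fcode>\u003C\u002Fpre>\u003Cp>You should be able to start the service with Uvicorn and get a 401 without a \u003Ca href=\"\u002Ftag\u002Ftoken\">token\u003C\u002Fa> and a 200 with a valid one. That confirms the API gate runs before any retrieval logic.\u003C\u002Fp>\u003Ch2>Step 7: Tune hybrid search and the token budget\u003C\u002Fh2>\u003Cp>This step makes retrieval more robust, because semantic search alone sometimes misses proper nouns, model numbers, or code symbols. Hybrid search adds a keyword signal, and a token budget keeps the context from overflowing and avoids wasted model spend.\u003C\u002Fp>\u003Cpre>\u003Ccode># Suggested flow\n# 1. Run vector similarity search\n# 2. Run keyword or BM25 search\n# 3. Merge and rerank the results\n# 4. Keep only the context that fits the token budget\u003C\u002Fcode>\u003C\u002Fpre>\u003Cp>Questions that contain product names, code symbols, or rare terms should get noticeably better answers. If the final prompt sent to the model stays below the context limit, the budget control is working.\u003C\u002Fp>\u003Ch2>Step 8: Orchestrate multi-path flows with LangGraph\u003C\u002Fh2>\u003Cp>The last step upgrades the single-path RAG into a flow that can branch, retry, and do multi-hop reasoning. Different kinds of questions can then take different nodes instead of being forced through the same chain.\u003C\u002Fp>\u003Cpre>\u003Ccode>from langgraph.graph import StateGraph\n\n# Define the retrieve -> grade -> refine -> answer nodes\n# Add a fallback search branch or a multimodal document branch\n# Compile the graph, then test each path\u003C\u002Fcode>\u003C\u002Fpre>\u003Cp>Different kinds of queries, such as plain factual questions, multi-hop questions, or questions about images in documents, should trigger different paths. That shows the system is no longer limited to a single retrieval pass and can handle more advanced production behavior.\u003C\u002Fp>\u003Ctable>\u003Cthead>\u003Ctr>\u003Cth>Metric\u003C\u002Fth>\u003Cth>Baseline (before)\u003C\u002Fth>\u003Cth>Result (after)\u003C\u002Fth>\u003C\u002Ftr>\u003C\u002Fthead>\u003Ctbody>\u003Ctr>\u003Ctd>Answer traceability\u003C\u002Ftd>\u003Ctd>Manual log digging only\u003C\u002Ftd>\u003Ctd>LangSmith traces show the retrieved passages directly\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Retrieval quality\u003C\u002Ftd>\u003Ctd>Vector search only\u003C\u002Ftd>\u003Ctd>Hybrid search combining keyword and semantic signals\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Deployment readiness\u003C\u002Ftd>\u003Ctd>Notebook prototype\u003C\u002Ftd>\u003Ctd>FastAPI service with token authentication\u003C\u002Ftd>\u003C\u002Ftr>\u003C\u002Ftbody>\u003C\u002Ftable>\u003Ch2>Common mistakes\u003C\u002Fh2>\u003Cul>\u003Cli>Chunks are too large. The fix is to lower chunk_size and test against a few real questions before settling on a value.\u003C\u002Fli>\u003Cli>No observability. The fix is to enable LangSmith 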
tracing before you start tuning prompts.\u003C\u002Fli>\u003Cli>Stuffing every chunk into the prompt. The fix is to add top-k, reranking, and a strict token budget.\u003C\u002Fli>\u003C\u002Ful>\u003Ch2>What to read next\u003C\u002Fh2>\u003Cp>From here you can extend into reranking, evaluation datasets, multimodal retrieval, and agentic fallback flows so that your RAG system can handle harder questions and messier data. For fuller treatments, look into GraphRAG, ColPali-style multimodal retrieval, and production deployment.\u003C\u002Fp>","This guide shows how to use LangChain, a vector database, LangSmith, and FastAPI to build a production-grade RAG system that is deployable, traceable, and maintainable, starting from document ingestion.","www.freecodecamp.org","https:\u002F\u002Fwww.freecodecamp.org\u002Fnews\u002Fproduction-rag-with-langchain-vector-databases\u002F",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780178597493-4hz7.png","ai-agent","zh","1b25f514-9ed1-4c6f-b9d7-f56eb34033f5",[17,18,19,20,21,22],"LangChain","RAG","向量資料庫","LangSmith","FastAPI","LangGraph",[24,25,26],"Split ingestion, chunking, and indexing into separate steps so that later debugging is easier.","Set up observability and authentication early; otherwise errors are hard to locate after launch.","Hybrid search and a token budget are key to the stability of production-grade RAG.",4,"2026-05-30T22:02:48.14022+00:00","2026-05-30T22:02:48.121+00:00","e3b68196-9e64-4c18-a3b6-a73e73bfb367",{"tags":32,"relatedLang":42,"relatedPosts":46},[33,35,37,39,41],{"name":18,"slug":34},"rag",{"name":20,"slug":36},"langsmith",{"name":17,"slug":38},"langchain",{"name":21,"slug":40},"fastapi",{"name":19,"slug":19},{"id":15,"slug":43,"title":44,"language":45},"build-production-rag-with-langchain-in-8-steps-en","Build Production RAG with LangChain in 8 Steps","en",[47,53,59,65,71,77],{"id":48,"slug":49,"title":50,"cover_image":51,"image_url":51,"created_at":52,"category":13},"ef96a410-24bd-4e35-8536-439f21f820e6","claude-code-dynamic-workflow-ai-harness-zh","Claude Code 動態工作流：AI 自寫 Harness","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781035378200-qkm9.png","2026-06-09T20:02:21.942031+00:00",{"id":54,"slug":55,"title":56,"cover_image":57,"image_url":57,"created_at":58,"category":13},"9fb91fbe-64cd-4732-aba7-5b20daacf962","agent-orchestration-enterprise-ai-layer-zh","企業 AI 
缺的是編排層","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780984981291-rodj.png","2026-06-09T06:02:30.929215+00:00",{"id":60,"slug":61,"title":62,"cover_image":63,"image_url":63,"created_at":64,"category":13},"2e389faa-a4ab-4f7a-b6da-c2ba69d5f14b","ai-agents-use-blockchain-trust-layer-zh","AI 代理用區塊鏈當信任層","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780980509390-6s0i.png","2026-06-09T04:48:01.259033+00:00",{"id":66,"slug":67,"title":68,"cover_image":69,"image_url":69,"created_at":70,"category":13},"1c433948-634b-47e4-a119-dd567203a712","8-rag-patterns-demos-into-prod-zh","8 種 RAG 模式把 Demo 變上線","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780971552397-h12o.png","2026-06-09T02:18:36.130013+00:00",{"id":72,"slug":73,"title":74,"cover_image":75,"image_url":75,"created_at":76,"category":13},"7d860405-aca6-486b-8de0-1c5193a3b06d","fine-tuning-beats-rag-style-not-facts-zh","當目標是文風不是事實時，微調比 RAG 更有效","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780924689232-5elu.png","2026-06-08T13:17:25.235242+00:00",{"id":78,"slug":79,"title":80,"cover_image":81,"image_url":81,"created_at":82,"category":13},"3d1e5ef7-8f31-4e57-b286-306825d7f38e","openclaw-small-business-ai-staff-zh","OpenClaw把AI變成夜班員工","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780904888882-6w0v.png","2026-06-08T07:47:27.229503+00:00",[84,89,94,99,104,109,114,119,124,129],{"id":85,"slug":86,"title":87,"created_at":88},"4ae1e197-1d3d-4233-8733-eafe9cb6438b","claude-now-uses-your-pc-to-finish-tasks-zh","Claude 
開始幫你操作電腦","2026-03-26T07:20:48.457387+00:00",{"id":90,"slug":91,"title":92,"created_at":93},"5bede67f-e21c-413d-9ab8-54a3c3d26227","googles-2026-ai-agent-report-decoded-zh","Google 2026 AI Agent 報告解讀","2026-03-26T11:15:22.651956+00:00",{"id":95,"slug":96,"title":97,"created_at":98},"2987d097-563f-46c7-b76f-b558d8ef7c2b","kimi-k25-review-stronger-still-not-legend-zh","Kimi K2.5 評測：更強，但還不是神作","2026-03-27T07:15:55.277513+00:00",{"id":100,"slug":101,"title":102,"created_at":103},"95c9053b-e3f4-4cb5-aace-5c54f4c9e044","claude-code-controls-mac-desktop-zh","Claude Code 也能操控 Mac 了","2026-03-28T03:01:58.58121+00:00",{"id":105,"slug":106,"title":107,"created_at":108},"dc58e153-e3a8-4c06-9b96-1aa64eabbf5f","cloudflare-100x-faster-ai-agent-sandbox-zh","Cloudflare 的 AI 沙箱跑超快","2026-03-28T03:09:44.142236+00:00",{"id":110,"slug":111,"title":112,"created_at":113},"1c8afc56-253f-47a2-979f-1065ff072f2a","openai-backs-isara-agent-swarm-bet-zh","OpenAI 挺 Isara 的 agent swarm …","2026-03-28T03:15:27.513155+00:00",{"id":115,"slug":116,"title":117,"created_at":118},"7379b422-576e-45df-ad5a-d57a0d9dd467","openai-plan-automated-ai-researcher-zh","OpenAI 想做自動化 AI 研究員","2026-03-28T03:17:42.090548+00:00",{"id":120,"slug":121,"title":122,"created_at":123},"48c9889e-86df-450b-a356-e4a4b7c83c5b","harness-engineering-ai-agent-reliability-2026-zh","駕馭工程：從「馬具」到「作業系統」，AI Agent 可靠性的終極密碼","2026-03-31T06:42:53.556721+00:00",{"id":125,"slug":126,"title":127,"created_at":128},"96d8e8c8-1edd-475d-9145-b1e7a1b02b65","mcp-explained-from-prompts-to-production-zh","MCP 怎麼把提示詞變工作流","2026-04-01T09:24:39.321274+00:00",{"id":130,"slug":131,"title":132,"created_at":133},"f2ca7720-b471-4ce5-9336-2a9ac2a876fd","amazon-bedrock-agents-multi-agent-workflows-zh","Amazon Bedrock Agents 進入多代理工作流","2026-04-01T09:30:29.945429+00:00"]