[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-how-to-build-a-rag-pipeline-in-5-steps-zh":3,"tags-how-to-build-a-rag-pipeline-in-5-steps-zh":34,"related-lang-how-to-build-a-rag-pipeline-in-5-steps-zh":44,"related-posts-how-to-build-a-rag-pipeline-in-5-steps-zh":48,"series-ai-agent-e133ed69-fb56-495d-96f6-1e14d7ac3242":85},{"id":4,"title":5,"content":6,"summary":7,"source":8,"source_url":9,"author":10,"image_url":11,"keywords":12,"language":18,"translated_content":10,"views":19,"is_premium":20,"created_at":21,"updated_at":21,"cover_image":11,"published_at":22,"rewrite_status":23,"rewrite_error":10,"rewritten_from_id":24,"slug":25,"category":26,"related_article_id":27,"status":28,"google_indexed_at":29,"x_posted_at":10,"tweet_text":10,"title_rewritten_at":10,"title_original":10,"key_takeaways":30,"topic_cluster_id":10,"embedding":10,"is_canonical_seed":20},"e133ed69-fb56-495d-96f6-1e14d7ac3242","5 步完成 RAG 管線","\u003Cp data-speakable=\"summary\">這篇教你用 5 個步驟做出 \u003Ca href=\"\u002Fnews\u002Fwhat-rag-is-and-why-it-matters-zh\">RAG\u003C\u002Fa> 管線，讓模型先檢索你的文件，再根據內容產生有依據的答案。\u003C\u002Fp>\u003Cp>這篇給想把 \u003Ca href=\"\u002Fnews\u002Fhow-to-build-vintage-llm-testbed-5-steps-zh\">LLM\u003C\u002Fa> 接到自家文件的開發者看。照著做完，你會得到一條可運作的 \u003Ca href=\"\u002Ftag\u002Frag\">RAG\u003C\u002Fa> 流程，能匯入文件、切分段落、產生 embeddings、查回相關內容，最後生成有來源依據的回答。\u003C\u002Fp>\u003Cp>你也會有一套逐步驗收的方法，方便在上線前檢查檢索品質、延遲與資料新鮮度。\u003C\u002Fp>\u003Ch2>開始之前\u003C\u002Fh2>\u003Cul>\u003Cli>Node 20+ 或 Python 3.11+\u003C\u002Fli>\u003Cli>OpenAI API key，或其他聊天模型 API key\u003C\u002Fli>\u003Cli>Embedding 模型帳號，或本地 embedding runtime\u003C\u002Fli>\u003Cli>向量資料庫，例如 Pinecone、Weaviate、Chroma 或 pgvector\u003C\u002Fli>\u003Cli>文件來源，例如 PDF、Markdown、網頁或資料庫\u003C\u002Fli>\u003Cli>可讀取 \u003Ca href=\"https:\u002F\u002Fplatform.openai.com\u002Fdocs\">OpenAI docs\u003C\u002Fa> 與 \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Flangchain-ai\u002Flangchain\">LangChain GitHub repo\u003C\u002Fa> 
的權限，若你要沿用本文示例 stack\u003C\u002Fli>\u003C\u002Ful>\u003Ch2>Step 1: 準備文件語料\u003C\u002Fh2>\u003Cp>目的：建立一份可被檢索的可信資料來源。\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1777959047822-j4yr.png\" alt=\"5 步完成 RAG 管線\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>先把你要讓模型信任的內容收集起來。客服機器人可用產品文件與 FAQ；法務或醫療工具則只應使用已核准的內部資料。把內容轉成純文字、去重，並把長文件切成較小區塊，讓檢索器能回傳精準段落，而不是整頁內容。\u003C\u002Fp>\u003Cpre>\u003Ccode>from langchain_text_splitters import RecursiveCharacterTextSplitter\n\ntext_splitter = RecursiveCharacterTextSplitter(\n    chunk_size=800,\n    chunk_overlap=120,\n)\nchunks = text_splitter.split_text(long_document_text)\nprint(len(chunks))\u003C\u002Fcode>\u003C\u002Fpre>\u003Cp>你應該看到 chunk 數量大於 1，而且每個 chunk 都像完整段落。如果 chunk 太大，檢索會變吵；如果太小，回答可能失去上下文。\u003C\u002Fp>\u003Ch2>Step 2: 產生每段 embeddings\u003C\u002Fh2>\u003Cp>目的：把文字轉成能表達語意的向量，而不是只看關鍵字。\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1777959051895-tmvz.png\" alt=\"5 步完成 RAG 管線\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>把每個 chunk 送進 embedding 模型，讓語意相近的段落在向量空間中靠近。這是檢索品質的基礎。索引與查詢時請使用同一個 embedding 模型，否則相似度搜尋會變得不可靠。\u003C\u002Fp>\u003Cp>同時保存向量、原始 chunk 文字，以及 title、URL、section、timestamp 等 metadata。這些資訊能幫你追溯答案來源，也能依文件類型或新鮮度過濾結果。\u003C\u002Fp>\u003Cp>你應該看到每個 chunk 對應一個固定長度向量，通常是數字陣列或 float array。如果這一步失敗，先檢查文字是否為空，再確認模型維度是否符合向量資料庫需求。\u003C\u002Fp>\u003Ch2>Step 3: 建立向量索引\u003C\u002Fh2>\u003Cp>目的：讓知識庫可以用相似度快速搜尋。\u003C\u002Fp>\u003Cp>把 chunk embeddings 匯入向量資料庫，建立適合最近鄰搜尋的 index。這樣系統就能在毫秒內拿使用者問題對照已存內容，而不是掃描所有文件。若你的應用服務多個團隊，可加上 source、language、tenant 等篩選條件。\u003C\u002Fp>\u003Cp>實作流程是先把每個 chunk 的 vector、text 與 metadata 一起 upsert，接著確認 index 已可供查詢。若你用 pgvector，先建立 vector 欄位與 similarity index；若你用代管服務，先確認 namespace 
或 collection 名稱正確再匯入。\u003C\u002Fp>\u003Cp>你應該看到索引中的 record 數量與已準備的 chunks 一致。做一個快速查詢後，回傳內容應該是最相關的段落，而不是隨機文字。\u003C\u002Fp>\u003Ch2>Step 4: 取回相關上下文\u003C\u002Fh2>\u003Cp>目的：在 \u003Ca href=\"\u002Ftag\u002Fllm\">LLM\u003C\u002Fa> 寫答案前，先抓到最有支撐力的段落。\u003C\u002Fp>\u003Cp>當使用者提問時，先用同一個模型把問題轉成 embedding，再對向量索引做 similarity search。回傳 top-k chunks，通常是 3 到 8 段；若你更重視精準度，也可以再用 cross-encoder 或 LLM scorer 重新排序。\u003C\u002Fp>\u003Cp>要持續觀察檢索品質。如果第一名結果只算勉強相關，請改善 chunk 切法、加上 metadata filters，或補強文件語料。RAG 的品質關鍵在檢索，因為生成器只能依據拿到的內容回答。\u003C\u002Fp>\u003Cp>你應該看到一小串明顯符合使用者意圖的段落。如果段落離題，就算生成步驟很強，答案也很可能偏弱。\u003C\u002Fp>\u003Ch2>Step 5: 組合提示詞並生成答案\u003C\u002Fh2>\u003Cp>目的：輸出有依據、可追溯的回答。\u003C\u002Fp>\u003Cp>建立一個 prompt，把使用者問題、檢索到的 chunks，以及明確規則一起放進去，例如盡量只根據提供的 context 作答。接著把這個 prompt 送給 LLM，並要求它在資訊不足時直接說明，而不是自行補造事實。這能降低 hallucination，也讓系統更容易被信任。\u003C\u002Fp>\u003Cpre>\u003Ccode>prompt = f\"\"\"\nUse only the context below to answer the question.\nIf the context is insufficient, say so.\n\nContext:\n{retrieved_context}\n\nQuestion:\n{user_question}\n\"\"\"\n\nresponse = llm.invoke(prompt)\nprint(response.content)\u003C\u002Fcode>\u003C\u002Fpre>\u003Cp>你應該看到答案有反映檢索到的內容，最好還能提到來源或關鍵事實。請用一個明確有收錄在語料中的問題，和一個沒有收錄的問題測試；前者應該正確，後者應該禮貌地表示上下文不足。\u003C\u002Fp>\u003Ch2>常見錯誤\u003C\u002Fh2>\u003Cul>\u003Cli>查詢與文件使用了不同的 embedding 模型。修法：索引與檢索固定同一模型，若更換模型就重新嵌入整份語料。\u003C\u002Fli>\u003Cli>chunk 切太大或太小。修法：先從 500 到 1,000 字元加上 overlap 開始，再依檢索結果調整。\u003C\u002Fli>\u003Cli>略過資料更新。修法：加排程重建索引，或做增量更新，讓新文件與修訂內容能進入 vector store。\u003C\u002Fli>\u003C\u002Ful>\u003Ch2>接下來可以看什麼\u003C\u002Fh2>\u003Cp>當基本管線跑通後，可以再加 citations、reranking、cache layer，以及針對答案忠實度與檢索召回率的評估測試。下一步也能把同一套模式延伸到 chat memory、tool use，或支援客服、搜尋與內部知識庫的專屬助理。\u003C\u002Fp>","這篇教你用 5 個步驟做出 RAG 
管線，讓模型先檢索你的文件，再根據內容產生有依據的答案。","www.geeksforgeeks.org","https:\u002F\u002Fwww.geeksforgeeks.org\u002Fnlp\u002Fwhat-is-retrieval-augmented-generation-rag\u002F",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1777959047822-j4yr.png",[13,14,15,16,17],"RAG","embeddings","vector database","LangChain","OpenAI API","zh",2,false,"2026-05-05T05:30:30.368078+00:00","2026-05-05T05:30:30.176+00:00","done","b650511e-86f0-4505-9a39-e6f94a51f16e","how-to-build-a-rag-pipeline-in-5-steps-zh","ai-agent","95ec8193-dee3-4ec5-93db-89f285d07612","published","2026-05-05T09:00:17.724+00:00",[31,32,33],"先把文件切成可檢索的 chunk，再做 embeddings 與向量索引。","查詢時用同一個 embedding 模型找回 top-k 上下文，最後再交給 LLM 生成。","用 chunk 大小、metadata filters、重新索引與測試題組來維持品質與新鮮度。",[35,37,39,41,42],{"name":13,"slug":36},"rag",{"name":17,"slug":38},"openai-api",{"name":16,"slug":40},"langchain",{"name":14,"slug":14},{"name":15,"slug":43},"vector-database",{"id":27,"slug":45,"title":46,"language":47},"how-to-build-a-rag-pipeline-in-5-steps-en","How to Build a RAG Pipeline in 5 Steps","en",[49,55,61,67,73,79],{"id":50,"slug":51,"title":52,"cover_image":53,"image_url":53,"created_at":54,"category":26},"38406a12-f833-4c69-ae22-99c31f03dd52","switch-ai-outputs-markdown-to-html-zh","怎麼把 AI 輸出改成 HTML","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778743243861-8901.png","2026-05-14T07:20:21.545364+00:00",{"id":56,"slug":57,"title":58,"cover_image":59,"image_url":59,"created_at":60,"category":26},"c7c69fe4-97e3-4edf-a9d6-a79d0c4495b4","anthropic-cat-wu-proactive-ai-assistants-zh","Cat Wu 談 Claude 的主動式 
AI","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778735455993-gnw7.png","2026-05-14T05:10:30.453046+00:00",{"id":62,"slug":63,"title":64,"cover_image":65,"image_url":65,"created_at":66,"category":26},"e1d6acda-fa49-4514-aa75-709504be9f93","how-to-run-hermes-agent-on-discord-zh","如何在 Discord 執行 Hermes Agent","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778724655796-cjul.png","2026-05-14T02:10:34.362605+00:00",{"id":68,"slug":69,"title":70,"cover_image":71,"image_url":71,"created_at":72,"category":26},"4104fa5f-d95f-45c5-9032-99416cf0365c","why-ragflow-is-the-right-open-source-rag-engine-to-self-host-zh","為什麼 RAGFlow 是最適合自架的開源 RAG 引擎","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778674262278-1630.png","2026-05-13T12:10:23.762632+00:00",{"id":74,"slug":75,"title":76,"cover_image":77,"image_url":77,"created_at":78,"category":26},"7095f05c-34f5-469f-a044-2525d2010ce9","how-to-add-temporal-rag-in-production-zh","如何在正式環境加入 Temporal RAG","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778667053844-osvs.png","2026-05-13T10:10:30.930982+00:00",{"id":80,"slug":81,"title":82,"cover_image":83,"image_url":83,"created_at":84,"category":26},"10479c95-53c6-4723-9aaa-2fde5fb19ee7","github-agentic-workflows-ai-github-actions-zh","GitHub 把 AI 代理放進 Actions","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778551884342-8io7.png","2026-05-12T02:11:02.069769+00:00",[86,91,96,101,106,111,116,121,126,131],{"id":87,"slug":88,"title":89,"created_at":90},"4ae1e197-1d3d-4233-8733-eafe9cb6438b","claude-now-uses-your-pc-to-finish-tasks-zh","Claude 
開始幫你操作電腦","2026-03-26T07:20:48.457387+00:00",{"id":92,"slug":93,"title":94,"created_at":95},"5bede67f-e21c-413d-9ab8-54a3c3d26227","googles-2026-ai-agent-report-decoded-zh","Google 2026 AI Agent 報告解讀","2026-03-26T11:15:22.651956+00:00",{"id":97,"slug":98,"title":99,"created_at":100},"2987d097-563f-46c7-b76f-b558d8ef7c2b","kimi-k25-review-stronger-still-not-legend-zh","Kimi K2.5 評測：更強，但還不是神作","2026-03-27T07:15:55.277513+00:00",{"id":102,"slug":103,"title":104,"created_at":105},"95c9053b-e3f4-4cb5-aace-5c54f4c9e044","claude-code-controls-mac-desktop-zh","Claude Code 也能操控 Mac 了","2026-03-28T03:01:58.58121+00:00",{"id":107,"slug":108,"title":109,"created_at":110},"dc58e153-e3a8-4c06-9b96-1aa64eabbf5f","cloudflare-100x-faster-ai-agent-sandbox-zh","Cloudflare 的 AI 沙箱跑超快","2026-03-28T03:09:44.142236+00:00",{"id":112,"slug":113,"title":114,"created_at":115},"1c8afc56-253f-47a2-979f-1065ff072f2a","openai-backs-isara-agent-swarm-bet-zh","OpenAI 挺 Isara 的 agent swarm …","2026-03-28T03:15:27.513155+00:00",{"id":117,"slug":118,"title":119,"created_at":120},"7379b422-576e-45df-ad5a-d57a0d9dd467","openai-plan-automated-ai-researcher-zh","OpenAI 想做自動化 AI 研究員","2026-03-28T03:17:42.090548+00:00",{"id":122,"slug":123,"title":124,"created_at":125},"48c9889e-86df-450b-a356-e4a4b7c83c5b","harness-engineering-ai-agent-reliability-2026-zh","駕馭工程：從「馬具」到「作業系統」，AI Agent 可靠性的終極密碼","2026-03-31T06:42:53.556721+00:00",{"id":127,"slug":128,"title":129,"created_at":130},"e41546b8-ba9e-455f-9159-88d4614ad711","openai-codex-plugin-claude-code-zh","OpenAI 把 Codex 放進 Claude Code","2026-04-01T09:21:54.687617+00:00",{"id":132,"slug":133,"title":134,"created_at":135},"96d8e8c8-1edd-475d-9145-b1e7a1b02b65","mcp-explained-from-prompts-to-production-zh","MCP 怎麼把提示詞變工作流","2026-04-01T09:24:39.321274+00:00"]