[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-rag-precision-tuning-hurts-retrieval-accuracy-zh":3,"tags-rag-precision-tuning-hurts-retrieval-accuracy-zh":39,"related-lang-rag-precision-tuning-hurts-retrieval-accuracy-zh":49,"related-posts-rag-precision-tuning-hurts-retrieval-accuracy-zh":53,"series-research-f138a001-0992-4842-9a06-325d30fc6004":90},{"id":4,"title":5,"content":6,"summary":7,"source":8,"source_url":9,"author":10,"image_url":11,"keywords":12,"language":22,"translated_content":10,"views":23,"is_premium":24,"created_at":25,"updated_at":25,"cover_image":11,"published_at":26,"rewrite_status":27,"rewrite_error":10,"rewritten_from_id":28,"slug":29,"category":30,"related_article_id":31,"status":32,"google_indexed_at":33,"x_posted_at":10,"tweet_text":10,"title_rewritten_at":10,"title_original":10,"key_takeaways":34,"topic_cluster_id":10,"embedding":10,"is_canonical_seed":24},"f138a001-0992-4842-9a06-325d30fc6004","RAG 精準調校反而害檢索","\u003Cp data-speakable=\"summary\">Redis 的研究指出，\u003Ca href=\"\u002Ftag\u002Frag\">RAG\u003C\u002Fa> \u003Ca href=\"\u002Fnews\u002Fwhy-ai-coding-agents-need-an-architecture-compiler-zh\">embedding\u003C\u002Fa> 若只追求 precision，檢索準確率最多可能下滑 40%。\u003C\u002Fp>\u003Cp>說真的，這結果很刺耳。很多團隊都想把 RAG 調得更準，結果可能把自己調進坑裡。\u003C\u002Fp>\u003Cp>這篇在講一件很實際的事。\u003Ca href=\"https:\u002F\u002Fredis.io\u002F\" target=\"_blank\" rel=\"noopener\">Redis\u003C\u002Fa> 的研究筆記提醒，精準度拉高，不代表檢索就更好。對 \u003Ca href=\"https:\u002F\u002Fwww.langchain.com\u002F\" target=\"_blank\" rel=\"noopener\">LangChain\u003C\u002Fa> 這類 agentic pipeline 來說，前面檢索一歪，後面整串都會歪。\u003C\u002Fp>\u003Cp>先看數字。研究提到，檢索準確率最多會掉 40%。這不是小誤差，是會直接改變產品行為的那種。\u003C\u002Fp>\u003Ctable>\u003Cthead>\u003Ctr>\u003Cth>指標\u003C\u002Fth>\u003Cth>數值\u003C\u002Fth>\u003Cth>意思\u003C\u002Fth>\u003C\u002Ftr>\u003C\u002Fthead>\u003Ctbody>\u003Ctr>\u003Ctd>檢索準確率下滑\u003C\u002Ftd>\u003Ctd>最高 
40%\u003C\u002Ftd>\u003Ctd>代表調校後，實際找回正確資料的能力可能明顯變差\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>優化目標\u003C\u002Ftd>\u003Ctd>Precision\u003C\u002Ftd>\u003Ctd>會讓相似匹配更嚴，但也可能縮小可檢索範圍\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>風險區\u003C\u002Ftd>\u003Ctd>Agentic pipelines\u003C\u002Ftd>\u003Ctd>代理流程很吃前段檢索品質，前面錯了後面很難救\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>常見後果\u003C\u002Ftd>\u003Ctd>Recall 下降\u003C\u002Ftd>\u003Ctd>真正能回答問題的文件，可能被排除在外\u003C\u002Ftd>\u003C\u002Ftr>\u003C\u002Ftbody>\u003C\u002Ftable>\u003Ch2>為什麼只追 precision 會出事\u003C\u002Fh2>\u003Cp>RAG 的核心，不是找最像的句子而已。它要找的是，對答案真的有用的資料。這兩件事常常不是同一件事。\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778055657010-r5a0.png\" alt=\"RAG 精準調校反而害檢索\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>你把 embedding 調得太偏 precision，模型會變得很挑。它可能更愛抓近似片段，也更容易把邊界案例、補充文件、背景脈絡排掉。對使用者來說，結果就是少了關鍵上下文。\u003C\u002Fp>\u003Cp>講白了，檢索不是單一分數遊戲。你在 benchmark 上拿到漂亮數字，不代表真實工作流會更順。客服問答、內部知識庫、研究助理，最怕的就是「看起來像對的」，但真正答案沒被撈出來。\u003C\u002Fp>\u003Cul>\u003Cli>Precision 高，不代表 Recall 也高。\u003C\u002Fli>\u003Cli>候選文件變少，漏掉答案的機率就上升。\u003C\u002Fli>\u003Cli>Agent 先吃到爛上下文，後面工具全跟著失真。\u003C\u002Fli>\u003Cli>企業查詢很雜，最有用的文件常常不像問題本身。\u003C\u002Fli>\u003C\u002Ful>\u003Cp>這也是我覺得最麻煩的地方。團隊常常先看一個數字，然後就覺得自己做對了。可是在真實場景，錯的不是模型分數，是整個產品體驗。\u003C\u002Fp>\u003Ch2>Redis 在提醒什麼\u003C\u002Fh2>\u003Cp>\u003Ca href=\"https:\u002F\u002Fredis.io\u002F\" target=\"_blank\" rel=\"noopener\">Redis\u003C\u002Fa> 這幾年一直往 \u003Ca href=\"\u002Ftag\u002Fai-\">AI 基礎設施\u003C\u002Fa>走。它做 vector search、cache、\u003Ca href=\"\u002Ftag\u002Fagent\">agent\u003C\u002Fa> memory，都很貼近實戰。這次的提醒很直白：embedding 層的 precision 變好，不等於 production 裡的 retrieval 變好。\u003C\u002Fp>\u003Cp>這句話對做 agent 的團隊特別重要。因為 agent 的第一步，通常就是先抓資料。前面抓到的 context 如果太窄，或太偏近似樣本，agent 就會帶著偏差做決策。\u003C\u002Fp>\u003Cp>你可能會想問，那是不是把 precision 放掉就好？也不是。問題不是 precision 本身，而是你把它當成唯一目標。RAG 
需要的是平衡，不是單點最優。\u003C\u002Fp>\u003Cblockquote>“There is no free lunch in machine learning,” said \u003Ca href=\"https:\u002F\u002Ftwitter.com\u002Fkarpathy\" target=\"_blank\" rel=\"noopener\">Andrej Karpathy\u003C\u002Fa>.\u003C\u002Fblockquote>\u003Cp>這句老話放在這裡很貼。你在一邊拿到好看分數，通常要在另一邊付代價。RAG 裡常見的代價，就是 Recall、grounding，還有可用上下文的廣度。\u003C\u002Fp>\u003Cp>如果你的產品是知識助理，這種代價會很痛。使用者不會在乎你的 embedding loss 怎麼降，他只會在乎「為什麼明明有文件，系統卻找不到」。\u003C\u002Fp>\u003Ch2>跟常見 RAG 做法比起來差在哪\u003C\u002Fh2>\u003Cp>多數團隊做 RAG，會先看 chunking、reranking、embedding model，再看 prompt。這個順序沒錯，但很多人會誤以為只要把檢索模型調更準，整體就會更好。\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778055670852-qn3f.png\" alt=\"RAG 精準調校反而害檢索\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>Redis 這份研究的重點，就是打臉這種直覺。你在實驗室裡把一個指標拉高，不代表線上體驗會一起上去。反過來說，還可能把整條鏈弄壞。\u003C\u002Fp>\u003Cp>我整理成幾個常見路線，差異很明顯：\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong>Precision-first：\u003C\u002Fstrong> 匹配更緊，候選更少，漏掉有用資料的風險更高。\u003C\u002Fli>\u003Cli>\u003Cstrong>Recall-aware：\u003C\u002Fstrong> 找回更多上下文，但後面 rerank 和 filter 要更認真。\u003C\u002Fli>\u003Cli>\u003Cstrong>Production-first：\u003C\u002Fstrong> 看真實 query、人工抽查、再搭配線上 A\u002FB test。\u003C\u002Fli>\u003Cli>\u003Cstrong>Agent-first：\u003C\u002Fstrong> 先看檢索是否能支撐任務，不只看 similarity 分數。\u003C\u002Fli>\u003C\u002Ful>\u003Cp>這裡還有一個很現實的問題。很多團隊拿 synthetic benchmark 當真相，但真實使用者的問題很髒。有人會打半句話，有人會混中英，有人會問很冷門的例外情況。\u003C\u002Fp>\u003Cp>所以，真正該比的不是誰的分數漂亮，而是誰比較少漏答案。這一點，\u003Ca href=\"https:\u002F\u002Fwww.pinecone.io\u002F\" target=\"_blank\" rel=\"noopener\">Pinecone\u003C\u002Fa>、\u003Ca href=\"https:\u002F\u002Fweaviate.io\u002F\" target=\"_blank\" rel=\"noopener\">Weaviate\u003C\u002Fa> 這類向量資料庫幫得上忙，但它們救不了錯的優化方向。\u003C\u002Fp>\u003Ch2>這件事放到產業脈絡裡看\u003C\u002Fh2>\u003Cp>RAG 現在已經不是 demo 技術了。很多公司拿它做客服、法務\u003Ca 
href=\"\u002Fnews\u002Fllm-overview-manipulation-biases-zh\">搜尋\u003C\u002Fa>、銷售知識庫，甚至內部 agent。這些場景共通點很簡單：資料多，問題雜，錯一次就很煩。\u003C\u002Fp>\u003Cp>也因為這樣，檢索層的微小變動，會比一般人想的更敏感。你把模型調得太保守，系統就只會撈到最像的文件。你把模型調得太寬，又會把垃圾上下文塞進去。\u003C\u002Fp>\u003Cp>這就是 RAG 的老問題。不是找不到模型，而是找不到剛剛好的平衡點。很多時候，chunk 策略、reranker、metadata filter，甚至資料清洗，影響都比單純換 embedding 還大。\u003C\u002Fp>\u003Cp>再看 agentic pipeline，就更明顯了。\u003Ca href=\"\u002Fnews\u002Fagentic-ai-moving-past-rag-knowledge-layer-zh\">Agent\u003C\u002Fa> 不是只回答一句話而已。它可能要查資料、比對條件、再呼叫 \u003Ca href=\"\u002Ftag\u002Fapi\">API\u003C\u002Fa>。前面檢索若偏掉，後面每一步都會跟著偏。\u003C\u002Fp>\u003Cp>所以我會建議團隊把測試重點換掉。不要只問「precision 有沒有變高」，而是問「正確文件有沒有更常被找回」。這兩題差很多。\u003C\u002Fp>\u003Ch2>團隊接下來該怎麼做\u003C\u002Fh2>\u003Cp>第一步，別只看單一指標。Precision、Recall、MRR、nDCG，最好一起看。只盯一個數字，很容易把系統調歪。\u003C\u002Fp>\u003Cp>第二步，拿真實 query 測。不要只用乾淨的測試集。要把使用者真的會丟的問題拿進來，包含模糊問法、短句、錯字、混合語言。\u003C\u002Fp>\u003Cp>第三步，檢查下游任務。你的 RAG 是拿來回答問題，還是拿來餵 agent 做決策？如果是後者，檢索品質的容錯率更低。\u003C\u002Fp>\u003Cp>第四步，別迷信 embedding。chunking、reranking、metadata、query rewrite，常常比你想像中更有用。很多時候，修資料比修模型便宜，也更快。\u003C\u002Fp>\u003Cp>如果你現在在調 RAG，我會直接問一句：你要的是更像，還是更對？兩者不是同一件事。搞清楚這點，才不會把系統越調越窄。\u003C\u002Fp>\u003Cp>我的預測很直接。接下來一年，更多團隊會發現，RAG 的瓶頸不在模型多大，而在檢索策略有沒有配對好任務。先把 Recall、真實任務成功率、agent 完成率一起納入，再談 precision，會比較實在。\u003C\u002Fp>","Redis 研究指出，RAG embedding 若只追求 precision，檢索準確率可能掉 40%，還會拖累 agentic pipeline。","venturebeat.com","https:\u002F\u002Fventurebeat.com\u002Fdata\u002Frag-precision-tuning-can-quietly-cut-retrieval-accuracy-by-40-putting-agentic-pipelines-at-risk",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778055657010-r5a0.png",[13,14,15,16,17,18,19,20,21],"RAG","retrieval accuracy","precision tuning","embedding","agentic pipelines","Redis","LangChain","Recall","vector 
search","zh",1,false,"2026-05-06T08:20:36.321486+00:00","2026-05-06T08:20:36.173+00:00","done","7f3cdba3-c5aa-4fcd-97cd-62a417837173","rag-precision-tuning-hurts-retrieval-accuracy-zh","research","ea29007f-e989-470f-8968-68b7111caa88","published","2026-05-06T09:00:20.111+00:00",[35,36,37,38],"只追 precision，可能讓 RAG 檢索準確率最多掉 40%。","RAG 的目標是找對資料，不是只找最像的資料。","Agentic pipeline 對前段檢索很敏感，錯一次會一路傳下去。","實務上要同時看 precision、recall 和真實 query 表現。",[40,42,44,46,47],{"name":14,"slug":41},"retrieval-accuracy",{"name":13,"slug":43},"rag",{"name":15,"slug":45},"precision-tuning",{"name":16,"slug":16},{"name":17,"slug":48},"agentic-pipelines",{"id":31,"slug":50,"title":51,"language":52},"rag-precision-tuning-hurts-retrieval-accuracy-en","RAG precision tuning can hurt retrieval accuracy","en",[54,60,66,72,78,84],{"id":55,"slug":56,"title":57,"cover_image":58,"image_url":58,"created_at":59,"category":30},"667b72b6-e821-4d68-80a1-e03340bc85f1","turboquant-seo-shift-small-sites-zh","TurboQuant 與小站 SEO 變化","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778840440690-kcw9.png","2026-05-15T10:20:27.319472+00:00",{"id":61,"slug":62,"title":63,"cover_image":64,"image_url":64,"created_at":65,"category":30},"381fb6c6-6da7-4444-831f-8c5eed8d685c","turboquant-vllm-comparison-fp8-kv-cache-zh","TurboQuant 與 FP8 實測結果","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778839867551-4v9g.png","2026-05-15T10:10:36.034569+00:00",{"id":67,"slug":68,"title":69,"cover_image":70,"image_url":70,"created_at":71,"category":30},"c15f45ee-a548-4dbf-8152-91de159c1a11","llmbda-calculus-agent-safety-rules-zh","LLMbda 演算替 AI 
代理人立安全規則","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778825503412-mlbf.png","2026-05-15T06:10:34.832664+00:00",{"id":73,"slug":74,"title":75,"cover_image":76,"image_url":76,"created_at":77,"category":30},"0c02225c-d6ff-44f8-bc92-884c8921c4a3","low-complexity-beamspace-denoiser-mmwave-mimo-zh","更簡單的毫米波波束域去噪器","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778814650361-xtc2.png","2026-05-15T03:10:30.06639+00:00",{"id":79,"slug":80,"title":81,"cover_image":82,"image_url":82,"created_at":83,"category":30},"9d27f967-62cc-433f-8cdb-9300937ade13","ai-benchmark-wins-cyber-scare-defenders-zh","為什麼 AI 基準賽在資安領域的勝利，應該讓防守方警醒","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778807450006-nofx.png","2026-05-15T01:10:29.379041+00:00",{"id":85,"slug":86,"title":87,"cover_image":88,"image_url":88,"created_at":89,"category":30},"bc402dc6-5da6-46fc-9d66-d09cb215f72b","why-linux-security-needs-patch-wave-mindset-zh","為什麼 Linux 安全需要「補丁浪潮」思維","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778741449813-s2wn.png","2026-05-14T06:50:24.052583+00:00",[91,96,101,106,111,116,121,126,131,136],{"id":92,"slug":93,"title":94,"created_at":95},"f18dbadb-8c59-4723-84a4-6ad22746c77a","deepmind-bets-on-continuous-learning-ai-2026-zh","DeepMind 押注 2026 連續學習 AI","2026-03-26T08:16:02.367355+00:00",{"id":97,"slug":98,"title":99,"created_at":100},"f4a106cb-02a6-4508-8f39-9720a0a93cee","ml-papers-of-the-week-github-research-desk-zh","每週 ML 論文清單，為何紅到 GitHub","2026-03-27T01:11:39.284175+00:00",{"id":102,"slug":103,"title":104,"created_at":105},"c4f807ca-4e5f-47f1-a48c-961cf3fc44dc","ai-ml-conferences-to-watch-in-2026-zh","2026 AI 
研討會投稿時程整理","2026-03-27T01:51:53.874432+00:00",{"id":107,"slug":108,"title":109,"created_at":110},"9f50561b-aebd-46ba-94a8-363198aa7091","openclaw-agents-manipulated-self-sabotage-zh","OpenClaw Agent 會自己搞砸自己","2026-03-28T03:03:18.786425+00:00",{"id":112,"slug":113,"title":114,"created_at":115},"11f22e92-7066-4978-a544-31f5f2156ec6","vega-learning-to-drive-with-natural-language-instructions-zh","Vega：使用自然語言指示進行自駕車控制","2026-03-28T14:54:04.847912+00:00",{"id":117,"slug":118,"title":119,"created_at":120},"a4c7cfec-8d0e-4fec-93cf-1b9699a530b8","drive-my-way-en-zh","Drive My Way：個性化自駕車風格的實現","2026-03-28T14:54:26.207495+00:00",{"id":122,"slug":123,"title":124,"created_at":125},"dec02f89-fd39-41ba-8e4d-11ede93a536d","training-knowledge-bases-with-writeback-rag-zh","用 WriteBack-RAG 強化知識庫提升檢索效能","2026-03-28T14:54:45.775606+00:00",{"id":127,"slug":128,"title":129,"created_at":130},"3886be5c-a137-40cc-b9e2-0bf18430c002","packforcing-efficient-long-video-generation-method-zh","PackForcing：短影片訓練也能生成長影片","2026-03-28T14:55:02.688141+00:00",{"id":132,"slug":133,"title":134,"created_at":135},"72b90667-d930-4cc9-8ced-aaa0f8968d44","pixelsmile-toward-fine-grained-facial-expression-editing-zh","PixelSmile：提升精細臉部表情編輯的新方法","2026-03-28T14:55:20.678181+00:00",{"id":137,"slug":138,"title":139,"created_at":140},"cf046742-efb2-4753-aef9-caed5da5e32e","adaptive-block-scaled-data-types-zh","IF4：神經網路量化的聰明選擇","2026-03-31T06:00:36.990273+00:00"]