[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-weak-rewards-persistent-llm-user-models-zh":3,"article-related-weak-rewards-persistent-llm-user-models-zh":36,"series-research-492aa1ec-02ce-491e-ad03-ae804f261f87":88},{"id":4,"title":5,"content":6,"summary":7,"source":8,"source_url":9,"author":10,"image_url":11,"keywords":12,"language":18,"translated_content":10,"views":19,"is_premium":20,"created_at":21,"updated_at":21,"cover_image":11,"published_at":22,"rewrite_status":23,"rewrite_error":10,"rewritten_from_id":24,"slug":25,"category":26,"related_article_id":27,"status":28,"google_indexed_at":29,"x_posted_at":10,"tweet_text":10,"title_rewritten_at":10,"title_original":10,"key_takeaways":30,"topic_cluster_id":34,"embedding":35,"is_canonical_seed":20},"492aa1ec-02ce-491e-ad03-ae804f261f87","弱回饋讓 LLM 記住偏好","\u003Cp data-speakable=\"summary\">這篇論文主張，可從檢索增強互動中抽出弱回饋，來建立可持續的使用者偏好模型。\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong>研究機構\u003C\u002Fstrong>：arXiv 摘要未明確標註\u003C\u002Fli>\u003Cli>\u003Cstrong>核心數據\u003C\u002Fstrong>：摘要無公開 benchmark 數字\u003C\u002Fli>\u003Cli>\u003Cstrong>突破點\u003C\u002Fstrong>：弱回饋做偏好建模\u003C\u002Fli>\u003C\u002Ful>\u003Cp>大型語言模型現在很會聊天，但很多產品還是有一個老問題：它記不住你是誰、你喜歡什麼、你常用哪種說法。這篇論文就是在處理這個痛點。它不是要把模型變成全知全能，而是想讓聊天助理有一個更持久的使用者偏好模型。\u003C\u002Fp>\u003Cp>這件事看起來小，實際上很關鍵。因為一旦助理每次都忘記前文，使用者就得反覆重講需求。對產品體驗來說，這會直接破壞連續感。對工程團隊來說，這也代表你要花更多成本去做提示詞補丁、額外記憶層，或人工標註流程。\u003C\u002Fp>\u003Cp>這篇摘要提供的方向很明確：不要等完美標籤，\u003Ca href=\"\u002Fnews\u002Futah-jazz-2026-roster-injury-report-stats-zh\">先從\u003C\u002Fa>真實互動裡找可用的訓練訊號。它的核心做法，是把檢索增強互動中的弱回饋拿來當偏好學習的依據。換句話說，作者想從平常的對話行為裡，推回使用者到底偏好什麼。\u003C\u002Fp>\u003Ch2>這篇在解什麼痛點\u003C\u002Fh2>\u003Cp>摘要直接點出問題：\u003Ca href=\"\u002Ftag\u002Fllm\">LLM\u003C\u002Fa> 越來越常被拿來做個人助理，但多數系統沒有持續性的使用者模型。模型可以在單輪回答得不錯，卻很難把「這位使用者偏好什麼」延續到下一次對話。\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779084838002-5od2.png\" alt=\"弱回饋讓 LLM 記住偏好\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>這不是單純的模型能力問題，而是產品層級的記憶問題。使用者如果每次都要重新說明偏好，像是語氣、格式、限制條件或工作習慣，助理就很難真的像「個人助理」。\u003C\u002Fp>\u003Cp>所以這篇論文的目標其實很務實。它不是泛泛地說要讓模型更聰明，而是聚焦在偏好建模。這個切法很重要，因為一旦你能把偏好顯式存下來，後面不管是檢索、排序、回覆選擇，還是後續對話行為，都有機會拿這份狀態來調整。\u003C\u002Fp>\u003Ch2>方法的重點在哪裡\u003C\u002Fh2>\u003Cp>這篇摘要最關鍵的詞是「weak rewards from retrieval-augmented interaction」。這表示作者不依賴乾淨、完整、人工標好的偏好資料，而是想從互動過程中抽出比較弱、比較吵，但仍然有用的回饋訊號。\u003C\u002Fp>\u003Cp>白話一點說，檢索增強互動就是助理在對話時，不只自己生成文字，還會先從外部資訊源抓一些內容進來。作者想觀察使用者在這個流程裡怎麼反應，再把這些反應轉成偏好訊號。摘要沒有把完整管線講開，所以我們不能補成某種特定架構；只能確定它用的是弱回饋，而且回饋來源和檢索增強互動有關。\u003C\u002Fp>\u003Cp>這種思路的吸引力很直接。真實產品裡，使用者很少會乖乖給你標籤，但他們會接受、忽略、修改或拒絕系統給的內容。這些行為雖然不乾淨，卻可能比人工問卷更接近真實偏好。\u003C\u002Fp>\u003Cp>對開發者來說，這代表訓練訊號不一定要來自昂貴的標註流程。只要產品本身有檢索、有互動，就有機會把日常使用痕跡變成學習資料。當然，前提是你能把這些訊號整理得夠穩定。\u003C\u002Fp>\u003Ch2>論文證明了什麼\u003C\u002Fh2>\u003Cp>就目前提供的摘要內容來看，這篇沒有公開完整 benchmark 細節。沒有數字，沒有資料集名稱，也沒有明確的評估指標，所以不能直接說它提升了多少準確率、偏好預測分數或延遲表現。\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779084840561-g5pa.png\" alt=\"弱回饋讓 LLM 記住偏好\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>但摘要還是證明了一件事：作者把問題定義得很清楚，而且提出了一條可行的訓練方向。也就是說，持續性的使用者建模可以透過檢索增強互動中的弱回饋來做，而不一定非得依賴強標註。\u003C\u002Fp>\u003Cp>這種層級的貢獻比較像研究方向的打開，\u003Ca href=\"\u002Fnews\u002Fwei-shi-mo-minimax-geng-xiang-xiao-fei-ji-ai-gong-si-er-bu-s-zh\">而不是\u003C\u002Fa>一個已經被數據完全驗證的結論。對讀者來說，這很重要，因為它提醒我們：摘要目前能支持的是方法論上的可行性，不是性能上的最終勝利。\u003C\u002Fp>\u003Cp>如果你習慣看論文先找 benchmark，這篇的資訊密度就沒那麼高。它更像是一個問題設定加上一個方法主張。真正的效果、泛化能力、以及是否能跨場景成立，還得看完整論文的實驗章節。\u003C\u002Fp>\u003Ch2>對開發者有什麼實際影響\u003C\u002Fh2>\u003Cp>如果你在做聊天助理、\u003Ca href=\"\u002Ftag\u002Fcopilot\">copilot\u003C\u002Fa>，或任何會重複使用的對話產品，偏好持久化其實是高槓桿功能。它能減少使用者重複輸入，提升連續性，也能讓系統看起來更懂人，而不是每次都像第一次見面。\u003C\u002Fp>\u003Cp>這篇論文真正值得注意的地方，是它把「個人化」拉回到可部署的資料問題。強監督很貴，標註很慢，但如果弱回饋能從自然互動中長出來，團隊就有機會用產品流量本身來累積個人化能力，而不是另外開一條人工標註管線。\u003C\u002Fp>\u003Cp>這對\u003Ca href=\"\u002Fnews\u002Fmarlin-greener-llm-inference-datacenters-zh\">資源\u003C\u002Fa>有限的團隊尤其有吸引力。因為你通常不會有足夠的人力去問每個使用者完整偏好，也不可能每次對話都做精細標註。弱回饋的價值就在這裡：它不完美，但可能夠用，而且更接近真實世界的資料流。\u003C\u002Fp>\u003Cp>不過，這種方法也不是沒有代價。弱訊號通常比較吵，還會受檢索品質影響。如果檢索層抓錯內容，後面學到的偏好就可能跟著歪掉。也就是說，檢索不是配角，而是整個偏好建模流程的一部分。\u003C\u002Fp>\u003Ch2>限制和還沒回答的問題\u003C\u002Fh2>\u003Cp>最大的限制很直接：摘要沒有把方法細節講完整。你看不到弱回饋怎麼定義、檢索怎麼接、模型怎麼訓練，也看不到實驗設計。這代表目前沒辦法嚴格評估它的效果。\u003C\u002Fp>\u003Cp>另一個問題是偏好會變。使用者今天喜歡簡短，明天可能想要完整解釋；今天想要正式，明天可能只想要白話。摘要沒有說明系統怎麼處理偏好漂移，也沒有交代長短期訊號衝突時怎麼辦。\u003C\u002Fp>\u003Cp>還有一個實作上的風險，是持久化偏好可能會把舊假設鎖太久。助理如果太相信過去，反而可能忽略現在。這在個人化系統裡很常見，也正是持續性記憶最難的地方。\u003C\u002Fp>\u003Cp>所以，這篇摘要比較像是在提出一個有潛力的方向，而不是交出一個已經封裝好的解法。它告訴你「可以從哪裡拿訊號」，但沒有回答「訊號到底有多穩」、「模型能不能泛化」、「使用者能不能控制記憶」這些更接近產品落地的問題。\u003C\u002Fp>\u003Ch2>總結來看\u003C\u002Fh2>\u003Cp>這篇論文的核心主張很清楚：聊天助理要記住使用者偏好，不一定要靠強標註；可以試著從檢索增強互動中抽出弱回饋，來建立持續性的使用者模型。\u003C\u002Fp>\u003Cp>對研究來說，這是把個人化問題往可取得資料的方向推了一步。對開發者來說，這是提醒你，助理的記憶不一定要等到完美資料才做，現場互動本身就可能是訓練來源。\u003C\u002Fp>\u003Cp>但就這份摘要而言，最誠實的結論還是：它提出了方法方向，沒有公開完整 benchmark 數字。要判斷這招到底有多有效，還需要看完整論文的實驗與實作細節。\u003C\u002Fp>\u003Cul>\u003Cli>這篇把「記住使用者偏好」當成核心問題。\u003C\u002Fli>\u003Cli>它主打從檢索增強互動抽弱回饋，而不是靠強標註。\u003C\u002Fli>\u003Cli>目前摘要沒有公開 benchmark 數字，效果還不能下定論。\u003C\u002Fli>\u003C\u002Ful>","這篇論文主張，可從檢索增強互動中抽出弱回饋，來建立可持續的使用者偏好模型。","arxiv.org","https:\u002F\u002Farxiv.org\u002Fabs\u002F2603.20939",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779084838002-5od2.png",[13,14,15,16,17],"LLM user modeling","weak rewards","retrieval-augmented interaction","personalization","preference modeling","zh",1,false,"2026-05-18T06:13:32.906335+00:00","2026-05-18T06:13:32.76+00:00","done","a55b8fbf-95c3-478a-b2d9-ba5f09ede03d","weak-rewards-persistent-llm-user-models-zh","research","3ab37b54-e52b-4118-a61b-b594973b3aa4","published","2026-05-18T09:00:28.493+00:00",[31,32,33],"摘要主張可用檢索增強互動中的弱回饋，來做持續性的使用者偏好建模。","這篇的重點是方法方向，不是 benchmark 成績；摘要沒有公開完整數字。","對開發者來說，價值在於把個人化資料來源拉回真實互動，而非昂貴標註。","0c35a120-52fc-41fc-afa3-d404eb934158","[-0.013032443,0.024254575,0.019920962,-0.08186606,-0.010533319,-0.031678535,-0.026425991,-0.017500062,0.0050676325,0.0041586505,0.018516442,-0.024339436,0.0016837795,-0.008922671,0.13418463,0.03762445,-0.0018335276,0.008759015,0.010823287,-0.035915658,0.009722131,-2.6543253e-06,-0.018535024,-0.017273393,0.011228401,0.013075411,0.02350371,0.02267108,0.036035825,-0.013169641,0.0023675617,0.02898771,0.026197635,0.049890336,0.014625993,0.025266489,0.020742515,-0.01864849,0.018318124,0.017413875,0.0035835395,-0.0015634667,0.022654083,4.5957964e-05,-0.00862007,0.019410942,0.006569194,-0.033752885,-0.008912648,0.015109509,0.0036195214,0.015795534,-0.0058716717,-0.15598944,0.017400872,0.010513985,0.007952659,-0.0032976936,0.010103197,0.018745609,-0.038773317,0.027201712,-0.019959228,-0.0072579645,0.016002854,-0.01877107,0.036396783,0.009815723,0.008141046,-0.00038509784,0.016668895,-0.010479796,0.028080463,-0.032706417,0.008712828,-0.05305038,0.0063975127,0.011141576,0.004336053,0.016239932,0.029026002,-0.013697937,-0.018037975,0.017398113,0.0030752255,0.016313283,0.012210097,0.0044057043,0.03194102,0.016425146,-0.006476395,-0.0028461015,0.016185287,0.01992812,-0.0013938318,-0.0042191227,-0.014669149,0.00949237,0.009595441,-0.027006993,-0.029859029,-0.021330517,-0.011119075,0.0024911528,0.01731107,0.008300034,0.010907212,-0.028168047,-0.005072345,0.03438589,-0.01988845,0.0041046357,0.0033313946,-0.017696941,-0.008577922,-0.13087983,0.009954272,0.015203533,-0.018050652,-0.0064601884,0.007916617,0.004275845,-0.004810331,0.029612431,0.0003604748,0.0077648633,0.008557439,-0.031819653,-0.0056573492,0.015297349,-0.029359406,0.018806536,0.018372219,-0.015780205,0.00640186,0.028589144,0.0085903155,-0.021272933,-0.012992702,-0.009027976,-0.013640915,0.03161807,-0.0031236142,-0.012939334,-0.014138979,0.0038031281,-0.030426282,0.0027210668,0.01773598,8.994363e-05,0.028175313,0.018361581,-0.010287642,-0.0040268567,0.015057221,-0.007844327,0.004239892,0.014552544,0.0055774054,0.039747372,0.007959305,-0.001691733,0.009815595,0.004541779,-0.012874188,0.022345627,-0.015352402,-0.015856303,-0.0052568647,0.027395848,-0.008247728,-0.009612949,0.0006802464,-0.009597345,0.030019758,0.012570313,0.0041163373,-0.013918723,0.0017532398,-0.012818163,0.030375296,-0.010572711,-0.005865714,0.016184941,-0.013818797,0.00995529,0.004298907,0.010050188,0.016958443,0.022003355,-0.03813525,-0.002155903,0.021480061,-0.009801675,0.0023014129,-0.010525501,-0.0053517995,0.009169652,-0.0030688455,0.020588486,-0.002526334,0.0054678135,0.0034052697,-0.011075612,0.04444214,-0.0039558727,-0.009061133,0.0047122296,0.033291753,0.014651405,-0.012049678,0.014347291,-0.0010893419,-0.016358245,0.008748499,-0.01079453,-0.0004888124,-0.03323648,-0.00018337947,-0.0131763695,0.031255867,-0.009386349,0.036470342,0.020671202,-0.032817848,0.0011179309,-0.0027516526,-0.031925455,-0.01642099,0.015350895,-0.019874021,0.01708268,0.004642262,-0.0063199587,0.006349842,-0.00043108614,-0.008682491,0.017651534,0.011597889,0.012859101,0.013810073,-0.013846036,-0.0091046505,-0.004385082,0.0380402,-0.01620897,0.020814337,-0.007335244,0.00077784405,-0.0038162011,-0.007435945,-0.035926543,-0.013948113,-0.0009948979,-0.0010407622,-0.005615679,-0.01505883,0.0066906996,0.015005273,0.0015971842,0.00459381,-0.005226043,0.005025893,-0.009327192,0.004990014,0.011615279,0.03132169,-0.006820989,-0.0023699491,0.0074549806,-0.016222864,0.0006731549,0.014507012,-0.018588012,0.0029414825,-0.011465004,-0.057877883,0.023800712,0.0068507595,-0.0009603996,-0.012665839,0.02877314,0.032739393,0.016078984,-0.0060727797,0.03362354,-0.030080216,-0.01957714,-0.008337573,0.0032676705,0.0024672032,0.02595885,-0.003946482,0.002443046,-0.01065739,-0.043625567,0.023243297,0.0041018375,-0.012464455,0.008399394,0.0018347163,0.014748907,-0.0059416033,0.036308467,-0.015798172,-0.0090828035,-0.00022733548,0.033801693,0.0017053599,0.0015251751,0.007292678,-0.021819059,-0.008929901,-0.015017964,-0.007855282,-0.019152725,0.019046502,-0.024190385,0.015946813,-0.014714473,0.0054858527,0.0026999388,0.003139086,0.010147523,-0.008187453,0.00916649,-0.00812094,0.001949565,0.023927778,-0.011107614,-0.0041782395,0.031132558,0.034166493,-0.01990021,0.0010341497,-0.028422868,-0.0061543053,0.003054712,-0.007208615,0.01531393,0.0009512353,0.010516773,-0.0038419412,0.0067062415,-0.010942239,0.030799894,-0.013806193,0.024653764,0.005564827,-0.05435349,-0.029441524,0.03567519,0.035256486,-0.018006554,-0.04291531,0.013204259,-0.00069769996,-0.012212445,0.033772103,0.006208615,0.0019600303,0.0004664682,-0.0053964155,-0.005904342,0.021547709,-0.051223896,0.0039287517,0.03708911,-0.0044754916,0.025076834,0.013220199,-0.0068059927,-0.0092313485,0.0021601564,0.009446559,-0.0018401884,-0.003940647,-0.012270842,-0.026196845,0.014191408,-0.018448502,0.039623734,-0.0074691856,-0.02211005,0.018420773,0.0023652762,0.015408305,-0.009034061,0.012552611,-0.002053464,0.0024139953,0.01713334,-0.012649831,0.0056002988,-0.0056751133,0.002343559,0.014448086,0.010155385,0.0022646575,0.0077623534,-0.0039718733,0.029848913,0.007100125,-0.0033614358,0.01700458,0.003331129,-0.0021711043,0.007673775,-0.002588829,-0.021442413,-0.011066629,-0.0030645216,0.012485741,-0.0073695243,0.009847449,0.029349059,-0.027752617,0.017707983,0.013013136,-0.01096749,0.0067818826,-0.0012366425,-0.03152207,0.0009098101,-0.01002965,-0.032429166,-0.0151098315,-0.029657796,0.008070021,-0.014498239,-0.01259976,-0.0068142535,-0.03148807,-0.035959903,-0.008277972,-0.027114237,-0.030629896,-0.0050632246,-0.014487185,-0.020221103,0.011086594,0.009871844,-0.014917418,0.0033720047,0.021867305,-0.019053947,-0.0068446775,-0.004476709,-0.03080321,0.009784823,0.038317833,0.024053365,-0.0062910034,0.0032957,-0.0031251991,0.0003291527,-0.007074459,0.029040564,0.00093198783,-0.015107352,-0.016758762,-0.012089227,0.005922901,0.016458469,0.011339117,-0.0104195615,-0.025758455,-0.015795564,-0.028913002,0.047070824,0.00023925636,0.011553096,0.02871142,0.0030407605,-0.02157708,0.008436771,-0.006838117,0.018176481,-0.0066097584,0.017950535,0.024446107,-0.027058668,0.0093521355,-0.012695172,-0.008992056,0.016819358,0.0024386768,0.013124565,0.0012826308,0.028154394,0.013328393,0.025988795,0.002588359,-0.0032296402,-0.026334573,-0.0011188163,-0.007816892,-0.025026824,0.008620769,-0.009648101,0.012739384,0.007860443,0.006965457,-0.0077783703,0.017828073,-0.011528182,-0.013591076,0.009639,0.016231906,-0.0006224231,-0.0139030665,0.010234574,0.004581809,0.0026398485,-0.00315194,-0.0053313063,0.013457227,-0.0076378835,-0.0020454545,-0.017418854,-0.0044497154,-0.007297656,-0.016451651,0.021102712,-0.018105328,-0.032517746,0.01085567,0.021279722,0.02295441,0.03145512,-0.026194964,0.015179748,0.004330935,-0.012356963,0.005382802,-0.0016377354,0.0013587987,-0.012461139,0.009543913,0.036388565,-0.03005852,0.003112092,0.0024302364,-0.032271165,0.029890116,-0.09674478,0.020274617,-0.0004099081,-0.019788329,-0.0054934197,0.010820391,0.0040539373,-0.023685236,0.0015551222,0.0062642917,-0.018110896,-0.01275001,-0.0002296924,0.026523579,-0.0016978948,0.0032974158,-0.035190795,-0.0032658458,0.0227652,-0.0023060977,0.004790135,0.008658671,0.006260618,0.03443912,-0.012002056,-0.019661635,0.030925779,0.0033724979,0.03153459,0.006709194,-0.03061473,-0.0317729,-0.009412298,0.017462773,0.010205458,0.014210703,0.031933192,-0.0075477445,0.0077689467,-0.014257351,0.0063170292,0.013053999,-0.01041904,-0.051202938,0.0150085315,0.0028687215,-0.0066501293,0.012445082,0.005139888,0.00789354,-0.046320822,-0.012980751,-0.004062609,-0.0033735088,-0.033222783,-0.00065323984,-0.030518413,-0.00035403555,-0.023680454,0.023318054,-0.009424661,-0.021989217,-0.021650353,0.022388995,-0.0026072767,0.0025940086,0.0133275045,0.031290926,-0.00539241,0.0153352255,-0.0021296698,-0.000806539,0.023110764,0.0102016665,-0.010156855,-0.00199067,-0.004703786,0.009369815,-0.0035655526,0.021790693,-0.005137914,-0.0027491245,-0.076986745,-0.019476494,-0.031608388,-0.01921761,0.029330924,-0.012283798,0.010845018,-0.020737443,0.00014535827,-0.020685008,-0.0052926885,-0.010401954,-0.0078024776,-0.03982231,0.018003806,-0.012665736,-0.0004889379,-0.015717704,0.026466783,-0.037504554,-0.0048303725,-0.013438587,0.029756794,0.0011942546,-0.017357523,-0.009224849,-0.010414974,0.014948353,0.0059809396,0.00021575617,-0.0028922076,-0.12645173,0.019609699,0.014880421,-0.011070978,-0.0056158043,0.0010522738,-0.0051508565,0.008999819,0.017352382,0.019021228,-0.009085832,-0.032628667,-0.023787014,-0.025716748,-0.001930073,0.102736875,-0.008884437,0.0045606685,-0.027947724,-0.038245175,-0.010491476,-0.037547782,-0.014228116,0.013979681,0.0026025134,0.004423858,0.024673345,-0.01906496,-0.025165731,0.03018928,0.006193177,0.0065157115,0.009016498,-0.013784897,-0.013841187,-0.015188084,-0.008513989,-0.037786614,-0.017219577,0.016278438,0.018915635,0.011243354,0.011437298,-0.0022567855,-0.02783415,0.0076546604,-0.007495159,-0.009546161,-0.004263053,-0.013873506,-0.014895824,-0.06673276,0.015639972,0.0028927918,-0.003994283,0.0043396116,-0.018613968,0.03472475,0.02661154,0.015405167,0.019350156,0.0051714866,-0.016486324,0.037155587,-0.0120262755,0.0075392583,0.016942864,0.018866608,-0.0059933932,0.011246668,-0.018517943,-0.0027624364,-0.007244075,-0.015611846,-0.008794827,-0.018438125,0.025587006,0.006532272,0.031803273,0.013645409,0.0052975775,-0.009247522,-0.0016108576,0.012464423,0.00046079516,-0.008261124,-0.012088656,0.008276706,-0.002425704,-0.0015223855,0.013359295,0.010787596,0.016755862,0.0039933813,0.0012143342,0.020913186,-0.001632026,0.0031106486,-0.0050000907,-0.0058188187,0.0026861655,-0.026864767,0.0012762067,-0.019064363,0.010597138,0.021574123,0.007806121,0.043909803,0.033683497,-0.0015786088]",{"tags":37,"relatedLang":47,"relatedPosts":51},[38,40,42,43,45],{"name":15,"slug":39},"retrieval-augmented-interaction",{"name":17,"slug":41},"preference-modeling",{"name":16,"slug":16},{"name":14,"slug":44},"weak-rewards",{"name":13,"slug":46},"llm-user-modeling",{"id":27,"slug":48,"title":49,"language":50},"weak-rewards-persistent-llm-user-models-en","Weak Rewards for Persistent LLM User Models","en",[52,58,64,70,76,82],{"id":53,"slug":54,"title":55,"cover_image":56,"image_url":56,"created_at":57,"category":26},"7c89c3bd-48cb-4b4e-942d-bbf0409fc392","cattle-trade-llm-bluffing-bargaining-benchmark-zh","Cattle Trade 要測 LLM 談判 bluffing","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779085437419-b0zw.png","2026-05-18T06:23:27.885037+00:00",{"id":59,"slug":60,"title":61,"cover_image":62,"image_url":62,"created_at":63,"category":26},"9580adce-69ec-4880-ad8b-227c384cb377","marlin-greener-llm-inference-datacenters-zh","MARLIN 用多代理 RL 省雲端推理資源","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779084247021-qzhd.png","2026-05-18T06:03:35.259834+00:00",{"id":65,"slug":66,"title":67,"cover_image":68,"image_url":68,"created_at":69,"category":26},"e3f8d32d-9094-4717-b9fd-d799de0e521b","weishenme-fensanshi-xitong-yanjiang-bi-buluoge-wenzhang-geng-zh","為什麼分散式系統演講比部落格文章更值得學","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779075234067-fff9.png","2026-05-18T03:33:21.6849+00:00",{"id":71,"slug":72,"title":73,"cover_image":74,"image_url":74,"created_at":75,"category":26},"0b28782b-fc24-49fc-bc5c-ec9c07c8ad46","wei-shen-me-sora-zheng-ming-ying-pian-ai-hai-mei-zhun-bei-ha-zh","為什麼 Sora 證明影片 AI 還沒準備好走向主流","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779059031003-tsg7.png","2026-05-17T23:03:22.155232+00:00",{"id":77,"slug":78,"title":79,"cover_image":80,"image_url":80,"created_at":81,"category":26},"aefdd28e-fccb-46ca-a78b-ad6ad718058d","microsoft-mdash-finds-16-windows-flaws-zh","Microsoft MDASH 找出 16 個 Windows 漏洞","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779041037625-66oq.png","2026-05-17T18:03:35.214691+00:00",{"id":83,"slug":84,"title":85,"cover_image":86,"image_url":86,"created_at":87,"category":26},"902b314d-316c-48aa-9a2a-e4d16f32d2ac","browser-exploit-benchmarks-prove-ai-security-here-zh","為什麼瀏覽器 exploit 基準已證明 AI 安全威脅就在眼前","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779019382261-mfmw.png","2026-05-17T08:03:21.360298+00:00",[89,94,99,104,109,114,119,124,129,134],{"id":90,"slug":91,"title":92,"created_at":93},"f18dbadb-8c59-4723-84a4-6ad22746c77a","deepmind-bets-on-continuous-learning-ai-2026-zh","DeepMind 押注 2026 連續學習 AI","2026-03-26T08:16:02.367355+00:00",{"id":95,"slug":96,"title":97,"created_at":98},"f4a106cb-02a6-4508-8f39-9720a0a93cee","ml-papers-of-the-week-github-research-desk-zh","每週 ML 論文清單，為何紅到 GitHub","2026-03-27T01:11:39.284175+00:00",{"id":100,"slug":101,"title":102,"created_at":103},"c4f807ca-4e5f-47f1-a48c-961cf3fc44dc","ai-ml-conferences-to-watch-in-2026-zh","2026 AI 研討會投稿時程整理","2026-03-27T01:51:53.874432+00:00",{"id":105,"slug":106,"title":107,"created_at":108},"9f50561b-aebd-46ba-94a8-363198aa7091","openclaw-agents-manipulated-self-sabotage-zh","OpenClaw Agent 會自己搞砸自己","2026-03-28T03:03:18.786425+00:00",{"id":110,"slug":111,"title":112,"created_at":113},"11f22e92-7066-4978-a544-31f5f2156ec6","vega-learning-to-drive-with-natural-language-instructions-zh","Vega：使用自然語言指示進行自駕車控制","2026-03-28T14:54:04.847912+00:00",{"id":115,"slug":116,"title":117,"created_at":118},"a4c7cfec-8d0e-4fec-93cf-1b9699a530b8","drive-my-way-en-zh","Drive My Way：個性化自駕車風格的實現","2026-03-28T14:54:26.207495+00:00",{"id":120,"slug":121,"title":122,"created_at":123},"dec02f89-fd39-41ba-8e4d-11ede93a536d","training-knowledge-bases-with-writeback-rag-zh","用 WriteBack-RAG 強化知識庫提升檢索效能","2026-03-28T14:54:45.775606+00:00",{"id":125,"slug":126,"title":127,"created_at":128},"3886be5c-a137-40cc-b9e2-0bf18430c002","packforcing-efficient-long-video-generation-method-zh","PackForcing：短影片訓練也能生成長影片","2026-03-28T14:55:02.688141+00:00",{"id":130,"slug":131,"title":132,"created_at":133},"72b90667-d930-4cc9-8ced-aaa0f8968d44","pixelsmile-toward-fine-grained-facial-expression-editing-zh","PixelSmile：提升精細臉部表情編輯的新方法","2026-03-28T14:55:20.678181+00:00",{"id":135,"slug":136,"title":137,"created_at":138},"cf046742-efb2-4753-aef9-caed5da5e32e","adaptive-block-scaled-data-types-zh","IF4：神經網路量化的聰明選擇","2026-03-31T06:00:36.990273+00:00"]