[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-cattle-trade-llm-bluffing-bargaining-benchmark-zh":3,"article-related-cattle-trade-llm-bluffing-bargaining-benchmark-zh":36,"series-research-7c89c3bd-48cb-4b4e-942d-bbf0409fc392":85},{"id":4,"title":5,"content":6,"summary":7,"source":8,"source_url":9,"author":10,"image_url":11,"keywords":12,"language":18,"translated_content":10,"views":19,"is_premium":20,"created_at":21,"updated_at":21,"cover_image":11,"published_at":22,"rewrite_status":23,"rewrite_error":10,"rewritten_from_id":24,"slug":25,"category":26,"related_article_id":27,"status":28,"google_indexed_at":29,"x_posted_at":10,"tweet_text":10,"title_rewritten_at":10,"title_original":10,"key_takeaways":30,"topic_cluster_id":34,"embedding":35,"is_canonical_seed":20},"7c89c3bd-48cb-4b4e-942d-bbf0409fc392","Cattle Trade 要測 LLM 談判 bluffing","\u003Cp data-speakable=\"summary\">Cattle Trade 提出一個\u003Ca href=\"\u002Fnews\u002Fmarlin-greener-llm-inference-datacenters-zh\">多代理\u003C\u002Fa>基準，專門測試 \u003Ca href=\"\u002Fnews\u002Fweak-rewards-persistent-llm-user-models-zh\">LLM\u003C\u002Fa> 在 bluff、出價與談判中的策略行為。\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong>研究機構\u003C\u002Fstrong>：arXiv 摘要未明確標註\u003C\u002Fli>\u003Cli>\u003Cstrong>核心數據\u003C\u002Fstrong>：摘要無公開 benchmark 數字\u003C\u002Fli>\u003Cli>\u003Cstrong>突破點\u003C\u002Fstrong>：多代理談判基準\u003C\u002Fli>\u003C\u002Ful>\u003Cp>\u003Ca href=\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.14537\">Cattle Trade: A Multi-Agent Benchmark for LLM Bluffing, Bidding, and Bargaining\u003C\u002Fa> 不是在測一般問答，而是把 LLM 拉進更接近真實互動的場景：多方談判、互相試探、開價與讓步。這篇摘要很直接地指出，現有評測多半偏靜態提示詞，但真正的代理系統常常要在有利益衝突的情境下做決策。\u003C\u002Fp>\u003Cp>這件事對開發者很重要。因為一個模型會答題，不代表它會談判；會談判，也不代表它能穩住立場、看穿對手的 bluff，或在多輪互動裡維持一致策略。只要你的產品碰到銷售、採購、市集撮合，甚至任何需要多方協商的流程，這種基準就比單純 QA 更貼近實戰。\u003C\u002Fp>\u003Ch2>這篇論文要補哪個洞\u003C\u002Fh2>\u003Cp>標題已經把問題講得很清楚：bluffing、bidding、bargaining 不是單輪回答。這些任務需要模型推測對手的信念、誘因，以及下一步可能怎麼回應，而且這些判斷會隨著多輪互動持續變動。\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779085437419-b0zw.png\" alt=\"Cattle Trade 要測 LLM 談判 bluffing\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>傳統 benchmark 的盲點就在這裡。你可以在一般測試裡拿到不錯分數，但仍然不知道模型遇到對手施壓時會不會露餡，或在價格談判中能不能守住底線。Cattle Trade 想補的，就是這種「策略互動」的評測空白。\u003C\u002Fp>\u003Cp>從目前提供的摘要來看，這篇比較像是在提出一個評測框架，而不是發表一個通用模型或新訓練法。也就是說，它的重點是把問題定義成可測量的多代理交易互動，而不是只看語言流暢度。\u003C\u002Fp>\u003Ch2>方法在做什麼\u003C\u002Fh2>\u003Cp>從名稱來看，這個 benchmark 把談判包裝成 cattle trade 的交易遊戲。代理可以 bluff、出價、協商條件，並在過程中互相影響彼此決策。這種設定的好處是，它逼模型在不完全資訊下做選擇，而不是照著提示詞直接生成一段漂亮答案。\u003C\u002Fp>\u003Cp>白話一點說，這類任務通常會讓每個 \u003Ca href=\"\u002Ftag\u002Fagent\">agent\u003C\u002Fa> 只知道部分資訊，接著要它提出報價、回應報價，最後決定要不要讓步。重點不是文句寫得好不好，而是策略能不能跨多輪維持住。\u003C\u002Fp>\u003Cp>不過，這份 raw 資料沒有提供完整 protocol、任務結構或計分規則，所以不能替它補細節。能確定的是，它是一個多代理 benchmark，而且核心行為就是交易過程中的 bluff、bid 與 bargain。\u003C\u002Fp>\u003Ch2>這篇實際證明了什麼\u003C\u002Fh2>\u003Cp>就這份摘要本身來看，沒有公開 benchmark 數字、沒有勝率、沒有 accuracy，也沒有跟其他模型的比較。因此，不能從 raw 資料直接推導出某個模型表現提升了多少。\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779085438333-u896.png\" alt=\"Cattle Trade 要測 LLM 談判 bluffing\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>這點很重要。因為它代表目前可驗證的貢獻，主要是 benchmark 的概念與範圍，而不是一組可直接拿來做排行榜比較的結果。換句話說，這篇的價值在於「定義測試什麼」，不是「宣稱某個模型贏多少」。\u003C\u002Fp>\u003Cp>對習慣看 leaderboard 的讀者來說，這類論文常常沒那麼刺激，但其實很關鍵。很多時候，真正缺的是一個能抓到特定行為的測試場，而不是又一個通用分數。\u003C\u002Fp>\u003Ch2>對開發者的意義\u003C\u002Fh2>\u003Cp>如果你在做 agentic system，談判幾乎是最難的能力之一。模型可以很會講話，但不一定可靠；可以很合作，但不一定有策略；也可能很會算計，卻在多輪互動中前後不一。專門測 bluffing 與 bargaining 的 
benchmark，正好能把這些問題提早攤開。\u003C\u002Fp>\u003Cp>這對很多產品場景都很實際。像是價格協商、商機篩選、合約初步處理、資源分配，或自動化客服升級流程，模型都可能要代表使用者跟另一方互動。這時候，能不能讀懂對方行為、調整策略，比單純的語言品質還重要。\u003C\u002Fp>\u003Cp>它也給研究者一個更具體的目標。與其抽象地問 \u003Ca href=\"\u002Ftag\u002Fllm\">LLM\u003C\u002Fa>「會不會推理」，不如問它能不能辨識欺騙、維持談判立場，並對誘因變化做出合理反應。這種問題更接近多代理系統真正會遇到的難題。\u003C\u002Fp>\u003Ch2>限制與未解問題\u003C\u002Fh2>\u003Cp>最大的限制其實很明顯：摘要太短。raw 資料沒有提供 benchmark 數字、任務細節、資料規模、模型比較，也沒有說明這個 benchmark 相對既有方法到底有多難。\u003C\u002Fp>\u003Cp>我們也不知道它是合成資料、人工設計、模擬環境，還是混合形式。這很重要，因為談判情境的真實度，會直接影響結果能不能轉用到 production。若環境太簡化，模型在 benchmark 上學到的策略，未必能搬到真實市場互動。\u003C\u002Fp>\u003Cp>另一個未解問題是，它到底是在測 bluffing 本身，還是同時混進記憶、算術、指令遵循等能力。多代理任務很容易把很多技能混在一起，所以一個好的 benchmark 必須清楚說明自己在量什麼。\u003C\u002Fp>\u003Ch2>給台灣開發者的實際解讀\u003C\u002Fh2>\u003Cp>如果你正在做 LLM agent，這篇的訊號很清楚：下一代評測不能只看靜態問答。真實世界很多任務都不是「答對就好」，而是要在互動中判斷對方、保住策略、適時讓步，甚至識破對手在演戲。\u003C\u002Fp>\u003Cp>這種 benchmark 的價值，不只是學術上多了一個題庫。它更像是在提醒團隊，產品規格如果包含 negotiation、bidding、marketplace interaction，就不能只拿一般 benchmark 來保證可用性。模型在這裡的失誤，常常不是語法錯，而是策略錯。\u003C\u002Fp>\u003Cp>所以，Cattle Trade 比較像一個方向標。它把 LLM 評測從靜態輸出，往互動式、對抗式、策略式場景推了一步。對做 agent 的團隊來說，這一步很值得注意。\u003C\u002Fp>\u003Ch2>總結\u003C\u002Fh2>\u003Cp>Cattle Trade 想測的，是一般 benchmark 常漏掉的東西：LLM 在談判場景中的策略行為。就目前摘要能確認的內容，它提出的是一個多代理交易基準，核心聚焦在 bluff、出價與協商，而不是通用問答。\u003C\u002Fp>\u003Cp>雖然這份 raw 資料沒有公開完整 benchmark 數字，但方向很明確。對任何要把 LLM 放進互動式流程的開發者來說，這類評測比單輪測試更接近真實風險，也更接近產品會遇到的問題。\u003C\u002Fp>\u003Cul>\u003Cli>它把評測焦點從問答移到多輪談判。\u003C\u002Fli>\u003Cli>它強調 bluff、bidding、bargaining 這類策略互動。\u003C\u002Fli>\u003Cli>它提醒開發者：代理系統需要看互動能力，不只看語言能力。\u003C\u002Fli>\u003C\u002Ful>","Cattle Trade 提出一個多代理基準，專門測試 LLM 在 bluff、出價與談判中的策略行為。","arxiv.org","https:\u002F\u002Farxiv.org\u002Fabs\u002F2605.14537",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779085437419-b0zw.png",[13,14,15,16,17],"LLM 
benchmark","multi-agent","negotiation","bluffing","bargaining","zh",1,false,"2026-05-18T06:23:27.885037+00:00","2026-05-18T06:23:27.868+00:00","done","fed0da2e-2f90-422b-9656-68a017cab6b2","cattle-trade-llm-bluffing-bargaining-benchmark-zh","research","653c628b-7930-4183-9dbc-8e50cf85c479","published","2026-05-18T09:00:28.261+00:00",[31,32,33],"Cattle Trade 將 LLM 評測拉進多代理談判場景。","摘要沒有提供 benchmark 數字或完整實驗細節。","對做 agent 的團隊來說，這類測試比靜態 QA 更貼近實戰。","0c35a120-52fc-41fc-afa3-d404eb934158","[-0.024097284,-0.007043726,0.02573872,-0.10129716,-0.013296184,-0.018486315,-0.007706815,0.031194044,-0.0011661771,0.032303512,-0.0009310457,-0.026164176,0.017556572,-0.024618577,0.116833456,0.027814854,0.0001378693,0.022680562,0.02997278,-0.014322827,-0.013984256,-0.0009390288,-0.011528597,-0.002490017,-0.0147042805,0.009949074,0.013074191,0.009879682,0.018606933,-0.0026839394,0.0068349573,0.020177666,-0.017472556,0.04053935,0.015467067,0.010516689,0.016625484,0.018780325,0.01078805,0.015909132,-0.022321936,0.00022535418,-0.012335878,-0.01577197,-0.008346922,0.0099096,0.023447976,-0.026048796,-0.017210621,-0.0015033674,-0.036241386,0.03722668,0.023340827,-0.14039531,-0.013732666,-0.00023824513,-0.016066534,-0.011967691,0.017964968,-0.012783027,-0.010890921,0.018395731,-0.01811237,-0.018900475,-0.005685181,-0.010489918,0.0017404611,-0.002223177,-0.04083433,-0.029392077,-0.009221609,-0.005804665,0.011120701,-0.0359971,0.029014515,-0.03422662,-0.03034267,0.0066045197,0.022651618,0.006707314,0.012587351,-0.03619134,0.0077220765,-0.006109719,-0.002793989,-0.0095783025,0.016410295,0.012489665,0.011766181,0.011551475,-0.014178686,0.017089909,0.03300603,0.010496669,-0.016499884,-0.012055364,-0.014233845,-0.0069285817,-0.01144004,-0.007560628,-0.021156304,-0.015386389,-0.0019769254,0.016793022,0.031185552,-0.017166043,0.0034172223,0.009446239,-0.008756364,-0.00069435185,0.018550765,-0.017560095,0.0017369996,-0.012627014,-0.0060034785,-0.1360381,0.024771554,0.022222659,-0.008290264,0.035759877,-0.00
37563166,0.015510748,0.0137027735,0.020769851,0.0042819628,-0.024310382,0.0029145323,-0.015909119,-0.04883349,-0.018722123,-0.024295807,-0.015788144,0.007295402,-0.013534865,0.00029002875,0.046357226,-0.008497537,0.0019258843,-0.020395044,-0.021146515,0.010706504,0.039863136,-0.0072527756,0.010522085,-0.031971715,-0.031213274,-0.03416964,-0.0010640001,-0.009797597,0.00019767408,0.025735695,-0.020942884,-0.0024640444,-0.006100365,-0.00048180463,-0.055898424,0.018823216,0.0025945534,0.02522696,0.047292862,-0.005698646,-0.02391308,-0.0074858465,-0.004894752,0.009434876,-0.00049610273,-0.004775932,-0.019251594,0.008335851,0.01105107,0.017236141,-0.007170794,-0.0024564292,0.00516496,0.030283675,-0.018295519,-0.016998818,0.036062278,-0.017670479,-0.023085047,0.008345731,0.009080408,-0.013291519,0.029891638,-0.01397082,0.0012547319,0.0069990396,0.024983589,0.019656967,0.029415755,-0.019479778,-0.01725037,0.0044246237,-0.026543392,-0.0024097657,-0.019716403,-0.021857033,0.024327965,-0.021454519,0.024779154,0.009565712,0.01805421,0.0022326817,-0.029522939,-0.005551162,-0.041317184,0.006585353,0.004221424,-0.020004475,-0.00049504713,0.012664426,0.011880886,-0.0059623583,-0.0076492396,-0.022910336,0.012894054,-0.010683562,-0.005373547,-0.0046444293,-0.019435678,0.017214961,-0.008917927,0.004601796,0.01340184,-0.016797788,-0.014441517,0.011124723,-0.020754915,-0.001987504,0.026601601,-0.00825199,-0.00522722,0.00775679,-0.011916824,0.01763114,-0.021350585,-0.018496212,0.026084278,0.011798392,0.007323845,-0.014504114,0.030532692,0.0042044423,0.040132992,0.0045227082,-0.01678296,0.015590554,0.0017401783,-0.01978043,0.006306287,0.001281941,0.0072611175,-0.007265435,-0.002499069,0.0015741802,-0.008808934,-0.014406912,0.0086280275,0.01332839,-0.006847675,-0.008960823,-0.003273998,-0.017783312,0.009636213,-0.010998886,0.0013251319,-0.00018560786,-0.0162663,-0.024622867,0.015000647,-0.036650013,0.032155216,0.018075878,-0.012172982,0.016182818,-0.017754106,-0.06440054,0.009043069,-0.003
007986,0.0118097095,0.036796227,-0.017380757,0.007867986,0.025699813,-0.0072102374,0.008408243,-0.0059068487,0.00023154756,-0.00015957172,-0.02071378,0.008353896,0.022191646,-0.018878648,-0.0041652187,-0.02286255,0.002577361,-0.0025571396,0.02599375,-0.022878524,-0.016642204,-0.008402281,-0.016743625,0.013066824,0.04272185,-0.0077107167,-0.011956201,-0.012238472,0.017731246,-0.019741772,-0.014944497,0.002350367,-0.029808339,0.013936222,-0.02128209,0.018145176,-0.0016418047,0.010458611,-0.027067611,-0.0032360805,-0.009191213,0.0034465983,-0.018183362,-0.028634246,0.00032817107,0.005035591,-0.017637115,0.013439134,0.021283273,0.005538986,-0.023390077,-0.008774503,0.0022009625,0.027711073,-0.009657575,0.017923364,-0.016657338,-0.022993837,0.00072245684,-0.012700885,0.01663441,0.007152892,0.016165664,-0.008024159,-0.0028388132,-0.012397186,0.023527171,0.0055429754,0.044158455,-0.008714259,-0.03752479,0.020213265,-0.001090358,-0.012594493,-0.022653706,-0.04633384,0.0074092485,-0.009981256,-0.02431285,0.038145076,0.015391382,-0.0056460593,0.0069468035,-0.0023434118,-0.019714126,0.021028303,-0.042114086,0.019901488,0.0021132953,-0.006428788,0.024083436,-0.009926055,-0.0028534562,-0.015177545,0.013892842,-0.0136290435,0.010847299,-0.017201738,-0.0128615135,-0.011030244,0.0010103467,-0.0107288305,0.01150234,-0.020602608,0.008024865,-0.016334822,-0.017160643,0.04174696,-0.007753213,-0.004438683,0.011923145,0.0015063654,0.01726118,-0.001001061,0.014862836,0.020630488,-0.012818011,-0.0035680986,0.021255486,-0.0014437593,-0.01848291,0.012259238,0.018103665,-0.0031316862,0.003012934,0.003205326,0.03233335,-0.031470574,0.01814257,0.014037202,-0.01669168,-0.0040618237,0.0064409017,-0.008379047,0.017291432,0.024973627,-0.019000798,0.01862503,0.00037224495,0.009534188,0.00021169714,-0.019528585,0.01332989,0.009281626,-0.00010629598,-0.009299546,-0.0085681025,-0.021406317,-0.007846108,0.0014811343,-0.03666481,-0.0059896163,-0.017156485,-0.0061316728,-0.01810824,-0.019371584,-0.0112936
69,-0.03178429,-0.036013383,-0.014842991,-0.026949352,-0.017509034,0.002790229,-0.0073293704,0.013190479,0.033182878,-0.01965079,-0.005127956,-0.0187165,-0.014974023,0.023669992,0.022644317,0.019524127,0.03272167,0.004998105,-0.033978067,-0.0021672149,-0.0074002296,-0.016653314,0.00076671795,-0.018993061,0.014265957,-0.043287404,0.00023325051,0.038686894,0.023024,-0.010169798,0.017749531,-0.00068093295,0.013926268,0.030775782,-0.016307447,0.01005014,0.016531382,0.0005438273,0.0033439316,-0.013428308,-0.008414529,0.02314678,0.018383924,0.0080417255,0.00306798,-0.0066212206,0.005570683,0.0013938088,-0.006771021,0.0073692733,-0.01176224,-0.0017909294,-0.011079037,0.018502438,0.034720067,-0.0059034918,0.0009755272,0.009938785,-0.0046909745,0.0076505416,-0.009238433,-0.020217387,0.0018619244,-0.0004363415,0.02485745,-0.030773029,0.0043978067,0.007078526,-0.0030149166,-0.008772068,-0.011713513,-0.009974497,0.020232808,-0.0122624915,-0.022780748,-0.016275961,0.015251019,0.016291052,-0.0100219315,-0.02156676,0.014121614,0.01319559,-0.019113053,0.005714753,0.009826725,-0.017053237,-0.0013328897,0.010996461,-0.018148866,-0.011804329,0.024221368,7.195779e-06,0.022966458,0.046585564,-0.018622587,0.0128628295,0.004390985,-4.7153077e-05,0.0051002223,-0.0068654423,0.02141556,0.015444971,-0.0044827587,0.0023022383,-0.044287264,-0.00026483182,0.011107264,-0.01683165,0.018039258,-0.09588028,-0.0062934975,0.017689716,-0.011071977,-0.006746769,-0.017431868,0.01135844,-0.022420706,0.01260173,-0.014301162,0.012857924,0.009919214,0.017611835,0.013612333,-0.015540854,-0.011593891,-0.014644878,-0.0020519027,-0.014803862,-0.0029927779,0.035028618,-0.0017068713,0.025830675,0.009399456,0.019885095,-0.016185699,0.021254828,0.0016236313,0.020293066,0.00803113,-0.014314711,-0.0076651685,-0.015071318,-0.0074295537,0.013391869,0.0014532529,0.046943918,0.0076053175,0.0030834412,0.0114299115,-0.0042502033,-0.0050875875,-0.02810785,-0.03311557,0.00089263875,-0.021310482,-0.016528176,0.014255551,-0.017
528633,-0.009755177,-0.016248155,-0.029917266,-0.024412367,-0.029422374,0.011735507,-0.014277225,-0.013516399,-0.009040248,-0.010686581,0.032877095,-0.019471636,0.022974068,-0.013834785,0.006053788,-0.0025336687,7.3808995e-05,0.0041790227,0.01609305,0.0008260504,0.019329814,0.014062605,-0.020032227,-0.016984873,0.0049217325,-0.02717277,-0.02060199,0.020434435,0.029158885,-0.009371126,0.004941633,-0.03011129,-0.024952386,-0.096931145,0.0022716706,-0.0024920749,-0.007639011,0.017765414,-0.01116898,-0.00044878546,-0.02452187,-0.012584976,-0.0045741093,0.0082443105,0.020671453,-0.015002883,-0.019980717,0.014150219,0.0034312897,-0.0109478645,0.008065392,0.023053572,-0.027833028,0.0036899196,4.4682423e-05,0.018894155,-0.0069076726,-0.016570322,0.009993371,-0.01731218,-0.009687148,-0.010738507,-0.009455917,-0.005484938,-0.15251666,0.008934866,-0.00970469,-0.00279978,0.035450943,0.0029834714,0.008229102,0.00934714,0.025821205,0.012366703,0.0036566989,-0.042261817,0.016008092,-0.020985736,0.0049186107,0.115839645,0.013450801,0.018328672,-0.010230367,-0.013486355,0.00020437897,-0.013343081,-0.046288375,-0.0126210265,0.01688701,0.0077564917,0.024905385,-0.02117917,0.008512099,0.024323963,0.022879364,0.0220274,-0.0035616774,-0.009989017,-0.00487933,0.022349268,-0.018186087,-0.014237714,-0.05101285,-0.007297552,0.013048037,-0.0022140709,0.023960546,-0.009244422,-0.009134016,-0.009600879,-0.012129181,-0.019303944,0.019420056,-0.00770941,0.005747322,-0.06426955,-0.005908079,-0.010416936,-0.006566901,-0.007629434,0.01227515,-0.00031453697,0.026947156,0.0015330942,0.013951793,-0.025345838,-0.003317273,0.025521234,0.003428052,0.0029149682,0.0069134897,-0.0038687869,-0.0024396365,0.02537142,0.011140193,0.0033703838,0.01975637,-0.006417436,-0.0075499043,-0.015545212,0.02866986,0.019809738,0.006732755,-0.0027419953,0.0025679218,0.0138222175,0.01899096,-0.013853741,-0.0010808808,0.017952293,0.018829606,-0.0042445143,-0.012967045,-0.0019618103,-0.014358771,0.014591799,0.017540017,-0.00609
08804,0.017646259,0.0034099666,-0.009435322,0.0076040975,0.0061808974,-0.0022857178,-0.0030682443,-0.0067493557,0.030276291,-0.010726996,0.013877006,0.034775347,0.004760927,0.010907676,0.012128067,0.007254582]",{"tags":37,"relatedLang":44,"relatedPosts":48},[38,39,40,41,42],{"name":15,"slug":15},{"name":17,"slug":17},{"name":16,"slug":16},{"name":14,"slug":14},{"name":13,"slug":43},"llm-benchmark",{"id":27,"slug":45,"title":46,"language":47},"cattle-trade-llm-bluffing-bargaining-benchmark-en","Cattle Trade benchmarks LLM bluffing and bargaining","en",[49,55,61,67,73,79],{"id":50,"slug":51,"title":52,"cover_image":53,"image_url":53,"created_at":54,"category":26},"492aa1ec-02ce-491e-ad03-ae804f261f87","weak-rewards-persistent-llm-user-models-zh","弱回饋讓 LLM 記住偏好","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779084838002-5od2.png","2026-05-18T06:13:32.906335+00:00",{"id":56,"slug":57,"title":58,"cover_image":59,"image_url":59,"created_at":60,"category":26},"9580adce-69ec-4880-ad8b-227c384cb377","marlin-greener-llm-inference-datacenters-zh","MARLIN 用多代理 RL 省雲端推理資源","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779084247021-qzhd.png","2026-05-18T06:03:35.259834+00:00",{"id":62,"slug":63,"title":64,"cover_image":65,"image_url":65,"created_at":66,"category":26},"e3f8d32d-9094-4717-b9fd-d799de0e521b","weishenme-fensanshi-xitong-yanjiang-bi-buluoge-wenzhang-geng-zh","為什麼分散式系統演講比部落格文章更值得學","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779075234067-fff9.png","2026-05-18T03:33:21.6849+00:00",{"id":68,"slug":69,"title":70,"cover_image":71,"image_url":71,"created_at":72,"category":26},"0b28782b-fc24-49fc-bc5c-ec9c07c8ad46","wei-shen-me-sora-zheng-ming-ying-pian-ai-hai-mei-zhun-bei-ha-zh","為什麼 Sora 證明影片 AI 
還沒準備好走向主流","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779059031003-tsg7.png","2026-05-17T23:03:22.155232+00:00",{"id":74,"slug":75,"title":76,"cover_image":77,"image_url":77,"created_at":78,"category":26},"aefdd28e-fccb-46ca-a78b-ad6ad718058d","microsoft-mdash-finds-16-windows-flaws-zh","Microsoft MDASH 找出 16 個 Windows 漏洞","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779041037625-66oq.png","2026-05-17T18:03:35.214691+00:00",{"id":80,"slug":81,"title":82,"cover_image":83,"image_url":83,"created_at":84,"category":26},"902b314d-316c-48aa-9a2a-e4d16f32d2ac","browser-exploit-benchmarks-prove-ai-security-here-zh","為什麼瀏覽器 exploit 基準已證明 AI 安全威脅就在眼前","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779019382261-mfmw.png","2026-05-17T08:03:21.360298+00:00",[86,91,96,101,106,111,116,121,126,131],{"id":87,"slug":88,"title":89,"created_at":90},"f18dbadb-8c59-4723-84a4-6ad22746c77a","deepmind-bets-on-continuous-learning-ai-2026-zh","DeepMind 押注 2026 連續學習 AI","2026-03-26T08:16:02.367355+00:00",{"id":92,"slug":93,"title":94,"created_at":95},"f4a106cb-02a6-4508-8f39-9720a0a93cee","ml-papers-of-the-week-github-research-desk-zh","每週 ML 論文清單，為何紅到 GitHub","2026-03-27T01:11:39.284175+00:00",{"id":97,"slug":98,"title":99,"created_at":100},"c4f807ca-4e5f-47f1-a48c-961cf3fc44dc","ai-ml-conferences-to-watch-in-2026-zh","2026 AI 研討會投稿時程整理","2026-03-27T01:51:53.874432+00:00",{"id":102,"slug":103,"title":104,"created_at":105},"9f50561b-aebd-46ba-94a8-363198aa7091","openclaw-agents-manipulated-self-sabotage-zh","OpenClaw Agent 
會自己搞砸自己","2026-03-28T03:03:18.786425+00:00",{"id":107,"slug":108,"title":109,"created_at":110},"11f22e92-7066-4978-a544-31f5f2156ec6","vega-learning-to-drive-with-natural-language-instructions-zh","Vega：使用自然語言指示進行自駕車控制","2026-03-28T14:54:04.847912+00:00",{"id":112,"slug":113,"title":114,"created_at":115},"a4c7cfec-8d0e-4fec-93cf-1b9699a530b8","drive-my-way-en-zh","Drive My Way：個性化自駕車風格的實現","2026-03-28T14:54:26.207495+00:00",{"id":117,"slug":118,"title":119,"created_at":120},"dec02f89-fd39-41ba-8e4d-11ede93a536d","training-knowledge-bases-with-writeback-rag-zh","用 WriteBack-RAG 強化知識庫提升檢索效能","2026-03-28T14:54:45.775606+00:00",{"id":122,"slug":123,"title":124,"created_at":125},"3886be5c-a137-40cc-b9e2-0bf18430c002","packforcing-efficient-long-video-generation-method-zh","PackForcing：短影片訓練也能生成長影片","2026-03-28T14:55:02.688141+00:00",{"id":127,"slug":128,"title":129,"created_at":130},"72b90667-d930-4cc9-8ced-aaa0f8968d44","pixelsmile-toward-fine-grained-facial-expression-editing-zh","PixelSmile：提升精細臉部表情編輯的新方法","2026-03-28T14:55:20.678181+00:00",{"id":132,"slug":133,"title":134,"created_at":135},"cf046742-efb2-4753-aef9-caed5da5e32e","adaptive-block-scaled-data-types-zh","IF4：神經網路量化的聰明選擇","2026-03-31T06:00:36.990273+00:00"]