[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-peft-bench-fine-tuning-methods-benchmark-en":3,"article-related-peft-bench-fine-tuning-methods-benchmark-en":36,"series-research-4ed1af1c-05fe-425c-a296-464dbfca0e73":87},{"id":4,"title":5,"content":6,"summary":7,"source":8,"source_url":9,"author":10,"image_url":11,"keywords":12,"language":18,"translated_content":10,"views":19,"is_premium":20,"created_at":21,"updated_at":21,"cover_image":11,"published_at":22,"rewrite_status":23,"rewrite_error":10,"rewritten_from_id":24,"slug":25,"category":26,"related_article_id":27,"status":28,"google_indexed_at":29,"x_posted_at":10,"tweet_text":10,"title_rewritten_at":10,"title_original":10,"key_takeaways":30,"topic_cluster_id":34,"embedding":35,"is_canonical_seed":20},"4ed1af1c-05fe-425c-a296-464dbfca0e73","PEFT-Bench compares fine-tuning methods fairly","\u003Cp data-speakable=\"summary\">PEFT-Bench standardizes how to compare PEFT methods across 27 NLP datasets and 7 techniques.\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong>Research org\u003C\u002Fstrong>: Brno University of Technology + Kempelen Institute of Intelligent Technologies\u003C\u002Fli>\u003Cli>\u003Cstrong>Core data\u003C\u002Fstrong>: 27 NLP datasets\u003C\u002Fli>\u003Cli>\u003Cstrong>Breakthrough\u003C\u002Fstrong>: Unified end-to-end benchmark with PSCP cost scoring\u003C\u002Fli>\u003C\u002Ful>\u003Cp>For engineers working with large language models, the big question is not just which fine-tuning method performs best, but which one is actually worth the compute, memory, and deployment tradeoffs. This paper argues that the current PEFT landscape is hard to compare fairly, especially for autoregressive \u003Ca href=\"\u002Ftag\u002Fllms\">LLMs\u003C\u002Fa>, and proposes a benchmark to make those comparisons more consistent.\u003C\u002Fp>\u003Cp>The paper is also practical in a second way: it does not treat efficiency as an afterthought. 
Instead, it introduces a score that folds together trainable parameters, inference speed, and training memory usage. That matters if you are choosing between methods for a production workflow, not just a leaderboard.\u003C\u002Fp>\u003Ch2>What problem this paper is trying to fix\u003C\u002Fh2>\u003Cp>Parameter-efficient fine-tuning, or PEFT, exists because full fine-tuning of large language models is expensive. The paper frames the problem clearly: large models demand too much compute, too much storage, and too much energy for many teams, especially academic groups and smaller practitioners.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779179046277-spz9.png\" alt=\"PEFT-Bench compares fine-tuning methods fairly\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>PEFT methods try to reduce the number of trainable parameters while keeping downstream performance strong. But the authors say current evaluations are fragmented, difficult to reproduce, and often centered on a narrow slice of tasks or model types. In particular, they point out that many methods are mostly tested on GLUE and SuperGLUE, often on non-autoregressive models, which makes the results less representative of modern autoregressive \u003Ca href=\"\u002Ftag\u002Fllm\">LLM\u003C\u002Fa> use.\u003C\u002Fp>\u003Cp>The paper also calls out a reproducibility problem: some prior work lacks open-source implementations or enough experimental detail to rerun the results cleanly. That makes it hard to compare methods honestly, and it can even encourage authors to copy numbers from related work instead of rerunning experiments in the same setup.\u003C\u002Fp>\u003Ch2>What PEFT-Bench actually is\u003C\u002Fh2>\u003Cp>PEFT-Bench is presented as a unified end-to-end benchmark for evaluating diverse PEFT methods on autoregressive LLMs. 
It defines the datasets, metrics, and methodology needed to compare methods in a fair and consistent environment.\u003C\u002Fp>\u003Cp>The benchmark covers 27 datasets across 12 unique tasks, grouped into three broad areas: natural language understanding and reasoning, math, and code generation. The NLU bucket is further split into GLUE, SuperGLUE, and other datasets. That mix matters because it moves beyond the usual NLU-only evaluation and includes harder generation-oriented tasks where efficiency tricks can affect correctness in different ways.\u003C\u002Fp>\u003Cp>To support this benchmark, the authors also introduce PEFT-Factory, a framework built on top of LLaMA-Factory and designed to implement off-the-shelf methods from the HuggingFace PEFT library. The point is not just to run one-off experiments, but to make it easier for researchers to plug in new PEFT methods and keep the evaluation setup consistent.\u003C\u002Fp>\u003Ch2>How the method works in plain English\u003C\u002Fh2>\u003Cp>At a high level, the benchmark has three parts: datasets and tasks, language models plus PEFT methods, and \u003Ca href=\"\u002Fnews\u002Fconfident-ai-llm-evaluation-metrics-guide-en\">evaluation metrics\u003C\u002Fa>. The workflow is straightforward: train a model with a selected PEFT method on a selected dataset, then compute the metrics for each method-model-dataset combination.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779179050915-jr3x.png\" alt=\"PEFT-Bench compares fine-tuning methods fairly\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>The paper says PEFT-Bench is designed around supervised fine-tuning of an instruction-fine-tuned model, with instructions included in each sample through dataset-specific templates. 
That detail matters because it means the benchmark is not just measuring raw adaptation, but adaptation in an instruction-following setup closer to how many LLM systems are used today.\u003C\u002Fp>\u003Cp>The benchmark evaluates seven PEFT methods, though the abstract does not list them all. What it does say is that the methods are compared not only by task performance, but also by efficiency and stability under limited data. The authors also include stability experiments, which suggests they are interested in whether a method is robust, not just whether it can win on average.\u003C\u002Fp>\u003Ch2>What the paper shows\u003C\u002Fh2>\u003Cp>The paper’s headline result is a tradeoff, not a single winner. According to the notes, LoRA achieves better performance, while BitFit and LNTuning are more efficient. That is the kind of result developers actually need: the best method depends on whether you care more about raw quality or resource usage.\u003C\u002Fp>\u003Cp>The notes also say that PEFT methods can learn task structure but damage correctness on math problem solving and code generation. That is an important warning if you are considering PEFT for tasks where one wrong \u003Ca href=\"\u002Ftag\u002Ftoken\">token\u003C\u002Fa> can break an answer, a program, or a proof.\u003C\u002Fp>\u003Cp>Soft prompt-based methods are described as harder to train. The paper also includes stability experiments, but the abstract does not provide the detailed results or any benchmark numbers beyond the dataset and method counts. So if you are looking for exact scores, latency figures, or memory deltas, those are not present in the abstract excerpt provided here.\u003C\u002Fp>\u003Cp>To make the tradeoffs more explicit, the authors introduce PEFT Soft Cost Penalties, or PSCP. This metric combines trainable parameters, inference speed, and training memory usage into the final score calculation. 
In other words, it tries to reward methods that are not only accurate, but also feasible to run in real-world settings.\u003C\u002Fp>\u003Ch2>Why developers should care\u003C\u002Fh2>\u003Cp>If you are choosing a PEFT method for an internal model, a product prototype, or a research baseline, this paper is useful because it pushes the comparison away from a single accuracy number. It asks the more operational question: what does this method cost to train and serve?\u003C\u002Fp>\u003Cp>That matters especially for teams with limited GPUs or tight deployment budgets. A method that looks great on a benchmark but is slow, memory-hungry, or unstable may be the wrong choice once it leaves the lab. The PSCP metric is an attempt to encode that reality directly into evaluation.\u003C\u002Fp>\u003Cp>The benchmark is also helpful because it broadens the evaluation surface. By including math and code generation alongside standard NLU tasks, it makes it harder for a method to look strong only because it was tuned to a narrow benchmark family. For developers, that means a more realistic picture of where a PEFT method is likely to hold up and where it may fail.\u003C\u002Fp>\u003Ch2>Limitations and open questions\u003C\u002Fh2>\u003Cp>The source material still leaves some gaps. The abstract and notes do not provide the full list of the seven PEFT methods, the exact model family used beyond autoregressive LLMs, or the detailed per-task results. They also give no benchmark numbers beyond the dataset and method counts, so readers should not infer quantitative superiority beyond the qualitative findings stated here.\u003C\u002Fp>\u003Cp>There is also a broader limitation inherent to benchmarks: they can improve comparability, but they do not eliminate the need to think about your own workload. 
A method that is efficient in this benchmark may still behave differently on your dataset, your prompt format, or your deployment constraints.\u003C\u002Fp>\u003Cp>Still, PEFT-Bench is a useful step because it tries to standardize the evaluation stack itself. For a space where reproducibility and apples-to-apples comparison have been weak points, that kind of infrastructure can be as valuable as a new algorithm.\u003C\u002Fp>\u003Cul>\u003Cli>PEFT-Bench covers 27 datasets across NLU, math, and code generation.\u003C\u002Fli>\u003Cli>It compares 7 PEFT methods on autoregressive LLMs in a unified setup.\u003C\u002Fli>\u003Cli>PSCP adds trainable parameters, inference speed, and memory into one score.\u003C\u002Fli>\u003C\u002Ful>\u003Cp>In short, this paper is less about claiming a new best adapter and more about making PEFT comparisons more honest, reusable, and deployment-aware.\u003C\u002Fp>","PEFT-Bench standardizes how to compare PEFT methods across 27 NLP datasets and 7 techniques.","arxiv.org","https:\u002F\u002Farxiv.org\u002Fhtml\u002F2511.21285v3",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779179046277-spz9.png",[13,14,15,16,17],"PEFT","LoRA","benchmarking","LLMs","fine-tuning","en",0,false,"2026-05-19T08:23:37.63089+00:00","2026-05-19T08:23:37.618+00:00","done","b7ad98f8-b186-45d1-8393-1ff330f16b14","peft-bench-fine-tuning-methods-benchmark-en","research","d1c6850c-f832-471b-8beb-c0ebc809667d","published","2026-05-19T09:00:32.052+00:00",[31,32,33],"PEFT-Bench standardizes PEFT evaluation across 27 datasets and 7 methods.","The paper favors practical comparison by adding PSCP cost penalties for efficiency.","Results suggest a tradeoff: LoRA leads on performance, while BitFit and LNTuning are more 
efficient.","3103988e-c4fe-45e3-98ab-846500c9d507","[-0.018422052,0.0022998587,0.0065935086,-0.071991846,-0.023947597,-0.010220444,0.010916491,0.020872947,0.0029919466,0.009463528,0.010324723,-0.032268282,0.03561754,-0.006471055,0.11222229,0.029044418,-0.029514786,0.017152775,0.01559288,-0.022520168,0.008269065,0.037131853,-0.04312307,-0.008320424,-0.006331136,0.0054044384,0.013383979,0.004530819,0.031156117,0.024232596,-0.004847575,-0.00070863776,0.027960373,0.014732162,0.022418242,0.0029734785,0.014844409,0.015325395,0.012784815,0.0061435434,0.005537259,-0.0013552482,-0.0025297105,-0.029623011,-0.022625612,0.003592835,-0.0043302877,-0.030215787,-0.031590078,0.014656589,0.011368065,0.005335722,-0.0014672831,-0.1542996,0.014507656,-0.013811744,0.009275613,0.018297425,0.0032608768,0.0055266726,-0.025067134,0.0025678156,-0.0032820413,-0.012281613,-0.012722077,-0.02286309,0.012535669,-0.016935091,-0.020230424,-0.004851944,-0.0077203694,-0.013429225,0.010757578,-0.01900338,0.0064689456,-0.01779709,0.024406135,-0.012907718,0.010112588,0.014228974,0.009618081,-0.0043973075,0.0025625948,-0.018238796,-0.010667651,0.007889329,9.887607e-05,-0.016852822,0.01081086,-0.0034343835,0.010459582,0.014582356,-0.0007982898,-0.008936415,-0.0032966735,-0.0015544596,0.0070203007,0.0037503059,-0.00082510326,-0.007889643,-0.003520695,-0.0032675045,-0.0036777405,-0.00027596572,0.014787002,0.014849126,0.0020369568,-0.0012530348,-0.007342479,0.024915118,0.0048912847,-0.010727253,-0.0112518165,-0.0068209153,0.0049251257,-0.14137517,-0.016327972,0.00916243,-0.0138653815,-0.010279494,-0.013893349,0.011459392,0.0071576773,0.045146294,-0.0034650476,-0.02221443,0.029502556,0.011160337,-0.0147859305,0.00706581,-0.046178393,0.0073331627,0.034126386,-0.015700137,-0.012272648,0.015346157,0.0351527,-0.037797093,-0.010180537,-0.020293493,-0.008351912,0.014760894,0.0005486686,0.00060876954,-0.0047991653,-0.0077851117,-0.02290785,0.0027922757,-0.006146774,0.0065253363,0.00716645,-0.0026462423,-0.008098159,
-0.024161408,0.023913892,-0.00785068,0.0052836025,0.00664978,-0.0065762945,0.024363413,-0.002159467,-0.0011097556,-0.012344767,0.0024406349,-0.008407793,0.022954395,0.015721507,0.0017387122,0.009913714,0.024355175,0.014192859,-0.009281635,0.018508183,-0.01766855,0.010587121,-0.02399014,-0.0051164906,0.004460979,0.016546631,0.0046114237,0.01816311,0.0045794593,0.014472914,-0.009899617,-0.001177596,-0.0047862064,-0.0004768393,0.00561231,0.028583817,0.020558525,-0.03159108,0.017100694,0.046495963,-0.019062879,-0.00039581425,-0.017242867,0.00046937418,0.009983893,0.00512868,0.029415479,-0.011947946,-0.019287745,0.0125215165,0.0011326354,-0.0016160473,-0.019695971,-0.00041471387,-0.027492415,-0.0016968073,-0.020757513,-0.022762263,-9.025949e-05,0.00889054,0.0041048676,-0.0036944274,-0.020119693,0.00337358,0.008124259,0.034515247,-0.023615949,-0.00015388707,0.0008114832,0.020163037,-0.015480202,-0.037004493,-0.0146250175,-0.0011214274,0.0007036585,-0.018834773,0.020080362,0.0029374545,0.013576074,0.00771829,-0.017118083,0.02445456,0.02944749,-0.023057323,-0.003992545,0.026229467,0.024175601,-0.023230355,0.02098318,0.008048088,0.02441414,0.033855412,-0.010782746,0.028302,-0.012630299,0.00093255844,-0.0070659462,-0.025429321,0.004746365,-0.026183257,-0.00824593,-0.014454927,0.0071811727,-0.00037542597,0.014774252,0.0020738598,0.0032820073,-0.0017513112,0.009585066,0.006113908,-0.012801953,0.01190389,0.0009375824,0.025296578,-0.020724915,-0.0054421634,0.045179054,-0.04048266,0.008526464,0.009112237,0.01949458,0.011232686,-0.03386269,-0.042305406,0.014025743,0.018730376,0.023059895,0.011923431,0.011065274,0.0073587326,0.023019757,-0.0059546703,0.020670421,-0.026029583,-0.02042201,0.004177398,0.008308868,-0.007784453,0.03079342,-0.0433191,0.009153491,-0.009151953,-0.020658989,0.01810527,-0.007957698,-0.021124935,0.020667745,0.00861881,0.0032235256,0.011232139,0.047708064,0.0109915985,-0.020189794,-0.0040409295,0.060902327,-0.0020784757,-0.028724493,0.011430798,-0.015502017,-0.
014514231,-0.027352305,0.0007144854,-0.003822461,0.017303223,0.0091512315,-0.01506194,-0.027060006,0.0076456103,0.0074398243,-0.027170852,0.0073934277,-0.011204748,-0.006776328,-0.013431131,-0.010181055,0.028289896,-0.015214167,0.008868608,0.044079747,0.023504198,-0.037170332,-0.005700317,0.0015231842,-0.016658366,-0.0062276605,-0.033908382,-0.034530792,0.012844278,-0.003971215,-0.0022648657,0.019460572,-0.010986397,0.034506805,0.008813994,-0.006328905,-0.025401615,-0.0017492286,0.027059393,0.006497928,0.0021375276,-0.042135067,-0.01979272,0.03341397,-0.011626559,0.0006219359,0.031591393,-0.007413258,-0.01268105,0.023325248,-0.016381448,0.02041664,0.02920056,-0.02959312,-0.00988205,-0.03012178,-0.026494235,0.008364069,0.02714259,0.019631408,-0.0071576857,0.015357504,0.008675622,0.017675633,-0.0273636,-0.0055310023,0.011513369,0.012282903,0.02318696,0.022129629,-0.0073256255,-0.021896586,0.0068970625,-0.00059139467,0.0097158,0.023409693,-0.021675194,-0.0029539242,0.01011932,0.014797221,-0.0011985564,0.02346008,-0.0060201953,0.021518808,-0.009199138,-0.0014942326,0.0016817756,-0.023164455,0.0037209848,-0.010276099,-0.03656412,0.016525732,0.0042766524,0.0015985463,-0.028479906,0.023018235,-0.012466197,0.0020239921,-0.0060739466,-0.008555216,0.02570347,0.016830679,0.023325214,0.012585816,-0.0063414597,0.013667094,0.021655854,0.012382385,0.0038584783,0.045405336,0.0040216404,-0.010543246,-0.01120136,-0.006587424,-0.00022058211,-0.009946049,0.018145246,-0.01424949,-0.0047170445,-0.017964493,-0.022926485,-0.021667464,-0.036341034,-0.016161386,-0.024726527,0.022489594,-0.012187199,-0.019639434,-0.01438989,-0.01276997,-0.009153722,0.034223773,0.024699021,-0.024758276,0.007825539,-0.022192342,-0.01576377,-0.011254968,0.034462575,0.0064119925,0.02487918,0.013555184,-0.0053355265,-0.038938634,-0.014052912,-0.017593492,-0.025837695,-0.004769462,-0.025693843,-0.018712644,-0.022941338,0.032868795,-0.006467014,-0.0031868275,-0.01372428,-0.020945886,-0.03069813,-0.0049131466,-0.0163
53775,0.00078686257,0.02674776,0.0039317342,0.016354673,0.01994852,0.018908272,0.015726397,0.019100675,0.024639845,0.009194316,-0.012309514,-0.005148623,-0.011077965,-0.013400285,0.01820505,-0.0024406423,0.003110628,-0.014335714,0.0023203213,0.03310389,0.011898669,0.002992883,-0.0021243717,-0.018941576,-0.005704676,-0.022058077,-0.02355527,-0.010349678,0.019003222,-0.026499053,0.010129054,0.0248238,-0.01657279,-0.013254677,-0.024373332,-0.0043926514,0.008019574,0.003711339,-0.0063484064,-0.034026925,0.0048769754,0.03034252,0.03434944,-0.019772725,0.020814892,0.0068209125,0.01860702,-0.001979395,-0.02532776,-0.031331565,-0.0013728734,-0.020157602,0.0060541034,-0.014863884,-0.016691646,0.0037999658,0.0011956514,0.018102052,0.014414435,0.020785687,-0.018947141,-0.0020446505,-0.00070603285,0.005216079,-0.02835279,0.006525217,0.021785561,0.009130931,-0.005832155,0.0030245965,-0.018213455,-0.0031457313,-0.012309399,0.022195956,-0.08612243,0.014423386,0.019135244,-0.003920734,0.012043057,0.013783587,-0.0057117874,-0.0052003465,0.006193037,-0.007936215,0.021537822,-0.0026563467,0.0040521263,0.03548617,-0.0045737443,-0.027705131,-0.020547105,-0.03510352,0.021176105,-0.0073027913,0.018597586,0.009489117,0.016628878,0.011380726,0.0040421207,-0.015419634,0.0008275749,0.010906831,0.0053660097,-0.018900383,-0.028133878,-0.010568032,-0.0039092056,0.016275495,-0.0034167045,-0.010742195,0.013787646,-0.025633374,-0.002018949,-0.004846128,0.007915824,0.01988333,-0.032990042,-0.019747222,-0.015161948,0.005105325,-0.04205812,0.027000029,0.00039111075,0.00031898843,-0.0041311188,-0.0077008456,-0.013500751,-0.012296777,-0.011799944,-0.008739471,-0.026046226,0.0049182493,-0.012831625,-0.0011323714,-0.00480851,-0.017008897,-0.0027603973,0.03967777,-0.038290046,-0.0025477482,-0.0006455856,0.019101348,0.027105935,0.032157913,-0.00021031204,-0.051488142,0.0037357784,0.003533395,-0.0055875136,0.0072476175,-0.009908115,-0.016624244,-0.016887782,0.011533721,-0.051299077,-0.013800102,-0.06547733,-
0.005499428,-0.011233522,0.000978876,0.040438097,-0.01649372,0.0006083945,-0.011561514,0.0076890993,-0.0076700663,-0.006148436,0.0039392603,-0.019276852,-0.011207855,0.0050226552,0.033585183,-0.0015640837,-0.004460179,-0.015299879,-0.023840556,0.012358655,-0.03648494,-0.025482614,-0.014601344,-0.022768268,0.021392044,-0.012967519,0.0025178746,-0.0019330949,-0.01716228,0.011871932,-0.15813892,-0.015887713,0.0057612127,0.026568703,0.004434132,-0.016389955,0.007379633,0.017900445,0.029702274,-0.006316502,-0.010474193,-0.04318934,-0.012441105,-0.0020567677,0.006523166,0.089150324,-0.007113918,0.008221631,-0.0012443673,-0.030416014,0.008269983,-0.024665175,-0.027744291,0.0072598574,0.0064435555,0.0031001272,0.01954699,-0.021186814,-0.007798105,0.026592057,-0.0026667316,-0.024104597,-0.0012435266,0.018176673,-0.00702839,-0.0032342891,-0.011337838,-0.0043749744,0.011435696,0.0024643813,0.01743431,0.029634077,0.02115506,0.021058733,-0.015722044,-0.018008618,0.0018111635,-6.7729474e-05,0.010261251,0.007158595,-0.013267301,-0.06368492,0.0108813895,0.00014472651,0.011898907,0.01707873,-0.032106034,0.022175016,0.0041925106,0.038321555,0.013695178,-0.017765632,0.016529549,0.008566233,-0.011890597,-0.014287907,0.009043042,0.0051946416,-0.005236254,-0.005263919,0.012046454,0.0044002444,0.018459724,0.012321888,-0.011177362,-0.023019327,0.032706838,0.0013786048,0.008166657,-0.031112006,0.009172757,0.03598535,0.017167622,0.019214159,0.023108449,0.006372788,-0.0047234446,-0.004144004,0.006693124,-0.022147238,0.015010343,0.020069886,-0.017297702,0.012715758,-0.016782606,0.008219533,-0.0042536063,-0.0009371836,0.0015794616,-0.0031355238,-0.025427338,-0.019362168,0.014914504,-0.032870788,0.003722777,-0.015214106,0.005347069,0.015031892,0.058415174,-0.0012096366]",{"tags":37,"relatedLang":46,"relatedPosts":50},[38,40,41,43,45],{"name":14,"slug":39},"lora",{"name":17,"slug":17},{"name":13,"slug":42},"peft",{"name":16,"slug":44},"llms",{"name":15,"slug":15},{"id":27,"slug":47,"title":48,"la
nguage":49},"peft-bench-fine-tuning-methods-benchmark-zh","PEFT-Bench 讓微調比較更公平","zh",[51,57,63,69,75,81],{"id":52,"slug":53,"title":54,"cover_image":55,"image_url":55,"created_at":56,"category":26},"180a8696-ada6-43c3-ac47-5b6cea8e0b31","confident-ai-llm-evaluation-metrics-guide-en","Confident AI’s guide to LLM evaluation metrics","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779178451812-i778.png","2026-05-19T08:13:46.826703+00:00",{"id":58,"slug":59,"title":60,"cover_image":61,"image_url":61,"created_at":62,"category":26},"576ffe2e-a54b-4030-84ea-8cc6eeb4f76f","code-becomes-the-agent-harness-en","Code Becomes the Agent Harness","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779173049719-vnmy.png","2026-05-19T06:43:30.92356+00:00",{"id":64,"slug":65,"title":66,"cover_image":67,"image_url":67,"created_at":68,"category":26},"3440bae8-d711-472c-8861-ef8ea63d39e8","rrfp-readiness-driven-pipeline-training-en","RRFP Makes Pipeline Training Follow Readiness","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779172447258-y1kc.png","2026-05-19T06:33:32.339315+00:00",{"id":70,"slug":71,"title":72,"cover_image":73,"image_url":73,"created_at":74,"category":26},"f15bbb27-837c-4841-9460-5c68d705e883","dashattention-differentiable-adaptive-sparse-attention-en","DashAttention makes sparse long-context attention differentiable","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779171841715-ussc.png","2026-05-19T06:23:34.566629+00:00",{"id":76,"slug":77,"title":78,"cover_image":79,"image_url":79,"created_at":80,"category":26},"074e9712-fc88-42c7-a98b-06e2571e6811","ibm-prompt-guide-turns-ai-guesses-into-outputs-en","IBM’s prompt guide turns AI guesses into 
outputs","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779132863560-rka7.png","2026-05-18T19:33:57.155747+00:00",{"id":82,"slug":83,"title":84,"cover_image":85,"image_url":85,"created_at":86,"category":26},"653c628b-7930-4183-9dbc-8e50cf85c479","cattle-trade-llm-bluffing-bargaining-benchmark-en","Cattle Trade benchmarks LLM bluffing and bargaining","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779085436536-nesm.png","2026-05-18T06:23:28.591525+00:00",[88,93,98,103,108,113,118,123,128,133],{"id":89,"slug":90,"title":91,"created_at":92},"a2715e72-1fe8-41b3-abb1-d0cf1f710189","ai-predictions-2026-big-changes-en","AI Predictions for 2026: Brace for Big Changes","2026-03-26T01:25:07.788356+00:00",{"id":94,"slug":95,"title":96,"created_at":97},"8404bd7b-4c2f-4109-9ec4-baf29d88af2b","ml-papers-of-the-week-github-research-desk-en","ML Papers of the Week Turns GitHub Into a Research Desk","2026-03-27T01:11:39.480259+00:00",{"id":99,"slug":100,"title":101,"created_at":102},"87897a94-8065-4464-a016-1f23e89e17cc","ai-ml-conferences-to-watch-in-2026-en","AI\u002FML Conferences to Watch in 2026","2026-03-27T01:51:54.184108+00:00",{"id":104,"slug":105,"title":106,"created_at":107},"6f1987cf-25f3-47a4-b3e6-db0997695be8","openclaw-agents-manipulated-self-sabotage-en","OpenClaw Agents Can Be Manipulated Into Failure","2026-03-28T03:03:18.899465+00:00",{"id":109,"slug":110,"title":111,"created_at":112},"a53571ad-735a-4178-9f93-cb09b699d99c","vega-driving-language-instructions-en","Vega: Driving with Natural Language Instructions","2026-03-28T14:54:04.698882+00:00",{"id":114,"slug":115,"title":116,"created_at":117},"a34581d6-f36e-46da-88bb-582fb3e7425c","personalizing-autonomous-driving-styles-en","Drive My Way: Personalizing Autonomous Driving 
Styles","2026-03-28T14:54:26.148181+00:00",{"id":119,"slug":120,"title":121,"created_at":122},"2bc1ad7f-26ce-4f02-9885-803b35fd229d","training-knowledge-bases-writeback-rag-en","Training Knowledge Bases with WriteBack-RAG","2026-03-28T14:54:45.643433+00:00",{"id":124,"slug":125,"title":126,"created_at":127},"71adc507-3c54-4605-bbe2-c966acd6187e","packforcing-long-video-generation-en","PackForcing: Efficient Long-Video Generation Method","2026-03-28T14:55:02.646943+00:00",{"id":129,"slug":130,"title":131,"created_at":132},"675942ef-b9ec-4c5f-a997-381250b6eacb","pixelsmile-facial-expression-editing-en","PixelSmile Framework Enhances Facial Expression Editing","2026-03-28T14:55:20.633463+00:00",{"id":134,"slug":135,"title":136,"created_at":137},"6954fa2b-8b66-4839-884b-e46f89fa1bc3","adaptive-block-scaled-data-types-en","IF4: Smarter 4-Bit Quantization That Adapts to Your Data","2026-03-31T06:00:36.65963+00:00"]