[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"tag-llm-fine-tuning":3},{"tag":4,"articles":11},{"id":5,"name":6,"slug":7,"article_count":8,"description_zh":9,"description_en":10},"93aa15ea-c3f0-4f7d-a7c2-b22a81051ec1","LLM fine-tuning","llm-fine-tuning",3,"LLM 微調指的是在既有基礎模型上，透過監督式資料或強化學習調整模型行為，讓它更貼近特定任務與領域。這個主題涵蓋資料準備、訓練穩定性、評估與部署，例如 PPO 的替代方法、BPO\u002FGBPO，以及用 S3、SageMaker 和 MLflow 加速實作。","LLM fine-tuning covers the methods used to adapt a base model to a specific task or domain, from supervised training to RL-based alignment. It matters because stability, data pipelines, and tooling shape real outcomes; examples include BPO\u002FGBPO as PPO alternatives and AWS workflows with S3, SageMaker, and MLflow.",[12,21,28],{"id":13,"slug":14,"title":15,"summary":16,"category":17,"image_url":18,"cover_image":18,"language":19,"created_at":20},"346a0a80-82ae-4b5a-90fe-552ba3791de7","why-latent-agents-proves-internalized-debate-en","Why Latent Agents Proves Multi-Agent Debate Should Be Internalized","Latent Agents shows multi-agent debate works best when a single model internalizes it.","research","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1777944654721-4ftq.png","en","2026-05-05T01:30:23.124229+00:00",{"id":22,"slug":23,"title":24,"summary":25,"category":17,"image_url":26,"cover_image":26,"language":19,"created_at":27},"19f116fd-02dd-4a7d-9638-75a3bb70cae2","bounded-ratio-reinforcement-learning-ppo-en","Why Bounded Ratio RL Replaces PPO's Clipped Objective","BRRL gives PPO a cleaner theory, with BPO and GBPO aiming for more stable policy updates in control and LLM fine-tuning.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1776751796218-p4in.png","2026-04-21T06:09:40.318224+00:00",{"id":29,"slug":30,"title":31,"summary":32,"category":33,"image_url":34,"cover_image":34,"language":19,"created_at":35},"4a3e15ba-07e8-4e4d-b5c8-d9a46deea8bd","aws-s3-sagemaker-unified-studio-fine-tuning-en","AWS uses S3 to speed LLM fine-tuning","AWS shows how SageMaker Unified Studio, S3, and MLflow can fine-tune Llama 3.2 11B Vision Instruct on DocVQA data.","model-release","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1775139362238-r31j.png","2026-04-02T14:15:38.340988+00:00"]