[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"tag-vision-language":3},{"tag":4,"articles":10},{"id":5,"name":6,"slug":6,"article_count":7,"description_zh":8,"description_en":9},"fabc42ac-43e0-4090-a27c-92dcce597044","vision-language",3,"視覺語言模型把影像、文字與推理接到同一條管線，常見於圖文問答、偏好對齊與多模態 MoE。這個主題關注模型如何看懂畫面、選對專家並在任務規則下做出更穩定的判斷。","Vision-language models connect images, text, and reasoning in one pipeline, powering tasks like VQA, preference alignment, and multimodal MoE. This topic centers on how models interpret visuals, route to the right experts, and stay reliable under task-specific constraints.",[11,20],{"id":12,"slug":13,"title":14,"summary":15,"category":16,"image_url":17,"cover_image":17,"language":18,"created_at":19},"d3ac3e85-c296-4015-94f0-559222351ea3","rubric-based-dpo-visual-preference-tuning-zh","用 rubric 讓視覺偏好訓練更精準","rDPO 用每個圖文任務的專屬 rubric 取代粗粒度偏好訊號，讓視覺偏好最佳化更細緻，並在過濾與 benchmark 上帶來提升。","research","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1776233216658-4juh.png","zh","2026-04-15T06:06:32.083225+00:00",{"id":21,"slug":22,"title":23,"summary":24,"category":16,"image_url":25,"cover_image":25,"language":18,"created_at":26},"0234cd33-a3a2-4600-b529-3ac20153980f","multimodal-moe-routing-distraction-zh","多模態 MoE 為何會分心","這篇研究指出，多模態 MoE 不是只卡在看圖，而是路由把輸入送錯專家。作者提出 routing distraction，並用路由引導介入提升 domain expert 啟動與推理表現。","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1775801393563-1ajo.png","2026-04-10T06:09:34.33472+00:00"]