Model Releases
Latest AI model releases, benchmarks, and comparisons. Stay up to date with every new model launch from OpenAI, Anthropic, Google, Meta, and more.

MiniMax-M1 brings 1M-token open reasoning model
MiniMax released M1, an open-source reasoning model with 1M-token context, 80k output, and low-cost API pricing.

Gemini Omni Video Review: Text Rendering Beats Rivals
Gemini Omni’s leaked tests show sharp text rendering and in-chat editing, but quota limits and safety filters may slow adoption.

Why Xiaomi’s MiMo-V2.5-Pro Changes Coding Agents More Than Chatbots
MiMo-V2.5-Pro matters because it is built for long, tool-heavy coding work, not chat.

OpenAI’s Realtime Audio Models Target Live Voice
OpenAI’s new realtime audio models aim at live translation, transcription, and voice agents for developers and creators.

Anthropic Releases 10 Financial AI Agents
Anthropic released 10 prebuilt AI agents for finance and launched Claude Opus 4.7, highlighting its performance on financial tasks.

Why Claude’s “Infinite” Context Window Still Won’t Make AI Autonomous
Claude’s new context, coordination, and infrastructure upgrades are real, but they do not make AI autonomous.

Why Midjourney 8.1 Raw Mode Is Better Than Default Style
Midjourney 8.1 raw mode is the right default for serious image work because it gives tighter prompt control and less stylized drift.

Why Kimi K2.5 Changes the Open-Source Agent Race
Kimi K2.5 makes open-source agents matter by pairing multimodal reasoning with tool-heavy execution.

AWS details RFT with LLM-as-a-judge for Nova
AWS outlines reinforcement fine-tuning with LLM-as-a-judge, plus a legal contract review case study using Amazon Nova and SageMaker AI.

Kimi K2.6 Brings 256K Context to API Users
Kimi K2.6 adds 256K context, multimodal input, and stronger coding for developers using the Kimi API Platform.

Kimi K2.6 and Qwen 3.6 Narrow the Gap
Kimi K2.6 and Qwen 3.6 are open-weight models that now rival closed models on coding and agent tasks.

Kimi K2.6 Scores: BenchLM’s 2026 Breakdown
Kimi K2.6 ranks #12 overall on BenchLM, with strong coding and agentic scores, plus a 256K context window and open weights.

Gemini is coming to millions of cars
Google is rolling Gemini into Google built-in cars, starting in the U.S. with English support and wider rollout over time.

AI Models in 2026: Which One to Use
Gemini 3.1 Pro leads reasoning, Claude writes best, and Grok tops some coding tests, so the right pick depends on the task.

Qwen3.6-27B opens a smaller, sharper path to coding
Qwen3.6-27B is a 27B dense multimodal model that beats Qwen3.5-397B-A17B on key coding benchmarks while staying easier to deploy.

OpenAI’s ChatGPT Images 2.0 lands with sharper edits
OpenAI quietly shipped ChatGPT Images 2.0, and early tests show stronger edits, cleaner text, and faster image workflows for creators.

Anthropic’s Mythos Model Triggers Security Panic
Anthropic’s Mythos reportedly finds software flaws fast enough to worry governments, banks, and grid operators worldwide.

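Claude Opus 4.7 Released: Better at Getting Work Done
Anthropic released Claude Opus 4.7, which is stronger at long-running tasks, visual understanding, and coding workflows, but also consumes more tokens.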

Qwen3.6-35B-A3B: 35B Open Source Model Release
Qwen3.6-35B-A3B ships with 35B total params, 3B active params, and Anthropic API compatibility for Claude Code workflows.

Claude Design Launches: Anthropic's AI Design Tool Enters Beta
Anthropic Labs launched Claude Design on April 17, letting users generate prototypes, slides, one-pagers, and marketing collateral through conversation. Powered by Opus 4.7 and available to Pro, Max, Team, and Enterprise subscribers as a research preview, Claude Design reads a team's codebase and design files during onboarding to auto-build a design system of colors, typography, and components that every subsequent project inherits.

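Gemini Latest Updates Roundup: Mac App, Deep Research, and Canvas Enhancements
Gemini added a Mac app, a stronger Deep Research, and an expanded Canvas. The free tier for students has also grown, sharply widening what it can be used for.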

Linux 7.0 lands with Rust support and AI bug-finding
Linux 7.0 ships with official Rust support, fresh CPU work, and Torvalds saying AI tools may keep finding kernel bugs.

OpenAI Limits GPT-5.4-Cyber to Trusted Firms
OpenAI is limiting GPT-5.4-Cyber to vetted partners as it pushes AI deeper into security testing and dual-use risk management.

OpenAI launches GPT-5.4-Cyber for defense work
OpenAI's GPT-5.4-Cyber targets defensive security tasks after Anthropic's Mythos debut, tightening the race for AI-powered cyber tools.

GPT-5.4 Scores 97.6 in Knowledge Benchmarks
GPT-5.4 tops knowledge benchmarks with 97.6, ranks #2 overall on BenchLM, and posts a 1.05M-token context window.

Claude Mythos Preview Tops GPT-5.4 on Key Benchmarks
Anthropic’s unreleased Mythos Preview beats GPT-5.4 and Gemini 3.1 Pro on coding, math, and agent tests, led by 97.6% on USAMO.

OpenAI Revenue, Valuation, and Funding in 2026
OpenAI says annualized revenue hit $25B in Feb. 2026, with an $852B valuation after a $122B round and IPO prep underway.

Kimi K2.5: Moonshot’s Open Model Joins the Elite
Moonshot AI’s Kimi K2.5 launched on Jan. 27, 2026, with 256K context, Agent Swarm, and benchmark results that challenge GPT-5.4.

Cursor 3 Workspace: Multi-Agent Workflow in One Place
Cursor 3 moves agent work into one workspace, with parallel sessions, cloud handoff, and a new path from commit to PR.

Gemma 4 lands on Google Cloud
Google Cloud brings Gemma 4 to Vertex AI, Cloud Run, GKE, and TPUs, with 256K context, vision, audio, and Apache 2.0 licensing.