Model Releases

Latest AI model releases, benchmarks, and comparisons. Stay up to date with every new model launch from OpenAI, Anthropic, Google, Meta, and more.

MiniMax-M1 brings 1M-token open reasoning model
May 15

MiniMax released M1, an open-source reasoning model with 1M-token context, 80k output, and low-cost API pricing.

Gemini Omni Video Review: Text Rendering Beats Rivals
May 15

Gemini Omni’s leaked tests show sharp text rendering and in-chat editing, but quota limits and safety filters may slow adoption.

Why Xiaomi’s MiMo-V2.5-Pro Changes Coding Agents More Than Chatbots
May 14

MiMo-V2.5-Pro matters because it is built for long, tool-heavy coding work, not chat.

OpenAI’s Realtime Audio Models Target Live Voice
May 11

OpenAI’s new realtime audio models aim at live translation, transcription, and voice agents for developers and creators.

Anthropic Releases 10 Financial AI Agents
May 10

Anthropic released 10 prebuilt financial AI agents and launched Claude Opus 4.7, highlighting its performance on financial tasks.

Why Claude’s “Infinite” Context Window Still Won’t Make AI Autonomous
May 10

Claude’s new context, coordination, and infrastructure upgrades are real, but they do not make AI autonomous.

Why Midjourney 8.1 Raw Mode Is Better Than Default Style
May 8

Midjourney 8.1 raw mode is the right default for serious image work because it gives tighter prompt control and less stylized drift.

Why Kimi K2.5 Changes the Open-Source Agent Race
May 5

Kimi K2.5 makes open-source agents matter by pairing multimodal reasoning with tool-heavy execution.

AWS details RFT with LLM-as-a-judge for Nova
May 5

AWS outlines reinforcement fine-tuning with LLM-as-a-judge, plus a legal contract review case study using Amazon Nova and SageMaker AI.

Kimi K2.6 Brings 256K Context to API Users
May 4

Kimi K2.6 adds 256K context, multimodal input, and stronger coding for developers using the Kimi API Platform.

Kimi K2.6 and Qwen 3.6 Narrow the Gap
May 4

Kimi K2.6 and Qwen 3.6 are open-weight models that now rival closed models on coding and agent tasks.

Kimi K2.6 Scores: BenchLM’s 2026 Breakdown
May 4

Kimi K2.6 ranks #12 overall on BenchLM, with strong coding and agentic scores, plus a 256K context window and open weights.

Gemini is coming to millions of cars
May 4

Google is rolling Gemini into Google built-in cars, starting in the U.S. with English support and wider rollout over time.

AI Models in 2026: Which One to Use
May 4

Gemini 3.1 Pro leads reasoning, Claude writes best, and Grok tops some coding tests, so the right pick depends on the task.

Qwen3.6-27B opens a smaller, sharper path to coding
Apr 27

Qwen3.6-27B is a 27B dense multimodal model that beats Qwen3.5-397B-A17B on key coding benchmarks while staying easier to deploy.

OpenAI’s ChatGPT Images 2.0 lands with sharper edits
Apr 24

OpenAI quietly shipped ChatGPT Images 2.0, and early tests show stronger edits, cleaner text, and faster image workflows for creators.

Anthropic’s Mythos Model Triggers Security Panic
Apr 24

Anthropic’s Mythos reportedly finds software flaws fast enough to worry governments, banks, and grid operators worldwide.

Claude Opus 4.7 Released: Better at Getting Work Done
Apr 22

Anthropic released Claude Opus 4.7, which is stronger at long-running tasks, visual understanding, and coding workflows, but also consumes more tokens.

Qwen3.6-35B-A3B: 35B Open Source Model Release
Apr 20

Qwen3.6-35B-A3B ships with 35B total params, 3B active params, and Anthropic API compatibility for Claude Code workflows.

Claude Design Launches: Anthropic's AI Design Tool Enters Beta
Apr 19

Anthropic Labs launched Claude Design on April 17, letting users generate prototypes, slides, one-pagers, and marketing collateral through conversation. Powered by Opus 4.7 and available to Pro, Max, Team, and Enterprise subscribers as a research preview, Claude Design reads a team's codebase and design files during onboarding to auto-build a design system of colors, typography, and components that every subsequent project inherits.

Gemini Latest Updates Roundup: Mac App, Deep Research, and Canvas Enhancements
Apr 18

Gemini added a Mac app, Deep Research improvements, and Canvas extensions. The free tier for students has also expanded, opening up many more uses at once.

Linux 7.0 lands with Rust and AI-finding bugs
Apr 17

Linux 7.0 ships with official Rust support, fresh CPU work, and Torvalds saying AI tools may keep finding kernel bugs.

OpenAI Limits GPT-5.4-Cyber to Trusted Firms
Apr 16

OpenAI is limiting GPT-5.4-Cyber to vetted partners as it pushes AI deeper into security testing and dual-use risk management.

OpenAI launches GPT-5.4-Cyber for defense work
Apr 15

OpenAI's GPT-5.4-Cyber targets defensive security tasks after Anthropic's Mythos debut, tightening the race for AI-powered cyber tools.

GPT-5.4 Scores 97.6 in Knowledge Benchmarks
Apr 13

GPT-5.4 tops knowledge benchmarks with 97.6, ranks #2 overall on BenchLM, and posts a 1.05M-token context window.

Claude Mythos Preview Tops GPT-5.4 on Key Benchmarks
Apr 13

Anthropic’s unreleased Mythos Preview beats GPT-5.4 and Gemini 3.1 Pro on coding, math, and agent tests, led by 97.6% on USAMO.

OpenAI Revenue, Valuation, and Funding in 2026
Apr 12

OpenAI says annualized revenue hit $25B in Feb. 2026, with an $852B valuation after a $122B round and IPO prep underway.

Kimi K2.5: Moonshot’s Open Model Joins the Elite
Apr 4

Moonshot AI’s Kimi K2.5 launched on Jan. 27, 2026, with 256K context, Agent Swarm, and benchmark results that challenge GPT-5.4.

Cursor 3 Workspace: Multi-Agent Workflow in One Place
Apr 4

Cursor 3 moves agent work into one workspace, with parallel sessions, cloud handoff, and a new path from commit to PR.

Gemma 4 lands on Google Cloud
Apr 4

Google Cloud brings Gemma 4 to Vertex AI, Cloud Run, GKE, and TPUs, with 256K context, vision, audio, and Apache 2.0 licensing.
