LLM Model Comparison 2026

A filterable comparison of mainstream models covering context window, pricing, Arena scores, and coding and reasoning performance.

Providers covered: 10
Open-source models: 9
Max context: 10M

26 models
| Model | Provider | Released | Context | Price ($/1M) | Arena ELO | Coding | Reasoning | Speed | License |
|---|---|---|---|---|---|---|---|---|---|
| GPT-5.4 (unifies Codex + GPT; 1M context; built-in computer use) | OpenAI | 2026-03 | 1M | $3/$15 | 1560 | unknown | unknown | ~50 t/s | Closed |
| Claude Opus 4.6 (#1 Arena Hard Prompts & Coding; 128K max output) | Anthropic | 2026-02 | 1M | $5/$25 | 1549 | 80.8% SWE-bench | 65.4% Terminal-Bench | ~40 t/s | Closed |
| DeepSeek R1 (671B MoE, 37B active; MIT license; distilled variants available) | DeepSeek | 2025-01 | 128K | $0.55/$2.19 | 1500 | unknown | #1 Math & Coding Arena | ~45 t/s | Open |
| Gemini 3.1 Flash Lite (#3 Arena overall; #1 creative writing; ultra-fast) | Google | 2026-03 | 1M | $0.10/$0.40 | 1492 | unknown | unknown | ~200 t/s | Closed |
| Gemini 2.5 Pro (thinking model; top WebDev Arena at 1415; native multimodal) | Google | 2025-03 | 1M | $1.25/$10 | 1470 | 75.6% LiveCodeBench | 84.6% GPQA Diamond | ~60 t/s | Closed |
| Claude Sonnet 4.6 (best-value frontier; beats Opus 4.5 in 59% of head-to-head matchups) | Anthropic | 2026-02 | 1M | $3/$15 | 1440 | 79.6% SWE-bench | 72.5% OSWorld | ~80 t/s | Closed |
| Qwen 3 235B (235B MoE, 22B active; Apache 2.0; strongest OSS competitive programming) | Alibaba | 2025-04 | 128K | $0.86/$2 | 1422 | 70.7% LiveCodeBench | 2056 CodeForces ELO | ~65 t/s | Open |
| Mistral Large 3 (675B MoE, 41B active; Apache 2.0; best cost-efficiency frontier) | Mistral | 2025-12 | 256K | $0.5/$1.5 | 1418 | unknown | 43.9% GPQA Diamond | ~70 t/s | Open |
| o3 (strongest OpenAI reasoning model) | OpenAI | 2025-04 | 200K | $10/$40 | 1390 | unknown | unknown | ~30 t/s | Closed |
| Claude Opus 4.5 (major price cut from Opus 4; strong agentic coding) | Anthropic | 2025-11 | 200K | $5/$25 | 1380 | 80.9% SWE-bench | unknown | ~35 t/s | Closed |
| Kimi K2 (1T params; Agent Swarm, 100 agents; Modified MIT) | Moonshot | 2025-07 | 128K | $0.55/$2.2 | 1380 | 65.8% SWE-bench | 60.2% BrowseComp | ~50 t/s | Open |
| DeepSeek V3.2 (~90% of GPT-5.4 quality at 1/50th the cost; best-value model) | DeepSeek | 2026-02 | 128K | $0.28/$0.42 | 1380 | unknown | unknown | ~80 t/s | Open |
| Grok 3 (strong math/science; now legacy since the Grok 4 series launched) | xAI | 2025-02 | 131K | $3/$15 | 1370 | unknown | 93.3% AIME 2025 | ~55 t/s | Closed |
| GPT-4o (legacy but still available; superseded by the GPT-5 family) | OpenAI | 2024-05 | 128K | $2.5/$10 | 1340 | 30.8% SWE-bench | unknown | ~100 t/s | Closed |
| Grok 4 (top-5 Arena; strong reasoning and real-time X data) | xAI | 2026-01 | 256K | $5/$25 | 1340 | unknown | unknown | ~45 t/s | Closed |
| Gemini 2.5 Flash (cheapest frontier model at scale) | Google | 2025-03 | 1M | $0.30/$2.50 | 1330 | unknown | unknown | ~150 t/s | Closed |
| Claude Haiku 4.5 (fastest Claude, cheapest tier) | Anthropic | 2025-10 | 200K | $0.8/$4 | 1290 | unknown | unknown | ~120 t/s | Closed |
| Claude Opus 4.7 (new tokenizer that inflates token counts by 35-45%) | Anthropic | N/A | 128K | N/A | N/A | unknown | unknown | N/A | Closed |
| GPT-5.5 (OpenAI's latest model; advanced capabilities for coding and complex tasks) | OpenAI | N/A | 128K | N/A | N/A | unknown | unknown | N/A | Closed |
| Llama 4 Scout (10M context, an industry record; 109B MoE, 17B active) | Meta | 2025-04 | 10M | $0.08/$0.3 | N/A | unknown | unknown | ~90 t/s | Open |
| Muse Spark (reasoning using over an order of magnitude less compute than Llama 4 Maverick) | Meta | N/A | N/A | N/A | N/A | unknown | unknown | N/A | Closed |
| MiMo V2 (free coding model; 256K context; open weights) | Xiaomi | 2026-02 | 256K | Free | N/A | unknown | unknown | ~70 t/s | Open |
| Gemini 3.2 Flash (faster than Gemini 3.1 Pro with improved performance) | Google | N/A | 128K | N/A | N/A | unknown | unknown | N/A | Closed |
| Devstral 2 (cheapest agentic coding model; 256K context) | Mistral | 2026-01 | 256K | $0.05/$0.22 | N/A | unknown | unknown | ~100 t/s | Open |
| DeepSeek V4-Pro (significant price reduction; among the most cost-effective options on the market) | DeepSeek | N/A | 128K | N/A | N/A | unknown | unknown | N/A | Closed |
| Llama 4 Maverick (400B MoE, 17B active; strong multimodal; open weights) | Meta | 2025-04 | 1M | $0.15/$0.6 | N/A | unknown | unknown | ~60 t/s | Open |
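The page's filter and sort behavior over rows like the ones above can be sketched as follows. This is an illustrative Python sketch, not the site's actual code; it hardcodes a small subset of the table's rows, and the field names are assumptions.

```python
# Minimal sketch of a filterable/sortable model table, using a subset
# of the rows above. Field names and the Model type are illustrative.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Model:
    name: str
    provider: str
    input_price: float   # USD per 1M input tokens
    output_price: float  # USD per 1M output tokens
    arena_elo: Optional[int]
    open_source: bool

MODELS = [
    Model("GPT-5.4", "OpenAI", 3.00, 15.00, 1560, False),
    Model("DeepSeek R1", "DeepSeek", 0.55, 2.19, 1500, True),
    Model("Qwen 3 235B", "Alibaba", 0.86, 2.00, 1422, True),
    Model("Gemini 2.5 Flash", "Google", 0.30, 2.50, 1330, False),
]

# Filter to open-source models, then sort by Arena ELO descending.
open_models = sorted(
    (m for m in MODELS if m.open_source),
    key=lambda m: m.arena_elo or 0,
    reverse=True,
)
print([m.name for m in open_models])  # ['DeepSeek R1', 'Qwen 3 235B']
```

The same filter/sort pattern extends to any column (price, context size, license) by swapping the predicate and sort key.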
Color legend: Arena ELO 1380+ / Arena ELO 1350–1379 / Arena ELO <1350. "Best" marks the best value in that column.
Benchmark numbers are approximate and sourced from public leaderboards (LMSYS Chatbot Arena, official documentation). Prices are shown as input/output USD per million tokens. Speeds are estimated tokens/sec and vary by provider. Data is updated automatically from the database.
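The per-million-token pricing translates to a per-request cost as shown below. This is a sketch under the table's pricing convention; the token counts are illustrative, and the example uses the Claude Sonnet 4.6 row ($3 input / $15 output).

```python
# Sketch: estimating request cost from "$/1M tokens" pricing as listed
# in the table. Token counts here are illustrative, not from any model.
def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Cost in USD, with both prices quoted per 1M tokens."""
    return (input_tokens / 1_000_000) * input_price \
         + (output_tokens / 1_000_000) * output_price

# A 20K-token prompt with a 2K-token reply at $3/$15 per 1M tokens:
cost = request_cost(20_000, 2_000, 3.0, 15.0)
print(f"${cost:.3f}")  # $0.090
```

Note that output tokens usually cost several times more than input tokens, so long completions dominate the bill even when the prompt is much larger.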