What is a Context Window? (AI Glossary 2026)

Definition

The maximum number of tokens a model can process in a single call, counting both the input (prompt) and the output (completion). Larger windows allow a model to take in entire codebases, books, or long conversations at once. The limit is measured in tokens, not characters.
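Because the prompt and the completion share one limit, a long prompt leaves less room for output. A minimal sketch of that arithmetic follows; the function names and the 8,192-token window are illustrative assumptions, not values from any particular model.

```python
def remaining_output_budget(prompt_tokens: int, context_window: int) -> int:
    """Tokens left for the completion once the prompt has been counted."""
    if prompt_tokens < 0 or context_window <= 0:
        raise ValueError("token counts must be non-negative and the window positive")
    # The window covers input and output together, so subtract the prompt.
    return max(context_window - prompt_tokens, 0)


def fits_in_window(prompt_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """True if the prompt plus the requested completion fit in one call."""
    return prompt_tokens + max_output_tokens <= context_window


# Hypothetical numbers: an 8,192-token window and a 6,000-token prompt.
WINDOW = 8192
budget = remaining_output_budget(6000, WINDOW)
print(budget)
print(fits_in_window(6000, 2000, WINDOW))
print(fits_in_window(6000, 3000, WINDOW))
```

In practice the prompt's token count would come from the model's own tokenizer rather than from a character count.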

Related Terms

Tokenizer

The component that converts raw text into tokens (integer IDs) that the model processes. Most modern LLMs use Byte-Pair Encoding (BPE) or similar subword algorithms. The token count of a request determines its cost and must fit within the context window limit.
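To illustrate how text becomes integer IDs, here is a toy sketch. It uses greedy longest-match over a tiny hand-written vocabulary, which is a simplification and not real BPE; the vocabulary, the IDs, and the `encode` function are all hypothetical.

```python
# Hypothetical toy vocabulary: subword piece -> integer ID.
UNK_ID = 0  # ID used for any character the vocabulary does not cover
VOCAB = {" ": 1, "context": 2, "window": 3, "token": 4, "s": 5}


def encode(text: str, vocab: dict[str, int] = VOCAB, unk_id: int = UNK_ID) -> list[int]:
    """Split text into the longest known pieces, left to right, and return their IDs."""
    ids: list[int] = []
    max_len = max(len(piece) for piece in vocab)
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink until a piece matches.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in vocab:
                ids.append(vocab[piece])
                i += length
                break
        else:
            # No piece matched: emit the unknown ID and skip one character.
            ids.append(unk_id)
            i += 1
    return ids


text = "context windows"
ids = encode(text)
print(ids)                  # integer IDs the model would see
print(len(text), len(ids))  # character count vs. token count
```

The example makes the point that the token count (here 4) can differ sharply from the character count (here 15), which is why limits and pricing are quoted in tokens.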

Embedding

A dense numerical vector that represents text, images, or other data in a high-dimensional space where semantic similarity maps to geometric closeness. Foundation of semantic search, RAG systems, and recommendation engines.
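One common way to measure "geometric closeness" between embeddings is cosine similarity. The sketch below assumes that metric and uses made-up three-dimensional vectors; real embeddings come from a trained model and typically have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction, 0.0 orthogonal."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, hand-picked for illustration only.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
car = [0.0, 0.1, 0.9]

# Semantically related items should score higher than unrelated ones.
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, car))
```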

RAG (Retrieval-Augmented Generation)

An architecture that enhances LLM outputs by first retrieving relevant documents from a knowledge base (via vector search) and injecting them into the prompt. Grounds the model in external, up-to-date facts without requiring retraining.
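The retrieve-then-inject flow can be sketched in a few lines. This is a minimal illustration under stated assumptions: the in-memory `STORE`, its hand-written vectors, the query vector, and the prompt template are all hypothetical, and the final call to a model is left out.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Hypothetical knowledge base: (passage, precomputed embedding) pairs.
STORE = [
    ("The context window is measured in tokens.", [0.9, 0.1, 0.0]),
    ("Embeddings map text to vectors.", [0.1, 0.9, 0.0]),
    ("BPE is a subword tokenization algorithm.", [0.2, 0.1, 0.9]),
]


def retrieve(query_vec: list[float], store=STORE, k: int = 2) -> list[str]:
    """Return the k passages whose embeddings are closest to the query."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question: str, passages: list[str]) -> str:
    """Inject the retrieved passages into the prompt ahead of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"


# In a real system the query vector would come from the same embedding model
# that produced the stored vectors; here it is written by hand.
query_vec = [0.8, 0.2, 0.1]
passages = retrieve(query_vec, k=1)
print(build_prompt("How is the context window measured?", passages))
```

Since the retrieved passages are placed in the prompt, they count against the context window, so `k` and passage length are usually chosen with the token budget in mind.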

Articles about Context Window

Gemini 1.5 Pro-002, Flash-002 and 2.0 Flash update Google AI

MiniMax M3 Proves Open-Weight Can Still Win on Coding

MemDreamer tackles long-video overload

Open Source RAG Stack Turns Chaos Into a Build Plan

Gemini 3.5 Flash Pricing, Context, Benchmarks
