How to Use Mistral OCR with Python

OraCore Editors

Back to home

[TOOLS] May 17, 20267 min readOraCore Editors

How to Use Mistral OCR with Python

Use Mistral OCR to extract structured text, images, and tables from PDFs and scans.

API key Python Mistral OCR Markdown OCR

Share LinkedIn

Use Mistral OCR to extract structured text, images, and tables from PDFs and scans.

This guide is for developers who want to turn PDFs, scans, and document images into structured output with Mistral OCR. After following the steps, you will have a working Python setup that can OCR a remote PDF, upload a local file, preserve layout in Markdown, and save embedded images for downstream parsing.

You will also know how to verify the output, estimate cost per page, and avoid the most common integration mistakes when moving from a demo to a production workflow.

Before you start

Get the latest AI news in your inbox

Weekly picks of model releases, tools, and deep dives — no spam, unsubscribe anytime.

No spam. Unsubscribe at any time.

Python 3.9+
pip 23+
A Mistral AI account on La Plateforme
A Mistral API key
Internet access for hosted OCR requests
Optional: a local PDF or JPG/PNG scan for testing
Optional: Git installed if you plan to commit a .env file and sample script

Install the official SDK and helper packages in a clean virtual environment so your OCR test runs do not conflict with other projects.

Step 1: Create a Python environment

Goal: build a clean runtime that can install the Mistral SDK without dependency conflicts. A virtual environment also makes it easy to reproduce the OCR setup later on another machine.

python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install mistralai python-dotenv datauri

Verification: run python -c "import mistralai; print('ok')". You should see ok, which confirms the SDK is installed and importable.

Step 2: Store your API key securely

Goal: create a credential flow that keeps your key out of source code. Mistral OCR is accessed through the hosted API, so the client must authenticate before any document can be processed.

cat > .env <<'EOF'
MISTRAL_API_KEY=your_api_key_here
EOF

cat > .gitignore <<'EOF'
.env
.venv
EOF

Verification: open the Mistral docs and confirm your API key is active in the console. You should also see .env excluded from Git status.

Step 3: Initialize the OCR client

Goal: load the API key from the environment and create an authenticated client object. This is the point where your script becomes ready to call the OCR endpoint.

from dotenv import load_dotenv
from mistralai import Mistral
import os

load_dotenv()
api_key = os.environ["MISTRAL_API_KEY"]
client = Mistral(api_key=api_key)
print("client-ready")

Verification: run the script and look for client-ready. If the environment variable is missing, you should fix the .env file before moving on.

Step 4: OCR a remote PDF

Goal: extract structured Markdown from a public PDF URL. This is the fastest way to confirm that Mistral OCR is working end to end, because you do not need to upload a file first.

ocr_response = client.ocr.process(
    model="mistral-ocr-latest",
    document={
        "type": "document_url",
        "document_url": "https://arxiv.org/pdf/2501.00663"
    }
)

print(len(ocr_response.pages))
print(ocr_response.pages[0].markdown[:800])

Verification: you should see a page count greater than zero and Markdown that starts with headings, paragraphs, or figure references. If the source PDF has multiple sections, those should appear in the extracted text rather than as one flat block.

Step 5: Upload a local file and save images

Goal: process a PDF that lives on your machine and preserve embedded figures. This step is important when your documents are private, not publicly hosted, or need to be handled from a local workflow.

from datauri import parse

uploaded = client.files.upload(
    file={"file_name": "report.pdf", "content": open("report.pdf", "rb")},
    purpose="ocr"
)
signed_url = client.files.get_signed_url(file_id=uploaded.id)

ocr_response = client.ocr.process(
    model="mistral-ocr-latest",
    document={"type": "document_url", "document_url": signed_url.url},
    include_image_base64=True
)

for page in ocr_response.pages:
    for img in page.images:
        data = parse(img.image_base64)
        with open(img.id, "wb") as f:
            f.write(data.data)

Verification: you should see one or more image files saved to disk, such as img-0.jpeg. The Markdown output should still include the surrounding text and image references, which confirms layout preservation.

Step 6: Parse image scans and compare output quality

Goal: run OCR on a scan or phone photo and check whether the result keeps headings, table structure, and readable text. This step helps you understand where OCR quality is strong enough for automation and where you may need cleanup logic.

ocr_response = client.ocr.process(
    model="mistral-ocr-latest",
    document={
        "type": "image_url",
        "image_url": "https://example.com/receipt.png"
    }
)

print(ocr_response.pages[0].markdown)

Verification: you should see text output that matches the visible document, not just a raw character dump. On invoices or receipts, line items and totals should remain readable enough to feed into a parser or extraction pipeline.

Metric	Before/Baseline	After/Result
Accuracy across diverse documents	83.4% with Google Document AI	~94.9% with Mistral OCR
Accuracy across diverse documents	89.5% with Azure OCR	~94.9% with Mistral OCR
Throughput	Typical single-document manual handling	Up to 2,000 pages per minute on a single GPU node
Pricing	Manual review cost varies by team	About $1 per 1,000 pages, or $0.001 per page
Request limits	Ad hoc file handling	Up to 50 MB or 1,000 pages per request

Common mistakes

Using a missing or expired API key. Fix: recheck MISTRAL_API_KEY in .env and regenerate the key in the Mistral console if needed.
Passing a local file path directly to document_url. Fix: upload the file first, then use the signed URL returned by client.files.get_signed_url().
Expecting OCR output to be plain text only. Fix: read the Markdown structure and image references, then decide whether to keep headings, tables, and figures in your downstream pipeline.

What's next

Once the basic OCR flow works, the next step is to add chunking, table extraction, and validation rules so your application can turn OCR output into searchable knowledge, invoice fields, or retrieval-ready content. If you are building a larger document pipeline, pair this guide with a vector store, a parser for Markdown tables, and a review step for low-confidence pages.

// Related Articles

How to Use Mistral OCR with Python

Before you start

Get the latest AI news in your inbox

Step 1: Create a Python environment

Step 2: Store your API key securely

Step 3: Initialize the OCR client

Step 4: OCR a remote PDF

Step 5: Upload a local file and save images

Step 6: Parse image scans and compare output quality

Common mistakes

What's next

How to Build Rust GPU Kernels with cuda-oxide

Vector Databases: How AWS Explains Them

How to Choose a Vector Database in 2026

Vibe Research: AI Tools for Faster Research

Why AWS’s repository-wide security scanner matters more than faster S…

Why Docker’s microVM sandboxes are the right move for AI agents