TurboVec: Rust vector index cuts 10M docs to 4GB

OraCore Editors

[TOOLS] | May 31, 2026 | 3 min read

TurboVec is a Rust vector index with Python bindings that fits a 10M-document corpus in 4 GB of RAM and adds filtered search, local-only RAG, and framework adapters.

Tags: Rust, vector search, FAISS, Python bindings, TurboQuant

10 million documents in 4 GB of RAM: that is the headline from RyanCodrai/turbovec, a Rust vector index with Python bindings built on Google Research’s TurboQuant method. The project says the same corpus needs 31 GB as float32, and that TurboVec searches it faster than FAISS in its published benchmarks.

Item	Value
Corpus size	10 million documents
RAM with float32	31 GB
RAM with TurboVec	4 GB
Repository stars	3.8k
Forks	347
Commits	144
Benchmark speedup on ARM	12–20% over FAISS FastScan
Benchmark result on x86	1–6% faster on 4-bit configs
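
The memory figures can be sanity-checked with simple arithmetic. The sketch below assumes 768-dimensional embeddings and 4 bits per dimension; neither number is stated for the headline claim, they are chosen here because they roughly reproduce the 31 GB and 4 GB figures. It counts only the stored codes and ignores any per-vector metadata the real index may keep.

```python
def index_size_gb(n_vectors: int, dim: int, bits_per_dim: float) -> float:
    """Storage for the vector codes alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    total_bits = n_vectors * dim * bits_per_dim
    return total_bits / 8 / 1e9


N_VECTORS = 10_000_000  # corpus size from the article
DIM = 768               # assumption: embedding dimension is not given in the article

float32_gb = index_size_gb(N_VECTORS, DIM, 32)  # uncompressed float32
four_bit_gb = index_size_gb(N_VECTORS, DIM, 4)  # assumption: 4 bits per dimension

print(f"float32: {float32_gb:.2f} GB")
print(f"4-bit:   {four_bit_gb:.2f} GB")
print(f"ratio:   {float32_gb / four_bit_gb:.0f}x")
```

Under these assumptions the uncompressed corpus comes to about 30.7 GB and the 4-bit codes to about 3.8 GB, an 8x reduction that lines up with the rounded numbers in the table.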

What changed

TurboVec packages TurboQuant as a local-first index that can ingest vectors online, skip training, and avoid rebuilds as the corpus grows. It exposes both a simple TurboQuantIndex and an IdMapIndex for stable external IDs, plus write/load persistence for Python and Rust users.
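
To illustrate why a separate ID-mapping index is useful, here is a minimal sketch of the general idea: callers address vectors by their own IDs while the index is free to reorder internal slots. The class `IdMapSketch` and its methods are hypothetical and are not TurboVec's API; the `remove` method in particular is an assumption added to show slots shifting, since the article does not say how the project handles deletion. Vectors are stored uncompressed to keep the example short.

```python
class IdMapSketch:
    """Hypothetical illustration: maps caller-chosen IDs to dense internal slots."""

    def __init__(self) -> None:
        self._slot_of = {}   # external ID -> internal slot
        self._id_at = []     # internal slot -> external ID
        self._vectors = []   # internal slot -> vector (uncompressed here)

    def add(self, ext_id, vector):
        # Online ingest: append to the next free slot, no training step.
        if ext_id in self._slot_of:
            raise KeyError(f"duplicate id: {ext_id}")
        self._slot_of[ext_id] = len(self._id_at)
        self._id_at.append(ext_id)
        self._vectors.append(list(vector))

    def remove(self, ext_id):
        # Assumed behaviour: swap-remove keeps slots dense, so slot numbers
        # can change while external IDs stay valid.
        slot = self._slot_of.pop(ext_id)
        last = len(self._id_at) - 1
        if slot != last:
            moved = self._id_at[last]
            self._id_at[slot] = moved
            self._vectors[slot] = self._vectors[last]
            self._slot_of[moved] = slot
        self._id_at.pop()
        self._vectors.pop()

    def search(self, query, k):
        # Brute-force dot product; results are reported as external IDs.
        scored = [
            (sum(q * v for q, v in zip(query, vec)), self._id_at[slot])
            for slot, vec in enumerate(self._vectors)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [ext_id for _, ext_id in scored[:k]]


idx = IdMapSketch()
idx.add("doc-a", [1.0, 0.0])
idx.add("doc-b", [0.0, 1.0])
idx.add("doc-c", [0.7, 0.7])
idx.remove("doc-a")  # "doc-c" moves into slot 0
top = idx.search([1.0, 0.2], k=2)
print(top)
```

After the removal, "doc-c" occupies a different slot than before, yet the caller still finds it under the same ID.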

The repo also adds filtered search inside the SIMD kernel. Users can pass an allowlist or slot bitmask at query time, and the index returns up to k results from only the allowed set. The project says this avoids over-fetching and preserves recall for selective filters.
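
The difference between filtering during the scan and filtering afterwards can be shown with a scalar sketch. This is not TurboVec's SIMD kernel or its API: the function names are hypothetical, scoring is a plain dot product over uncompressed vectors, and the bitmask is a Python integer where bit `i` marks slot `i` as allowed.

```python
import heapq


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def search_filtered(vectors, query, k, allowed_mask):
    """Top-k by dot product, skipping slots whose bit in allowed_mask is 0."""
    heap = []  # min-heap of (score, slot), never larger than k
    for slot, vec in enumerate(vectors):
        if not (allowed_mask >> slot) & 1:
            continue  # disallowed slots never compete for a top-k place
        score = dot(query, vec)
        if len(heap) < k:
            heapq.heappush(heap, (score, slot))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, slot))
    return [slot for _, slot in sorted(heap, reverse=True)]


def search_then_filter(vectors, query, k, allowed_mask):
    """Baseline: take the unfiltered top-k, then drop disallowed slots."""
    all_slots = (1 << len(vectors)) - 1
    top = search_filtered(vectors, query, k, all_slots)
    return [slot for slot in top if (allowed_mask >> slot) & 1]


vectors = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.0, 1.0]]
query = [1.0, 0.0]
mask = 0b11010  # slots 1, 3, and 4 are allowed

in_scan = search_filtered(vectors, query, 2, mask)
post = search_then_filter(vectors, query, 2, mask)
print("filtered during scan:", in_scan)
print("filtered afterwards: ", post)
```

With k=2, the in-scan version returns two allowed slots, while the post-filter baseline returns only one because a disallowed slot took a top-k place. A post-filter approach has to fetch more than k candidates to compensate, which is the over-fetching the project says it avoids.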

Rust core with Python bindings
Online ingest, no separate train phase
Filtered search with allowlists or bitmasks
Local-only use for air-gapped or VPC deployments
Adapters for LangChain, LlamaIndex, Haystack, and Agno

Why it matters

For developers building RAG systems, the pitch is lower memory use without handing data to a managed vector service. That makes the project relevant for privacy-sensitive apps, embedded deployments, and teams that need dense retrieval on modest hardware.

The benchmark claims are also aimed at a familiar comparison point. TurboVec says its hand-written NEON and AVX-512BW kernels beat FAISS IndexPQFastScan by 12–20% on ARM and by 1–6% on x86 in 4-bit configurations, which could make it an attractive drop-in for teams already using FAISS-style workflows.

The broader question is whether TurboVec’s compression and filtered-search path hold up across real production corpora, not just the repo’s benchmark sets. If they do, the project gives teams a cheaper way to keep vector search local and memory-light.
