TurboVec: Rust vector index cuts 10M docs to 4GB
TurboVec is a Rust vector index with Python bindings that compresses 10M documents to 4GB RAM and adds filtered search, local-only RAG, and framework adapters.

10 million documents in 4 GB of RAM: that is the headline from RyanCodrai/turbovec, a Rust vector index with Python bindings built on Google Research’s TurboQuant method. The project says the same corpus needs 31 GB as float32, and that TurboVec searches it faster than FAISS in its published benchmarks.
| Metric | Value |
|---|---|
| Corpus size | 10 million documents |
| RAM with float32 | 31 GB |
| RAM with TurboVec | 4 GB |
| Repository stars | 3.8k |
| Forks | 347 |
| Commits | 144 |
| Benchmark speedup on ARM | 12–20% over FAISS FastScan |
| Benchmark result on x86 | 1–6% faster on 4-bit configs |
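
The two RAM figures can be sanity-checked with simple arithmetic. The sketch below assumes 768-dimensional embeddings and 4 bits per dimension after compression; neither value is stated in the article, and they are chosen only because they roughly reproduce the 31 GB and 4 GB figures. It ignores index metadata and per-vector overhead.

```python
def payload_size_gb(num_vectors: int, dims: int, bits_per_dim: int) -> float:
    """Raw vector payload in decimal gigabytes, ignoring any index overhead."""
    total_bits = num_vectors * dims * bits_per_dim
    return total_bits / 8 / 1e9


NUM_DOCS = 10_000_000
DIMS = 768  # assumption: not stated in the article

float32_gb = payload_size_gb(NUM_DOCS, DIMS, 32)   # 30.72
quantized_gb = payload_size_gb(NUM_DOCS, DIMS, 4)  # 3.84 (assumed 4-bit codes)

print(f"float32: {float32_gb:.2f} GB, 4-bit: {quantized_gb:.2f} GB")
```

Under these assumptions the compression ratio is 8x, which is simply 32 bits divided by 4 bits per dimension.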
What changed
TurboVec packages TurboQuant as a local-first index that can ingest vectors online, skip training, and avoid rebuilds as the corpus grows. It exposes both a simple TurboQuantIndex and an IdMapIndex for stable external IDs, plus write/load persistence for Python and Rust users.
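
To illustrate why an index can skip training, here is a toy sketch in which the quantization grid is fixed in advance, so each vector is encoded the moment it arrives. This is a plain uniform 4-bit scalar quantizer used as a stand-in; it is not TurboQuant and not TurboVec's API. The names `FixedGridIndex`, `quantize_4bit`, and `dequantize_4bit` are hypothetical.

```python
import math

LEVELS = 16  # 4 bits per dimension


def quantize_4bit(vec):
    """Map each coordinate in [-1, 1] to one of 16 fixed bins, two codes per byte."""
    codes = []
    for x in vec:
        x = max(-1.0, min(1.0, x))
        codes.append(min(LEVELS - 1, int((x + 1.0) / 2.0 * LEVELS)))
    if len(codes) % 2:
        codes.append(0)  # pad so codes pack evenly into bytes
    return bytes((codes[i] << 4) | codes[i + 1] for i in range(0, len(codes), 2))


def dequantize_4bit(packed, dims):
    """Recover approximate coordinates as the centre of each bin."""
    codes = []
    for b in packed:
        codes.extend((b >> 4, b & 0x0F))
    return [(c + 0.5) / LEVELS * 2.0 - 1.0 for c in codes[:dims]]


class FixedGridIndex:
    """Hypothetical toy index: the grid is data-independent, so there is no train step."""

    def __init__(self, dims):
        self.dims = dims
        self.rows = []  # one packed byte string per vector

    def add(self, vec):
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        self.rows.append(quantize_4bit([x / norm for x in vec]))
        return len(self.rows) - 1  # slot number of the new vector

    def search(self, query, k):
        scored = []
        for slot, packed in enumerate(self.rows):
            approx = dequantize_4bit(packed, self.dims)
            scored.append((sum(q * a for q, a in zip(query, approx)), slot))
        scored.sort(key=lambda t: (-t[0], t[1]))  # best score first, ties by slot
        return [slot for _, slot in scored[:k]]


index = FixedGridIndex(dims=4)
for v in ([1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]):
    index.add(v)  # usable immediately; nothing is rebuilt as the corpus grows
nearest = index.search([0, 1, 0, 0], k=1)
```

Because the grid never depends on the data, adding the millionth vector costs the same as adding the first. A real implementation would score directly on the packed codes with SIMD instead of decoding each row.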

The repo also adds filtered search inside the SIMD kernel. Users can pass an allowlist or slot bitmask at query time, and the index returns up to k results from only the allowed set. The project says this avoids over-fetching and preserves recall for selective filters.
- Rust core with Python bindings
- Online ingest, no separate train phase
- Filtered search with allowlists or bitmasks
- Local-only use for air-gapped or VPC deployments
- Adapters for LangChain, LlamaIndex, Haystack, and Agno
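
The over-fetching problem mentioned above can be shown with a small, library-independent sketch. It compares applying the filter during the scan with the common workaround of fetching extra results and filtering afterwards. The function names are hypothetical and this is not TurboVec's kernel; scores are precomputed here for brevity, whereas a real kernel would skip disallowed slots before scoring them.

```python
import heapq


def mask_from_allowlist(slots):
    """Build an integer bitmask with one bit set per allowed slot."""
    mask = 0
    for s in slots:
        mask |= 1 << s
    return mask


def filtered_top_k(scores, allowed_mask, k):
    """Filter during the scan: only allowed slots compete for the top k."""
    candidates = (
        (score, slot)
        for slot, score in enumerate(scores)
        if (allowed_mask >> slot) & 1
    )
    return [slot for _, slot in heapq.nlargest(k, candidates)]


def post_filter_top_k(scores, allowed_mask, k, fetch):
    """Workaround: take the global top `fetch`, then drop disallowed slots."""
    top = heapq.nlargest(fetch, ((score, slot) for slot, score in enumerate(scores)))
    return [slot for _, slot in top if (allowed_mask >> slot) & 1][:k]


scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
mask = mask_from_allowlist([5, 6, 7])  # a selective filter over low-scoring slots

in_scan = filtered_top_k(scores, mask, k=2)
after_scan = post_filter_top_k(scores, mask, k=2, fetch=4)
```

Here `in_scan` contains the two best allowed slots, while `after_scan` is empty because none of the global top four passes the filter. That gap is the recall loss that selective filters cause when filtering happens after the search.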
Why it matters
For developers building RAG systems, the pitch is lower memory use without handing data to a managed vector service. That makes the project relevant for privacy-sensitive apps, embedded deployments, and teams that need dense retrieval on modest hardware.

The benchmark claims also target a familiar comparison point. TurboVec says its hand-written NEON and AVX-512BW kernels beat FAISS IndexPQFastScan by 12–20% on ARM and come out 1–6% ahead on x86 in 4-bit configurations, which could make it an attractive option for teams already using FAISS-style workflows.
The broader question is whether TurboVec’s compression and filtered-search path hold up across real production corpora, not just the repo’s benchmark sets. If they do, the project gives teams a cheaper way to keep vector search local and memory-light.