[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-5-turboquant-lessons-for-vector-search-teams-en":3,"article-related-5-turboquant-lessons-for-vector-search-teams-en":33,"series-industry-034b5552-6ad2-4a5f-960c-870f30d7be22":85},{"id":4,"slug":5,"title":6,"content":7,"summary":8,"source":9,"source_url":10,"author":11,"image_url":12,"cover_image":12,"category":13,"language":14,"translated_content":11,"related_article_id":15,"keywords":16,"key_takeaways":25,"views":29,"created_at":30,"published_at":31,"topic_cluster_id":32},"034b5552-6ad2-4a5f-960c-870f30d7be22","5-turboquant-lessons-for-vector-search-teams-en","5 TurboQuant lessons for vector search teams","\u003Cp data-speakable=\"summary\">\u003Ca href=\"\u002Ftag\u002Fturboquant\">TurboQuant\u003C\u002Fa> can cut vector memory while keeping search quality steadier than simpler quantizers.\u003C\u002Fp>\u003Cp>This guide turns one Qdrant experiment into five practical lessons, using a 1536-dimension embedding as the memory baseline.\u003C\u002Fp>\u003Ctable>\u003Cthead>\u003Ctr>\u003Cth>Item\u003C\u002Fth>\u003Cth>Compression\u003C\u002Fth>\u003Cth>Typical tradeoff\u003C\u002Fth>\u003C\u002Ftr>\u003C\u002Fthead>\u003Ctbody>\u003Ctr>\u003Ctd>Scalar quantization\u003C\u002Ftd>\u003Ctd>4x\u003C\u002Ftd>\u003Ctd>Small recall loss, easy to run\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Binary quantization\u003C\u002Ftd>\u003Ctd>32x\u003C\u002Ftd>\u003Ctd>Very low memory, higher instability\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>TurboQuant 4-bit\u003C\u002Ftd>\u003Ctd>8x\u003C\u002Ftd>\u003Ctd>Better geometry preservation than plain low-bit compression\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>TurboQuant 2-bit\u003C\u002Ftd>\u003Ctd>16x\u003C\u002Ftd>\u003Ctd>More storage savings, more accuracy risk\u003C\u002Ftd>\u003C\u002Ftr>\u003C\u002Ftbody>\u003C\u002Ftable>\u003Ch2>1. What quantization really buys you\u003C\u002Fh2>\u003Cp>Quantization is not just a storage trick. It changes how much vector data you can keep in memory, which matters fast once embeddings get large. A 1536-dimension float32 vector takes about 6 KB, so one million vectors can consume roughly 6 GB before you even talk about index overhead.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780157892244-w7me.png\" alt=\"5 TurboQuant lessons for vector search teams\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>The basic idea is simple: store fewer bits per value, accept some error, and hope retrieval quality stays good enough. Scalar quantization usually maps values into 256 bins and stores them as bytes, which gives about 4x compression. Push harder, and the savings rise while the chance of recall loss rises too.\u003C\u002Fp>\u003Cul>\u003Cli>Float32: highest fidelity, highest memory use\u003C\u002Fli>\u003Cli>Scalar: common default, moderate savings\u003C\u002Fli>\u003Cli>Binary: extreme compression, weakest shape preservation\u003C\u002Fli>\u003C\u002Ful>\u003Ch2>2. Why TurboQuant starts with rotation\u003C\u002Fh2>\u003Cp>TurboQuant changes the order of operations. Instead of compressing the vector as-is, it rotates the vector first so that signal gets spread more evenly across dimensions. That matters because many embeddings carry more useful information in some coordinates than others, and plain quantizers do not account for that unevenness.\u003C\u002Fp>\u003Cp>The rotation does not change distances by itself. It changes where the information sits, making the vector easier to compress without throwing away as much geometry. In Qdrant’s implementation, this is paired with a precomputed codebook and a scoring correction that helps offset the shrinkage introduced by quantization.\u003C\u002Fp>\u003Cul>\u003Cli>Rotation spreads energy across dimensions\u003C\u002Fli>\u003Cli>Quantization happens after the vector is easier to encode\u003C\u002Fli>\u003Cli>Length renormalization helps correct score bias\u003C\u002Fli>\u003C\u002Ful>\u003Ch2>3. Where TurboQuant beats plain low-bit compression\u003C\u002Fh2>\u003Cp>The strongest case for TurboQuant is not that it uses fewer bits than every other method. The stronger case is that it tends to spend those bits more intelligently. A rotated vector is less lopsided, so a compact code can preserve more useful structure than a direct low-bit mapping of the original coordinates.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780157892526-dr69.png\" alt=\"5 TurboQuant lessons for vector search teams\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>That makes TurboQuant appealing when you want a better balance of memory and recall than binary quantization, but do not want the tuning burden of product quantization. Qdrant’s 1.18 release also makes the feature easier to try in an existing collection, which lowers the cost of testing it in production-like settings.\u003C\u002Fp>\u003Cul>\u003Cli>Good fit: teams that want lower memory without a huge recall drop\u003C\u002Fli>\u003Cli>Good fit: workloads where vector geometry matters more than raw compression\u003C\u002Fli>\u003Cli>Less ideal: cases that already tolerate very aggressive quality loss\u003C\u002Fli>\u003C\u002Ful>\u003Ch2>4. What the bit depths mean in practice\u003C\u002Fh2>\u003Cp>TurboQuant is not one setting. Qdrant exposes several bit-depth options, including bits4, bits2, bits1.5, and bits1. Lower bit depth means stronger compression, but it also increases the chance that the encoded vector drifts away from the original one. That is the central tradeoff in the article’s experiments.\u003C\u002Fp>\u003Cp>For teams deciding where to start, 4-bit is the safest first test. It usually gives a meaningful space reduction while keeping the result closer to the original geometry than the more aggressive options. From there, you can step down only if your recall metrics still hold.\u003C\u002Fp>\u003Cul>\u003Cli>bits4: best first trial for most teams\u003C\u002Fli>\u003Cli>bits2: useful when memory pressure is stronger\u003C\u002Fli>\u003Cli>bits1.5 and bits1: only for very tight storage budgets\u003C\u002Fli>\u003C\u002Ful>\u003Ccode>client.create_collection(\n  collection_name=\"my_collection\",\n  vectors_config=models.VectorParams(size=1536, distance=models.Distance.COSINE),\n  quantization_config=models.TurboQuantization(\n    turbo=models.TurboQuantQuantizationConfig(\n      bits=models.TurboQuantBitSize.BITS4,\n      always_ram=True,\n    )\n  ),\n)\u003C\u002Fcode>\u003Ch2>5. What the benchmark question should be\u003C\u002Fh2>\u003Cp>The right \u003Ca href=\"\u002Ftag\u002Fbenchmark\">benchmark\u003C\u002Fa> is not “Which method compresses the most?” It is “Which method keeps recall stable enough for my data and query pattern?” That is why the article compares TurboQuant with scalar and binary quantization across multiple dataset sizes rather than treating one result as universal.\u003C\u002Fp>\u003Cp>If your vectors are small in number or your quality bar is strict, a conservative quantizer may still be the better default. If your index is \u003Ca href=\"\u002Fnews\u002F5-reasons-cursor-is-growing-so-fast-en\">growing fast\u003C\u002Fa> and you need more room in memory, TurboQuant is worth testing before you jump to harsher compression. The point is not to pick the most advanced option, but the option that keeps your search behavior predictable.\u003C\u002Fp>\u003Cul>\u003Cli>Benchmark recall at your own scale, not just on toy data\u003C\u002Fli>\u003Cli>Check whether score bias changes ranking behavior\u003C\u002Fli>\u003Cli>Compare memory savings against latency and quality together\u003C\u002Fli>\u003C\u002Ful>\u003Ch2>How to decide\u003C\u002Fh2>\u003Cp>Pick scalar quantization if you want a simple, familiar default with mild compression. Pick binary only if memory pressure is extreme and you can tolerate a larger quality hit. Pick TurboQuant when you want a middle path: stronger compression than scalar, but less instability than the most aggressive low-bit methods.\u003C\u002Fp>\u003Cp>If you are unsure, start with TurboQuant 4-bit on one collection, measure recall on your real queries, and only move lower if the numbers stay acceptable. That is the safest way to see whether it is a fit for your own vector search system.\u003C\u002Fp>","5 takeaways on Qdrant TurboQuant: how rotation changes compression, where recall holds up, and when safer quantizers fit better.","towardsdatascience.com","https:\u002F\u002Ftowardsdatascience.com\u002Fqdrant-turboquant-explained-is-turboquant-the-silver-bullet\u002F",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780157892244-w7me.png","industry","en","e4150272-a31a-45c4-b63c-91095bebfb82",[17,18,19,20,21,22,23,24],"Qdrant","TurboQuant","quantization","vector search","embedding compression","recall","binary quantization","scalar quantization",[26,27,28],"TurboQuant rotates vectors before compression, which helps preserve geometry better than plain low-bit encoding.","4-bit TurboQuant is the safest starting point for most teams testing memory savings.","The best choice depends on your recall target, dataset size, and how much instability you can tolerate.",1,"2026-05-30T16:17:39.721708+00:00","2026-05-30T16:17:39.713+00:00","d19fc184-5852-4c4d-9ec0-db0c4841ac17",{"tags":34,"relatedLang":44,"relatedPosts":48},[35,37,38,40,42],{"name":21,"slug":36},"embedding-compression",{"name":19,"slug":19},{"name":20,"slug":39},"vector-search",{"name":17,"slug":41},"qdrant",{"name":18,"slug":43},"turboquant",{"id":15,"slug":45,"title":46,"language":47},"5-turboquant-zh","5 個 TurboQuant 向量搜尋重點","zh",[49,55,61,67,73,79],{"id":50,"slug":51,"title":52,"cover_image":53,"image_url":53,"created_at":54,"category":13},"47702da7-3093-408a-90aa-9f5f461ccce9","openai-ipo-filing-turns-hype-into-scrutiny-en","OpenAI’s IPO filing turns hype into scrutiny","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781042611120-ynji.png","2026-06-09T22:03:05.09084+00:00",{"id":56,"slug":57,"title":58,"cover_image":59,"image_url":59,"created_at":60,"category":13},"619fab96-00b8-42f2-a3ff-13db32d6ac7b","skatteetaten-public-sector-ai-outcomes-en","Skatteetaten proves public sector AI should be judged by outcomes","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781038981764-h8ac.png","2026-06-09T21:02:32.623368+00:00",{"id":62,"slug":63,"title":64,"cover_image":65,"image_url":65,"created_at":66,"category":13},"45465fba-7f0e-4e19-979f-7902a8fc405a","openai-ipo-filing-wall-street-test-en","OpenAI’s IPO filing puts AI’s biggest test on Wall Street","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781032672165-bxm6.png","2026-06-09T19:17:23.738005+00:00",{"id":68,"slug":69,"title":70,"cover_image":71,"image_url":71,"created_at":72,"category":13},"bd36b287-03a0-46bf-b06d-661e82cb9cda","openai-latest-moves-pricing-safety-scale-en","OpenAI’s latest moves now center on pricing, safety, and scale","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781031776502-556w.png","2026-06-09T19:02:27.3401+00:00",{"id":74,"slug":75,"title":76,"cover_image":77,"image_url":77,"created_at":78,"category":13},"de1ca935-bcb1-48c5-901f-cc1ae841145b","risc-v-mini-pcs-worth-buying-now-future-bet-en","RISC-V mini PCs are worth buying now, but only as a bet on the future","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781026385311-ujek.png","2026-06-09T17:32:31.892173+00:00",{"id":80,"slug":81,"title":82,"cover_image":83,"image_url":83,"created_at":84,"category":13},"e57d8e32-a12b-45a9-bf9a-d58abecec3c0","fedora-44-risc-v-widens-linux-board-support-en","Fedora 44 RISC-V widens Linux board support","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781025488724-g6ma.png","2026-06-09T17:17:24.883927+00:00",[86,91,96,101,106,111,116,121,126,131],{"id":87,"slug":88,"title":89,"created_at":90},"d35a1bd9-e709-412e-a2df-392df1dc572a","ai-impact-2026-developments-market-en","AI's Impact in 2026: Key Developments and Market Shifts","2026-03-25T16:20:33.205823+00:00",{"id":92,"slug":93,"title":94,"created_at":95},"5ed27921-5fd6-492e-8c59-78393bf37710","trumps-ai-legislative-framework-en","Trump's AI Legislative Framework: What's Inside?","2026-03-25T16:22:20.005325+00:00",{"id":97,"slug":98,"title":99,"created_at":100},"e454a642-f03c-4794-b185-5f651aebbaca","nvidia-gtc-2026-key-highlights-innovations-en","NVIDIA GTC 2026: Key Highlights and Innovations","2026-03-25T16:22:47.882615+00:00",{"id":102,"slug":103,"title":104,"created_at":105},"0ebb5b16-774a-4922-945d-5f2ce1df5a6d","claude-usage-diversifies-learning-curves-en","Claude Usage Diversifies, Learning Curves Emerge","2026-03-25T16:25:50.770376+00:00",{"id":107,"slug":108,"title":109,"created_at":110},"69934e86-2fc5-4280-8223-7b917a48ace8","openclaw-ai-commoditization-concerns-en","OpenClaw's Rise Raises Concerns of AI Model Commoditization","2026-03-25T16:26:30.582047+00:00",{"id":112,"slug":113,"title":114,"created_at":115},"b4b2575b-2ac8-46b2-b90e-ab1d7c060797","google-gemini-ai-rollout-2026-en","Google's Gemini AI Rollout Extended to 2026","2026-03-25T16:28:14.808842+00:00",{"id":117,"slug":118,"title":119,"created_at":120},"6e18bc65-42ae-4ad0-b564-67d7f66b979e","meta-llama4-fabricated-results-scandal-en","Meta's Llama 4 Scandal: Fabricated AI Test Results Unveiled","2026-03-25T16:29:15.482836+00:00",{"id":122,"slug":123,"title":124,"created_at":125},"bf888e9d-08be-4f47-996c-7b24b5ab3500","accenture-mistral-ai-deployment-en","Accenture and Mistral AI Team Up for AI Deployment","2026-03-25T16:31:01.894655+00:00",{"id":127,"slug":128,"title":129,"created_at":130},"5382b536-fad2-49c6-ac85-9eb2bae49f35","mistral-ai-high-stakes-2026-en","Mistral AI: Facing High Stakes in 2026","2026-03-25T16:31:39.941974+00:00",{"id":132,"slug":133,"title":134,"created_at":135},"9da3d2d6-b669-4971-ba1d-17fdb3548ed5","cursors-meteoric-rise-pressures-en","Cursor's Meteoric Rise Faces Industry Pressures","2026-03-25T16:32:21.899217+00:00"]