Build an Agentic RAG system with LangGraph

OraCore Editors

[AGENT] June 15, 2026 | 5 min read

A modular LangGraph repo for building and learning Agentic RAG end to end.

Tags: LangGraph, Ollama, Qdrant, RAGAS, agentic RAG

This guide is for developers who want to turn a basic retrieval app into an agentic RAG system with query clarification, conversation memory, and multi-step reasoning. By the end, you will have a local project that can ingest PDFs, build a hybrid vector index, run LangGraph workflows, and answer questions with a configurable LLM provider.

The repository also doubles as a learning path, so you can start with the notebook flow and then move into the modular app when you are ready to adapt the pipeline for your own model, embeddings, or document format.

Before you start

Python 3.10 or 3.11
Git installed and working on your machine
Ollama installed locally, or API keys for OpenAI, Anthropic, or Google
Qdrant running locally or a Qdrant path on disk
At least one PDF file to index
Access to the repository on GitHub and to the LangGraph documentation
Python packages from the repo requirements, including LangChain, Qdrant, and PyMuPDF

Step 1: Clone the repository

Your first goal is to get the project onto your machine so you can inspect the notebook path and the modular app path side by side.

git clone https://github.com/GiovanniPasq/agentic-rag-for-dummies.git
cd agentic-rag-for-dummies
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

You should see the repo folders such as notebooks, project, and assets, plus a successful dependency install with no missing-package errors.

Step 2: Configure the local LLM and storage

Your goal here is to make the app runnable by defining the model provider, embeddings, and local storage paths that the agent will use during retrieval.

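# Hosted provider only: set the key for the provider you use
# (OPENAI_API_KEY, ANTHROPIC_API_KEY, or GOOGLE_API_KEY)
export OPENAI_API_KEY="your-key"
# Local provider: pull the Ollama model
ollama pull qwen3:4b-instruct-2507-q4_K_M
# Folders for source PDFs, markdown output, parent chunks, and Qdrant storage
mkdir -p docs markdown_docs parent_store qdrant_db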

You should see Ollama report that the model is available, and the folders for source PDFs, markdown output, parent chunks, and Qdrant storage should exist on disk.
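If you want to check this configuration from Python before running the app, a minimal sketch might look like the following. The `PROVIDER_KEYS` mapping and the `missing_key` and `ensure_storage` helpers are hypothetical names used for illustration, not part of the repository. The folder names come from the `mkdir` command above.

```python
from pathlib import Path

# Hypothetical mapping from provider name to the API key variable it needs.
# Ollama runs locally, so it needs no key.
PROVIDER_KEYS = {
    "ollama": None,
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
}

# Folder names taken from the mkdir command in this step.
STORAGE_DIRS = ("docs", "markdown_docs", "parent_store", "qdrant_db")


def missing_key(provider, env):
    """Return the name of the unset key for this provider, or None if ready."""
    if provider not in PROVIDER_KEYS:
        raise ValueError(f"unknown provider: {provider}")
    key_name = PROVIDER_KEYS[provider]
    if key_name is None or env.get(key_name, "").strip():
        return None
    return key_name


def ensure_storage(root):
    """Create the storage folders under root and return their paths."""
    paths = [Path(root) / name for name in STORAGE_DIRS]
    for path in paths:
        path.mkdir(parents=True, exist_ok=True)
    return paths
```

Calling `missing_key("openai", os.environ)` after importing `os` reports which variable still needs to be exported. `ensure_storage(".")` has the same effect as the `mkdir -p` line.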

Step 3: Convert PDFs into markdown

Your goal is to normalize source documents into markdown so the pipeline can chunk them consistently and preserve structure for retrieval.

python -c "from project.pdf_utils import pdfs_to_markdowns; pdfs_to_markdowns('docs/*.pdf')"

You should see markdown files appear in markdown_docs, and each PDF should have a corresponding .md file that is readable in a text editor.
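To confirm that every PDF produced a markdown file, a small check like the one below can help. It assumes each output file keeps the base name of its PDF (for example `report.pdf` becomes `report.md`), which may not match the repository's naming. `missing_markdown` is a hypothetical helper.

```python
from pathlib import Path


def missing_markdown(pdf_dir, md_dir):
    """Return names of PDFs in pdf_dir that have no matching .md in md_dir."""
    md_stems = {path.stem for path in Path(md_dir).glob("*.md")}
    return sorted(
        pdf.name
        for pdf in Path(pdf_dir).glob("*.pdf")
        if pdf.stem not in md_stems
    )
```

An empty list from `missing_markdown("docs", "markdown_docs")` means every PDF has a counterpart. Any names returned are the files to convert again.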

Step 4: Build the hierarchical index

Your goal is to create the repo’s parent-child retrieval setup, where small chunks improve search precision and parent chunks restore context for generation.

python -c "from project.indexing import build_index; build_index()"

You should see the Qdrant collection created, child chunks stored in the vector database, and parent chunk JSON files written to parent_store.
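The parent-child idea can be sketched in plain Python. This is a simplified illustration, not the repository's indexing code: it splits on fixed character counts, whereas a real pipeline would more likely split on markdown structure and store child embeddings in Qdrant. The function names and chunk sizes are assumptions.

```python
def split_fixed(text, size):
    """Split text into consecutive pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def build_parent_child(text, parent_size=2000, child_size=500):
    """Return (parents, children), where each child records its parent's id."""
    parents = {}
    children = []
    for p_idx, parent_text in enumerate(split_fixed(text, parent_size)):
        parent_id = f"parent_{p_idx}"
        parents[parent_id] = parent_text
        for c_idx, child_text in enumerate(split_fixed(parent_text, child_size)):
            children.append({
                "id": f"{parent_id}_child_{c_idx}",
                "parent_id": parent_id,
                "text": child_text,
            })
    return parents, children


def expand_to_parents(hits, parents):
    """Map matched child chunks to de-duplicated parent texts, in hit order."""
    seen = []
    for hit in hits:
        if hit["parent_id"] not in seen:
            seen.append(hit["parent_id"])
    return [parents[parent_id] for parent_id in seen]
```

Search runs over the small child chunks. `expand_to_parents` then swaps each hit for its larger parent, so the model receives the surrounding context without duplicate passages.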

Step 5: Run the LangGraph agent workflow

Your goal is to launch the agentic query pipeline so the system can rewrite ambiguous questions, clarify missing details, retrieve context, and synthesize the final answer.

python -m project.app

You should see the graph execute through stages such as conversation summary, query rewriting, retrieval, and response generation, then return an answer instead of a raw document dump.
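The sketch below shows how those stages pass a shared state from one step to the next. It is a plain-Python stand-in rather than LangGraph code or the repository's graph: the stage functions, the state keys, the keyword-matching retriever, and the three-word clarification rule are all assumptions made for illustration.

```python
def summarize_conversation(state):
    """Condense prior turns; here we simply keep the last two messages."""
    state["summary"] = " | ".join(state["history"][-2:])
    return state


def rewrite_query(state):
    """Normalize the question and flag very short ones for clarification."""
    question = state["question"].strip()
    state["query"] = question
    state["needs_clarification"] = len(question.split()) < 3
    return state


def retrieve(state):
    """Keyword match over an in-memory corpus (stand-in for vector search)."""
    words = {w.lower().strip("?.,") for w in state["query"].split()}
    words = {w for w in words if len(w) > 3}
    state["context"] = [
        doc for doc in state["corpus"]
        if words & {t.strip("?.,") for t in doc.lower().split()}
    ]
    return state


def generate(state):
    """Compose an answer from retrieved context (stand-in for the LLM call)."""
    if state["context"]:
        count = len(state["context"])
        state["answer"] = f"Based on {count} passage(s): " + " ".join(state["context"])
    else:
        state["answer"] = "No relevant context found."
    return state


def run_pipeline(state):
    """Run the stages in order, stopping early to ask for clarification."""
    state = rewrite_query(summarize_conversation(state))
    if state["needs_clarification"]:
        state["answer"] = "Could you add more detail to your question?"
        return state
    return generate(retrieve(state))
```

In a LangGraph version, each function would typically become a node, and the early return would become a conditional edge.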

Step 6: Test a multi-part question

Your goal is to verify that the agent can split a complex prompt into parallel sub-queries and merge the results into one coherent response.

What is JavaScript? What is Python?

You should see two retrieval paths run in parallel or in sequence through the graph, followed by a combined answer that addresses both topics clearly.
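A rough sketch of that split-and-merge pattern follows. It assumes a naive split on question marks, whereas an agent would more likely ask the LLM to decompose the prompt. `answer_one` is a hypothetical callable standing in for a full retrieval-and-generation pass.

```python
from concurrent.futures import ThreadPoolExecutor


def split_questions(prompt):
    """Naively split on '?' (a stand-in for LLM-based query decomposition)."""
    return [part.strip() + "?" for part in prompt.split("?") if part.strip()]


def answer_all(prompt, answer_one):
    """Answer each sub-question concurrently, then merge in original order."""
    sub_questions = split_questions(prompt)
    with ThreadPoolExecutor(max_workers=max(1, len(sub_questions))) as pool:
        # pool.map yields results in input order, whichever call finishes first.
        answers = list(pool.map(answer_one, sub_questions))
    return "\n".join(f"{q} {a}" for q, a in zip(sub_questions, answers))
```

With the prompt above, `split_questions` yields two sub-questions. `answer_all` then handles each one independently before joining the results into one response.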

Aspect	Baseline	This project
Retrieval workflow	Basic single-pass RAG	Agentic RAG with clarification, self-correction, and map-reduce reasoning
Document context	Small chunks only	Hybrid parent and child chunks for precision plus context
Model support	One provider only	Ollama first, with OpenAI, Anthropic, and Google options

Common mistakes

Using a small local model that ignores tool instructions. Fix: switch to a 7B+ model or a stronger hosted chat model.
Skipping the markdown conversion step. Fix: convert PDFs first so chunking and metadata stay stable.
Forgetting to create Qdrant storage paths. Fix: create the local folders before running indexing, or point the client at the right database path.

What's next

Once the local flow works, extend it with your own document set, swap in a different embedding model or chat provider, and add tracing or evaluation with Langfuse and RAGAS so you can measure retrieval quality before you ship.
