Most RAG tutorials jump straight to vectors and LLMs; this project shows why production systems start with solid search foundations and then augment with semantic layers. The repo is a week-by-week, hands-on course that teaches how to engineer a complete arXiv paper-curator RAG system you can run locally or in production-like environments.
What Sets It Apart
- Practical infrastructure-first path: teaches Docker Compose, FastAPI, PostgreSQL, OpenSearch, and Airflow so retrieval and pipeline reliability come before model glue. So what: you learn maintainable patterns used in real products rather than toy examples.
- Hybrid retrieval & chunking emphasis: implements BM25 keyword search, relevance filtering, and semantic vectors plus intelligent chunking. So what: the course demonstrates trade-offs between precision (BM25) and semantic recall (vectors) and how to combine them for robust retrieval.
- Production-grade observability and caching: includes monitoring hooks (Langfuse integration) and Redis caching patterns. So what: you get guidance on diagnosing RAG failure modes and reducing latency in realistic deployments.
- Agentic RAG capstone: week 7 integrates LangGraph agents, document grading, query rewriting, and a Telegram bot for mobile access. So what: you see how decision logic, guardrails, and provenance tracking fit into a delivered assistant.
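One way to picture the hybrid retrieval bullet above is a rank-fusion step that merges a BM25 result list with a vector-search result list. The sketch below uses reciprocal rank fusion (RRF), which is an assumption chosen for illustration; the source does not say which fusion method the course uses. The function name, the document IDs, and the `k=60` constant are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) per list it appears in, so a
    document ranked well by both keyword and semantic search rises to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists, best match first.
bm25_hits = ["paper_a", "paper_b", "paper_c"]
vector_hits = ["paper_b", "paper_d", "paper_a"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

Documents found by both retrievers (`paper_a`, `paper_b`) outrank those found by only one, which is the practical benefit of combining keyword precision with semantic recall.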
Who This Fits / Trade-offs
Great fit if you want a hands-on curriculum that teaches engineering RAG systems end-to-end, from ingestion and keyword search to hybrid retrieval, local LLM streaming, and agent orchestration. Look elsewhere if you only need quick demos of vector search or model prompting: this course is heavier on infra, deployment patterns, and reliability trade-offs, and it requires ~8GB RAM plus time to provision services.
Where It Fits
This repository sits between LLM demo repos (which focus on prompts) and full enterprise platforms (which hide infra). It's best for engineers, ML/AI students, and small teams who want repeatable production patterns for RAG.
Quick Notes on Approach
The course emphasizes observability, reproducible pipelines, and iterative evaluation (document grading and relevance scoring), not just raw model capability. It illustrates concrete engineering decisions, such as when to prefer BM25 filtering, how to chunk and embed documents, and how to surface reasoning traces for debugging.
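As a rough illustration of the chunking decision mentioned above, here is a minimal sketch of overlapping, word-based chunking. It is not the course's implementation: the function name, the word-level splitting, and the default sizes are assumptions, and a real pipeline might split on sections or tokens instead.

```python
def chunk_words(text, chunk_size=200, overlap=50):
    """Split text into chunks of up to `chunk_size` words.

    Consecutive chunks share `overlap` words so that a sentence cut at a
    boundary still appears intact in at least one chunk.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current chunk has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


# Hypothetical ten-word input with small sizes to make the overlap visible.
sample = "w0 w1 w2 w3 w4 w5 w6 w7 w8 w9"
for chunk in chunk_words(sample, chunk_size=4, overlap=1):
    print(chunk)
```

Each chunk would then be passed to an embedding model and indexed alongside its source paper ID. Larger overlap improves recall at chunk boundaries at the cost of more vectors to store and search.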
