7-stage local RAG engine over your codebase. Async orchestrator, sqlite-backed vector index, hybrid BM25+cosine retrieval with RRF fusion. ~3200 LOC.
| Stage | Role | LOC |
|---|---|---|
| Crawler | Walk repo respecting .gitignore, filter by language | 280 |
| Chunker | AST-aware chunks for Python, line-window for others | 320 |
| Embedder | Deterministic hash-sketch + TF-IDF, L2-normalized | 280 |
| VectorIndex | SQLite-backed BLOB vectors, FTS5 hybrid ranking | 340 |
| Retriever | Top-k via RRF score fusion (BM25 + cosine) | 280 |
| ContextPacker | Diversity-sampled context window with dedup | 260 |
| AnswerSynthesizer | Template synthesis with file:line citations | 240 |
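
The Chunker row says Python files are split along AST boundaries. A minimal sketch of that idea follows, using only the standard `ast` module. The function name `chunk_python_source` and the chunk dictionary layout are illustrative assumptions, not the project's actual API, and the sketch handles only top-level definitions.

```python
import ast


def chunk_python_source(source: str) -> list[dict]:
    """Return one chunk per top-level function or class (hypothetical helper).

    Line numbers are 1-based and inclusive so they can be reused in
    file:line citations.
    """
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            start = node.lineno
            # Decorators sit above the def/class line; keep them in the chunk.
            if node.decorator_list:
                start = min(d.lineno for d in node.decorator_list)
            end = node.end_lineno  # available since Python 3.8
            chunks.append({
                "name": node.name,
                "start": start,
                "end": end,
                "text": "\n".join(lines[start - 1:end]),
            })
    return chunks


SAMPLE = '''import os

def login(user):
    return bool(user)

class Session:
    def close(self):
        pass
'''

for chunk in chunk_python_source(SAMPLE):
    # login spans lines 3-4, Session spans lines 6-8
    print(chunk["name"], chunk["start"], chunk["end"])
```

Module-level statements such as the import are skipped here; a fuller chunker would presumably fall back to line windows for them, as the table describes for non-Python files.
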
The RAGEngine facade owns the lifecycle: index(repo_path) builds the FTS5+vector store, query(question) runs the 4 read-side stages and returns a RAGAnswer with citations.
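
To illustrate how an async orchestrator can drive synchronous stages, here is a small sketch built on `asyncio.to_thread` (Python 3.9+). The `Stage` base class shape, the `run` method name, and the dict-based context are assumptions made for the example; the real `Stage` and `RAGContext` types may look different.

```python
import asyncio


class Stage:
    """Hypothetical base class: one synchronous unit of work."""

    def run(self, ctx: dict) -> dict:
        raise NotImplementedError


class Upper(Stage):
    def run(self, ctx: dict) -> dict:
        return {**ctx, "text": ctx["text"].upper()}


class Exclaim(Stage):
    def run(self, ctx: dict) -> dict:
        return {**ctx, "text": ctx["text"] + "!"}


async def run_pipeline(stages, ctx: dict, timeout: float = 5.0) -> dict:
    """Run sync stages in order, each in a worker thread with a timeout."""
    for stage in stages:
        # to_thread keeps the event loop free while the stage blocks.
        # Note: on timeout the await is cancelled, but the worker thread
        # itself keeps running until the stage returns.
        ctx = await asyncio.wait_for(asyncio.to_thread(stage.run, ctx), timeout)
    return ctx


result = asyncio.run(run_pipeline([Upper(), Exclaim()], {"text": "hello"}))
print(result)  # {'text': 'HELLO!'}
```

Adding a stage in this sketch means subclassing `Stage` and placing the instance in the list, which is one plausible reading of "pluggable stages".
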
- Async orchestrator dispatches sync stages via asyncio.to_thread
- Hybrid retrieval: BM25 (FTS5) + cosine (numpy on BLOB) fused via RRF
- AST-aware chunking preserves function/class boundaries for Python
- Incremental re-index by file mtime
- Cited answers: every claim links back to file path + line range
- Pluggable stages via Stage inheritance
- Pure stdlib + numpy: no model downloads, no torch
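
The RRF fusion step mentioned in the list can be sketched in a few lines. Reciprocal rank fusion scores each document as the sum of 1 / (k + rank) over every ranking it appears in, so it needs only rank positions, not comparable BM25 and cosine scores. The function name `rrf_fuse`, the chunk identifiers, and the default k = 60 are assumptions for illustration; the project reads its k constant from config.

```python
def rrf_fuse(rankings, k=60, top_k=8):
    """Fuse several ranked lists of ids with reciprocal rank fusion.

    rankings: iterable of lists, each ordered best-first.
    k:        smoothing constant (60 is a commonly used value, assumed here).
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by id so output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))[:top_k]


# Hypothetical chunk ids in file:line-range form.
bm25 = ["auth.py:10-40", "session.py:1-30", "util.py:5-20"]
cosine = ["session.py:1-30", "tokens.py:1-25", "auth.py:10-40"]

fused = rrf_fuse([bm25, cosine])
print(fused)
# ['session.py:1-30', 'auth.py:10-40', 'tokens.py:1-25', 'util.py:5-20']
```

Chunks that rank well in both lists rise to the top, while a chunk found by only one retriever still survives with a lower score.
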
pip install -r requirements.txt
python -m src.cli index ./my_repo --db ./my.db
python -m src.cli query "how does authentication work?" --top-k 8 --db ./my.db
python -m src.cli stats --db ./my.db

During the design and implementation phase, this project consumed ~16M tokens/day across Hermes Agent, Claude Code, and Xiaomi MiMo V2.5 Pro for AST-aware chunking strategy iteration, RRF score-fusion design, hybrid retrieval tuning, and continuous test maintenance.

pytest tests/ -v

108 tests covering all 7 stages, RAGEngine dispatch, sqlite vector index, RRF fusion, AST chunker, schemas, and the language/text/hashing utilities. Real I/O, no mocks: every test runs against a fresh :memory: store or a tmp_path repo.
src/
├── stages/ # 7 stages + Stage base class
├── storage/ # sqlite_store + schema.sql
├── io/ # file reader + language detection
├── models/ # RAGContext + dataclass schemas
├── utils/ # config, logger, hashing, text, metrics
├── engine.py # RAGEngine facade
└── cli.py # click CLI
config/default.yaml controls chunk size, embedding dim, top-k, RRF k constant, and per-stage timeouts. Override via --config or CODERAG_* env vars.
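
As a rough illustration of how CODERAG_* environment variables could override file-based settings, the sketch below merges an environment mapping over a defaults dict. The key names, default values, and the `apply_env_overrides` helper are all hypothetical; the actual keys live in config/default.yaml and may differ.

```python
import os

# Hypothetical defaults; real names and values come from config/default.yaml.
DEFAULTS = {"chunk_size": 400, "embedding_dim": 256, "top_k": 8, "rrf_k": 60}


def apply_env_overrides(config, environ=None, prefix="CODERAG_"):
    """Return a copy of config with PREFIX + KEY.upper() env values applied."""
    environ = os.environ if environ is None else environ
    merged = dict(config)
    for key, default in config.items():
        raw = environ.get(prefix + key.upper())
        if raw is not None:
            # Coerce the string to the default's type (int or float here;
            # booleans would need explicit parsing).
            merged[key] = type(default)(raw)
    return merged


cfg = apply_env_overrides(DEFAULTS, {"CODERAG_TOP_K": "16"})
print(cfg["top_k"], cfg["rrf_k"])  # 16 60
```

Passing the environment as a parameter keeps the helper easy to test without mutating `os.environ`.
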
MIT
Built with: Hermes Agent, MiMo + Claude series