Run a loan-routing agent in your browser, watch every decision in a live dashboard, and trace every LLM call in MLflow — all on your machine.
| Layer | Stack | Port |
|---|---|---|
| UI | Next.js 15.1.6 · React 19 | :3000 |
| API | Node.js 22.14.0 · Express · LangGraph.js | :4000 |
| LLM gateway | LiteLLM Proxy 1.71.3 | :4001 |
| Observability | MLflow 2.20.1 | :5000 |
| Database | PostgreSQL 16.6 | :5432 |
| Local model | Ollama llama3.2 | :11434 |
This guide walks you through everything from a fresh clone to your first traced chat.
Commands assume Windows PowerShell from the repo root unless noted. On macOS/Linux, swap Copy-Item for cp and use bash where paths differ.
When setup is complete you will have:
- http://localhost:3000 — chat UI + live metrics dashboard (sessions, routing breakdown, recent runs)
- http://localhost:4000 — REST API running the 6-step agent graph
- http://localhost:5000 — MLflow UI with agent runs and LLM traces
- PostgreSQL: every chat persisted for the dashboard and `/runs` history
flowchart LR
A[Install tools] --> B[Ollama + secrets]
B --> C[Docker stack]
C --> D[DB migrate]
D --> E[Backend API]
E --> F[Frontend UI]
F --> G[First chat + MLflow]
Estimated time: 30–45 minutes on first run (mostly Docker pull + ollama pull).
- Prerequisites
- Step-by-step setup
- Verify it works
- Daily use: start, stop, restart
- What you are building
- Configuration reference
- Troubleshooting
- Production deployment
- Contributors & specs
- Further reading
Install and verify each tool before cloning the repo.
| Tool | Version | Install | Verify |
|---|---|---|---|
| Git | Recent | git-scm.com | git --version |
| Node.js | 22.14.0 | nodejs.org or nvm-windows | node -v |
| npm | 10.9.2 | Bundled with Node | npm -v |
| Docker Desktop | Latest stable | docker.com/products/docker-desktop | docker compose version |
| Ollama | Latest | ollama.com/download | ollama --version |
Match Node with [.nvmrc](.nvmrc): `nvm use 22.14.0` (Windows: nvm-windows · macOS/Linux: nvm or fnm).
Disk & RAM: ~8 GB free disk for images + model; 8 GB RAM minimum (16 GB recommended while Ollama runs).
Track your progress:
| Step | Task | Done |
|---|---|---|
| 0 | Clone repo + install root hooks | ☐ |
| 1 | Start Ollama with llama3.2 | ☐ |
| 2 | Generate local secrets | ☐ |
| 3 | Create frontend/.env.local | ☐ |
| 4 | Start Docker (Postgres, MLflow, LiteLLM) | ☐ |
| 5 | Run database migrations | ☐ |
| 6 | Start backend API | ☐ |
| 7 | Start frontend | ☐ |
| 8 | Send first chat + check MLflow | ☐ |
git clone <your-repo-url> agentflow-mlflow
cd agentflow-mlflow
nvm use 22.14.0
npm ci

`npm ci` at the repo root installs Husky pre-commit hooks (lint on staged files).
Success: node -v prints v22.14.0 and you are in the repo root.
Ollama runs on your host, not inside Docker. LiteLLM in Docker forwards requests to it.
Open a dedicated terminal and leave it running:
ollama pull llama3.2
ollama serve

Quick check (new terminal is fine):
curl http://localhost:11434/api/tags

Success: JSON listing models including llama3.2.
Linux Docker note: LiteLLM uses `host.docker.internal` in [litellm/config.yaml](litellm/config.yaml). `docker-compose.yml` already adds `extra_hosts` for Linux. If Ollama is unreachable from the container, set `api_base` to `http://172.17.0.1:11434`.
Never commit real API keys. This script writes gitignored env files with a matching LiteLLM proxy key:
powershell -NoProfile -ExecutionPolicy Bypass -File scripts/generate-local-secrets.ps1

Creates:

| File | Purpose |
|---|---|
| .env | LITELLM_MASTER_KEY for the LiteLLM Docker container |
| backend/.env | API settings: DATABASE_URL, MLFLOW_TRACKING_URI, LITELLM_API_KEY, etc. |
Committed templates (placeholders only): [.env.example](.env.example), [backend/.env.example](backend/.env.example).
Success: Both files exist. LITELLM_API_KEY in backend/.env equals LITELLM_MASTER_KEY in root .env.
If you ever get LiteLLM `401`, re-run this script and restart Docker + backend so keys stay in sync.
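The key-sync condition above can be checked programmatically. The following is a minimal sketch, not code from the repo: `parseEnv` and `keysInSync` are hypothetical helpers, and the parser assumes plain `KEY=VALUE` lines with optional surrounding quotes.

```typescript
// Hypothetical helpers (not part of the repo): compare the LiteLLM key
// in the root .env against the one in backend/.env.

/** Parse simple KEY=VALUE lines; blank lines and # comments are skipped. */
function parseEnv(text: string): Map<string, string> {
  const vars = new Map<string, string>();
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // not a KEY=VALUE pair
    const key = line.slice(0, eq).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const value = line.slice(eq + 1).trim().replace(/^(["'])(.*)\1$/, "$2");
    vars.set(key, value);
  }
  return vars;
}

/** True when both keys exist, are non-empty, and are identical. */
function keysInSync(rootEnvText: string, backendEnvText: string): boolean {
  const master = parseEnv(rootEnvText).get("LITELLM_MASTER_KEY");
  const api = parseEnv(backendEnvText).get("LITELLM_API_KEY");
  return master !== undefined && master !== "" && master === api;
}
```

To use it, read `.env` and `backend/.env` as text (for example with Node's `fs.readFileSync(path, "utf8")`) and pass both strings to `keysInSync`. A `false` result points to the 401 case described above.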
Copy-Item frontend\.env.example frontend\.env.local

Contents (for reference):
NEXT_PUBLIC_API_BASE_URL=http://localhost:4000
NEXT_PUBLIC_MLFLOW_UI_URL=http://localhost:5000

Success: frontend/.env.local exists (gitignored).
From the repo root:
docker compose up -d

This starts PostgreSQL, MLflow, and LiteLLM. LiteLLM reads [litellm/config.yaml](litellm/config.yaml), which routes agentflow-chat → Ollama and enables MLflow trace callbacks.
Wait ~30 seconds, then verify:
docker compose ps
curl http://localhost:5000/health
curl http://localhost:4001/health

Success: All three containers are running (or healthy). MLflow and LiteLLM health endpoints return OK.
Open http://localhost:5000 — the UI loads. Experiment agentflow-loan-chat appears after the first LLM call.
cd backend
npm ci
npx prisma migrate deploy
cd ..

Success: Migrations apply without error. Tables chat_sessions and chat_messages exist.
Use a fresh terminal (avoids stale env vars from other projects or test runs):
cd backend
# Optional: clear leaked CI/test env that override backend/.env
Remove-Item Env:MLFLOW_TRACKING_URI -ErrorAction SilentlyContinue
Remove-Item Env:LITELLM_API_KEY -ErrorAction SilentlyContinue
Remove-Item Env:LITELLM_BASE_URL -ErrorAction SilentlyContinue
npm run dev

Success: Log line `API listening on port 4000`.
Smoke the API (another terminal):
curl http://localhost:4000/health
curl http://localhost:4000/health/mlflow
curl http://localhost:4000/health/litellm
curl http://localhost:4000/api/v1/dashboard/stats

All should return 200. Dashboard stats may show zeros until you send chats.
New terminal:
cd frontend
npm ci
npm run dev

Success: http://localhost:3000 loads the Loan routing command center, with the dashboard sidebar and the chat panel.
- Open http://localhost:3000
- Send: `I need a loan application processed ASAP for home equity.`
- Confirm:
  - Step timeline shows nodes `node_1` … `node_6`
  - Final status message appears
  - Dashboard metrics update (sessions, routing bars)
- Open http://localhost:3000/runs: session appears in history
- Open http://localhost:5000 → experiment `agentflow-loan-chat`:
  - Runs: routing params, metrics, `trace.json` artifact
  - Traces: LiteLLM LLM spans
Try an off-topic message to see early rejection:
How do I fix my broken television screen?
Expected: rejected at node_1, no full graph run.
You are done. Continue to Verify it works for scripted checks.
With the full stack running (Steps 1, 4, 6, 7):
powershell -NoProfile -ExecutionPolicy Bypass -File scripts/smoke-production.ps1

Checks: health · loan chat · MLflow run params · off-topic rejection.

| Script | Validates |
|---|---|
| [scripts/smoke-production.ps1](scripts/smoke-production.ps1) | End-to-end health + chat + MLflow |
| [scripts/verify-e2e-chat-mlflow.ps1](scripts/verify-e2e-chat-mlflow.ps1) | Chat → MLflow agent run + traces |
| [scripts/verify-e2e-routing.ps1](scripts/verify-e2e-routing.ps1) | Full loan path vs TV early rejection |
| [scripts/verify-e2e-cors.ps1](scripts/verify-e2e-cors.ps1) | Browser CORS from :3000 |
Manual walkthrough: docs/e2e.md
# Full loan path
Invoke-RestMethod -Uri http://localhost:4000/api/v1/chat -Method POST `
-ContentType "application/json" `
-Body '{"message":"I need a loan application processed ASAP for home equity."}'
# Early rejection
Invoke-RestMethod -Uri http://localhost:4000/api/v1/chat -Method POST `
-ContentType "application/json" `
-Body '{"message":"How do I fix my broken television screen?"}'
# Dashboard aggregates from PostgreSQL
Invoke-RestMethod http://localhost:4000/api/v1/dashboard/stats

Expect 200, `mlflowRunId`, and `steps[]` on chat responses.
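A scripted check can assert the expected chat response shape instead of eyeballing it. This is a rough sketch under assumptions: only the field names `mlflowRunId` and `steps` come from this README. The `ChatResponse` interface, the `isChatResponse` guard, and treating `mlflowRunId` as a non-empty string are illustrative choices, and the real contract lives in `.cursor/plan.md`.

```typescript
// Hypothetical shape check for a parsed POST /api/v1/chat response body.
// Only the two field names are taken from the README; types are assumed.
interface ChatResponse {
  mlflowRunId: string;
  steps: unknown[];
}

function isChatResponse(body: unknown): body is ChatResponse {
  if (typeof body !== "object" || body === null) return false;
  const candidate = body as Record<string, unknown>;
  const runId = candidate["mlflowRunId"];
  const steps = candidate["steps"];
  // Require a non-empty run id and an array of steps (contents unchecked).
  return typeof runId === "string" && runId.length > 0 && Array.isArray(steps);
}
```

The guard deliberately leaves the contents of `steps` unchecked, since the per-step fields are not described here.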
# Lint and unit tests (start from the repo root)
cd backend; npm run lint; npm test
cd ../frontend; npm run lint; npm test
# Production build and no-secrets check (start from the repo root)
cd frontend; npm run build
powershell -NoProfile -ExecutionPolicy Bypass -File ../scripts/verify-no-secrets.ps1

CI mirrors this: [.github/workflows/backend-ci.yml](.github/workflows/backend-ci.yml) · [.github/workflows/frontend-ci.yml](.github/workflows/frontend-ci.yml)
| Terminal | Command |
|---|---|
| 1 — Ollama | ollama serve (if not already running) |
| 2 — Docker | docker compose up -d (from repo root) |
| 3 — Backend | cd backend; npm run dev |
| 4 — Frontend | cd frontend; npm run dev |
# Frontend / backend: Ctrl+C in their terminals
# Docker stack (keeps data volumes)
docker compose down
# Ollama: Ctrl+C in its terminal

To free ports forcefully (PowerShell):
Get-NetTCPConnection -LocalPort 3000,4000 -State Listen -ErrorAction SilentlyContinue |
  ForEach-Object { Stop-Process -Id $_.OwningProcess -Force -ErrorAction SilentlyContinue }

Stopping MLflow on `:5000` is done via `docker compose down`. Avoid killing `com.docker.backend` directly, because that stops Docker Desktop entirely.
- Use fresh terminals for `npm run dev` so old `MLFLOW_TRACKING_URI` or `LITELLM_API_KEY` values do not override `backend/.env`.
- After re-running `generate-local-secrets.ps1`, restart LiteLLM with `docker compose restart litellm`, then restart the backend.
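The stale-variable problem comes down to precedence: a value already set in the shell wins over the same key in `backend/.env`. The sketch below models that rule so the conflict is easy to spot. `effectiveValue` and `overriddenKeys` are hypothetical helpers, and it assumes the backend loads its env file without overriding variables that already exist in the process environment.

```typescript
// Hypothetical helpers modelling "shell value wins over backend/.env".
type EnvVars = Record<string, string | undefined>;

/** The value the process would see for one key. */
function effectiveValue(key: string, fileVars: EnvVars, shellVars: EnvVars): string | undefined {
  const fromShell = shellVars[key];
  return fromShell !== undefined ? fromShell : fileVars[key];
}

/** Keys from the env file that a shell variable silently replaces. */
function overriddenKeys(fileVars: EnvVars, shellVars: EnvVars): string[] {
  return Object.keys(fileVars).filter((key) => {
    const fromShell = shellVars[key];
    return fromShell !== undefined && fromShell !== fileVars[key];
  });
}
```

Any key returned by `overriddenKeys` is a candidate for `Remove-Item Env:<NAME>` before starting the backend.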
flowchart LR
subgraph Browser
U[You]
end
subgraph Frontend["Next.js :3000"]
Dash[Live dashboard]
Chat[Agent chat]
Runs[Run history /runs]
end
subgraph Backend["Node.js :4000"]
API[Express + Zod]
Graph[LangGraph 6-step]
MLLog[MLflow logger]
end
subgraph Docker
PG[(PostgreSQL)]
MLF[MLflow :5000]
LLM[LiteLLM :4001]
end
subgraph Host
Ollama[Ollama :11434]
end
U --> Dash
U --> Chat
U --> Runs
Dash --> API
Chat --> API
Runs --> API
API --> Graph
Graph --> LLM
LLM --> Ollama
LLM -.->|traces| MLF
Graph --> MLLog
MLLog --> MLF
API --> PG
Two MLflow layers:
| Layer | Source | Where in UI | Captured |
|---|---|---|---|
| Agent run | Node chatService | Runs tab | Routing params, metrics, trace.json |
| LLM trace | LiteLLM success_callback | Traces tab | Prompts, tokens, latency |
The frontend is a thin client — no agent logic, no LLM keys. Credentials live only in LiteLLM config / env.
Ported from [agentflow_6steps_loan.ipynb](agentflow_6steps_loan.ipynb). ML threshold: 25000. graph_version=2.0.
flowchart TD
START([START]) --> N1[node_1 · Finance?]
N1 -->|No| REJ[initial_rejection]
N1 -->|Yes| N2[node_2 · Type A/B]
N2 --> N3[node_3 · ML regressor]
N3 -->|> 25000| HV[HighValue]
N3 -->|≤ 25000| LV[LowValue]
HV --> N4
LV --> N4[node_4 · Fast_Track / Audit]
N4 --> N5[node_5 · Risk screening]
N5 -->|Fraud_Flag| DEN[final_denied]
N5 -->|Clear / Elevated| N6[node_6 · Final decision]
N6 -->|Approved| APP[final_approved]
N6 -->|Denied| DEN
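The edges of the flowchart can be written as a small pure function, which helps when reasoning about which nodes a request visits. This is an illustrative sketch, not the LangGraph implementation: the per-node outcomes (`isFinance`, `regressorOutput`, `risk`, `approved`) are hypothetical inputs standing in for the LLM and ML steps. Only the node names, branch labels, and the 25000 threshold come from the diagram.

```typescript
// Illustrative model of the graph edges; node outcomes are supplied as inputs.
const ML_THRESHOLD = 25000;

type ValueTier = "HighValue" | "LowValue";
type RiskLabel = "Clear" | "Elevated" | "Fraud_Flag";
type FinalStatus = "initial_rejection" | "final_approved" | "final_denied";

interface NodeOutcomes {
  isFinance: boolean;      // node_1: is the message finance-related?
  regressorOutput: number; // node_3: value predicted by the ML regressor
  risk: RiskLabel;         // node_5: risk screening result
  approved: boolean;       // node_6: final decision
}

/** node_3 branch: strictly greater than the threshold is HighValue. */
function valueTier(regressorOutput: number): ValueTier {
  return regressorOutput > ML_THRESHOLD ? "HighValue" : "LowValue";
}

/** Walk the edges and report the terminal status plus the nodes visited. */
function route(outcomes: NodeOutcomes): { status: FinalStatus; visited: string[] } {
  const visited = ["node_1"];
  if (!outcomes.isFinance) return { status: "initial_rejection", visited };
  // Both value tiers continue to node_4, so the tier does not change the path.
  visited.push("node_2", "node_3", "node_4", "node_5");
  if (outcomes.risk === "Fraud_Flag") return { status: "final_denied", visited };
  visited.push("node_6");
  return { status: outcomes.approved ? "final_approved" : "final_denied", visited };
}
```

An off-topic message stops at `node_1`, a fraud flag stops at `node_5`, and every other request reaches `node_6`.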
agentflow-mlflow/
├── docker-compose.yml # postgres + mlflow + litellm
├── litellm/config.yaml # agentflow-chat → Ollama + MLflow callback
├── scripts/ # secrets, smoke, E2E, verify-no-secrets
├── frontend/ # Next.js dashboard (:3000)
├── backend/ # Express + LangGraph (:4000)
│ ├── prisma/ # chat_sessions, chat_messages
│ └── Dockerfile # production API image
├── docs/ # e2e, deployment, compliance, governance
├── agentflow_6steps_loan.ipynb
└── .cursor/ # specs, plan, constitution
| Service | URL |
|---|---|
| Frontend | http://localhost:3000 |
| API | http://localhost:4000 |
| LiteLLM | http://localhost:4001 |
| MLflow | http://localhost:5000 |
| PostgreSQL | localhost:5432 (user ### / pass ### / db ###) |
| Ollama | http://localhost:11434 |
| Method | Path | Purpose |
|---|---|---|
| GET | /health | API liveness |
| GET | /health/mlflow | MLflow connectivity |
| GET | /health/litellm | LiteLLM connectivity |
| POST | /api/v1/chat | Run agent graph |
| GET | /api/v1/sessions | Paginated session list |
| GET | /api/v1/sessions/:id | Session detail |
| GET | /api/v1/dashboard/stats | Live metrics from PostgreSQL |
Full contract: [.cursor/plan.md](.cursor/plan.md) §4.2
| Setting | Default |
|---|---|
| Alias | agentflow-chat |
| Upstream | ollama/llama3.2 via host.docker.internal:11434 |
| Backend LLM_MODEL | agentflow-chat |
Test LiteLLM → MLflow tracing:
$key = (Get-Content .env | Where-Object { $_ -match '^LITELLM_MASTER_KEY=' }) -replace 'LITELLM_MASTER_KEY=',''
Invoke-RestMethod -Uri http://localhost:4001/v1/chat/completions -Method POST `
-Headers @{ Authorization = "Bearer $key" } `
-ContentType "application/json" `
  -Body '{"model":"agentflow-chat","messages":[{"role":"user","content":"Hello"}]}'

Check the MLflow Traces tab for the span.
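The same request can be assembled from Node. The sketch below mirrors the PowerShell call and builds the request pieces without sending them. `buildChatCompletionRequest` is a hypothetical helper, and the default base URL assumes the local LiteLLM port from this guide.

```typescript
// Hypothetical helper: builds (but does not send) the LiteLLM test request.
interface ChatCompletionRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildChatCompletionRequest(
  masterKey: string,
  userMessage: string,
  baseUrl: string = "http://localhost:4001", // assumed local LiteLLM address
): ChatCompletionRequest {
  return {
    // Trim trailing slashes so the path is joined exactly once.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${masterKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "agentflow-chat",
      messages: [{ role: "user", content: userMessage }],
    }),
  };
}
```

On Node 22 the result can be passed to the built-in `fetch`, for example `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`, with the key read from the root `.env`.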
Exact pins only (no ^ / ~) per governance. Key runtime:
| Component | Version |
|---|---|
| Node.js | 22.14.0 |
| Next.js / React | 15.1.6 / 19.0.0 |
| Prisma | 6.3.0 |
| MLflow image | v2.20.1 |
| LiteLLM image | main-v1.71.3-stable |
| PostgreSQL | 16.6-alpine |
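The exact-pin rule is easy to check mechanically. The following sketch flags any dependency spec that is not a plain `x.y.z` version. `isExactPin` and `findUnpinned` are hypothetical helpers, not the repo's governance tooling, and the pattern is a simplification of full semver.

```typescript
// Hypothetical pin checker for a package.json "dependencies" object.

/** True for a plain x.y.z version (optional prerelease), with no range operator. */
function isExactPin(spec: string): boolean {
  return /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$/.test(spec);
}

/** List "name@spec" for every dependency that is not an exact pin. */
function findUnpinned(deps: Record<string, string>): string[] {
  return Object.entries(deps)
    .filter(([, spec]) => !isExactPin(spec))
    .map(([name, spec]) => `${name}@${spec}`);
}
```

Running `findUnpinned` over both `dependencies` and `devDependencies` and failing on a non-empty result would enforce the rule locally.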
| Symptom | Likely cause | Fix |
|---|---|---|
| docker compose cannot connect | Docker Desktop not running | Start Docker Desktop; wait for "Engine running" |
| LiteLLM cannot reach Ollama | Ollama stopped or wrong api_base | Run ollama serve; check [litellm/config.yaml](litellm/config.yaml) |
| Backend 500 on chat; MLflow 127.0.0.1:9 in logs | Stale shell env overrides .env | Fresh terminal; Remove-Item Env:MLFLOW_TRACKING_URI etc.; restart backend |
| LiteLLM 401 | Key mismatch | Re-run scripts/generate-local-secrets.ps1; docker compose restart litellm; restart backend |
| Frontend "API base URL" error | Missing .env.local | Step 3 |
| CORS error in browser | Wrong origin | CORS_ORIGIN=http://localhost:3000 in backend/.env |
| Dashboard empty | No chats yet | Send a message; check GET /api/v1/dashboard/stats |
| No MLflow Traces | Callback missing | Verify success_callback: ["mlflow"] in litellm/config.yaml; restart litellm |
| Chat very slow | Cold Ollama model | Pre-warm: ollama run llama3.2 "hi" |
| Can't reach database in tests | Postgres not up | docker compose up -d postgres |
Full runbook: docs/deployment.md
Production infra changes (managed Postgres, MLflow, LiteLLM, DNS, TLS) require explicit human authorization per [.cursor/AGENTS.md](.cursor/AGENTS.md). Never commit production secrets. Use [backend/.env.production.example](backend/.env.production.example) and [frontend/.env.production.example](frontend/.env.production.example) as templates.
| Component | Approach |
|---|---|
| API | docker build -t agentflow-api:0.1.0 ./backend — inject env at runtime |
| Database | npx prisma migrate deploy before first traffic |
| Frontend | npm run build with production NEXT_PUBLIC_* |
| Verify | scripts/smoke-production.ps1 with SMOKE_API_BASE / SMOKE_MLFLOW_BASE |
Governance: docs/governance.md · Security: SECURITY.md · Audit: docs/compliance.md · Changelog: CHANGELOG.md
Implementation is complete (v0.1.0). To extend the system, follow the spec chain:
| File | Role |
|---|---|
| [.cursor/constitution.md](.cursor/constitution.md) | Stack rules |
| [.cursor/AGENTS.md](.cursor/AGENTS.md) | Agent harness, security |
| [.cursor/plan.md](.cursor/plan.md) | Schema, API, agent graph |
| [.cursor/specs/spec.md](.cursor/specs/spec.md) | Task definitions |
| [.cursor/specs/implementation.md](.cursor/specs/implementation.md) | Code snippets per task |
Cursor prompt (single task):
Read .cursor/specs/tasks.md — execute the Active task cycle only.
Spec authority: .cursor/specs/spec.md (task Txxx).
Reference agentflow_6steps_loan.ipynb for agent behavior.
| Resource | Description |
|---|---|
| docs/governance.md | Policy index, hooks, verification commands |
| docs/e2e.md | Manual E2E verification |
| docs/deployment.md | Production deploy runbook |
| [agentflow_6steps_loan.ipynb](agentflow_6steps_loan.ipynb) | Source graph + MLflow lifecycle |
| LiteLLM + MLflow tracing | Official integration docs |
Version: 0.1.0 · License: Internal demo project