devloopcode/fastapi-ai-service

AI Backend

An AI backend platform built with FastAPI, featuring RAG pipelines, vector search, streaming AI responses, and a scalable async architecture.


Architecture Overview

┌─────────────────────────────────────────────────────────┐
│                       FastAPI App                       │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐               │
│  │  Routes  │  │   Auth   │  │  Health  │               │
│  └────┬─────┘  └────┬─────┘  └──────────┘               │
│       │             │                                   │
│  ┌────▼─────────────▼────────┐                          │
│  │       Service Layer       │                          │
│  │  AuthService │ AIService  │                          │
│  │  FileService │ SubService │                          │
│  └────┬─────────────┬────────┘                          │
│       │             │                                   │
│  ┌────▼────┐   ┌────▼──────────────────┐                │
│  │  Repos  │   │      AI Pipeline      │                │
│  │  (DB)   │   │  Embed → Index → RAG  │                │
│  └────┬────┘   └────┬──────────────────┘                │
│       │             │                                   │
│  ┌────▼─────┐  ┌────▼────────┐   ┌──────────────┐       │
│  │PostgreSQL│  │   Qdrant    │   │  OpenAI API  │       │
│  └──────────┘  └─────────────┘   └──────────────┘       │
└─────────────────────────────────────────────────────────┘
         │
    ┌────▼───────────────────────┐
    │   Redis + Celery Workers   │
    │  Embedding │ Indexing │ GC │
    └────────────────────────────┘

Tech Stack

| Layer | Technology |
| --- | --- |
| Framework | FastAPI + Uvicorn |
| Database | PostgreSQL 16 + SQLAlchemy 2.0 Async |
| Cache / Broker | Redis 7 |
| Vector DB | Qdrant |
| AI Provider | OpenAI (GPT-4o + text-embedding-3-small) |
| Background Tasks | Celery |
| Auth | JWT (access + refresh tokens) |
| Validation | Pydantic v2 |
| Migrations | Alembic |
| Testing | Pytest + pytest-asyncio + HTTPX |
| Containerization | Docker + Docker Compose |

Features

Authentication

  • JWT access tokens (30-minute expiry)
  • Refresh token rotation with secure hashing
  • bcrypt password hashing
  • Role-based access control (user / admin)

AI Capabilities

  • Document Chat: RAG-powered Q&A over uploaded documents
  • Resume Analyzer: Structured resume feedback with optional job description matching
  • Code Review: Security, performance, and quality analysis
  • Meeting Summarizer: Transcript summarization with action items
  • Streaming Responses: Real-time SSE token streaming for all AI endpoints

File Processing Pipeline

Upload → Validate → Store → Queue Celery Task
  → Extract Text (PDF/DOCX/TXT/Code)
  → Chunk Text (configurable window + overlap)
  → Generate Embeddings (OpenAI batch)
  → Index to Qdrant
  → Update File Status → Done
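
The "Chunk Text" step can be illustrated with a word-based sliding window. This is a sketch under assumptions: `chunk_words` is a hypothetical helper name, and the defaults simply mirror the `CHUNK_SIZE` and `CHUNK_OVERLAP` values documented under Environment Variables.

```python
def chunk_words(text: str, chunk_size: int = 512, overlap: int = 50) -> list[str]:
    """Split text into windows of chunk_size words that share overlap words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

For example, `chunk_words("a b c d e f g h i j", chunk_size=4, overlap=1)` yields three chunks, each starting with the last word of the previous one.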

Vector Search (RAG)

  • Semantic similarity search via Qdrant cosine distance
  • Per-user + per-file metadata filtering
  • Configurable top-k retrieval with score threshold
  • Context injection into structured prompts
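
The retrieval behaviour listed above can be sketched in plain Python. This is not the Qdrant client API: `Point`, `search`, and `build_prompt` are hypothetical names, and the in-memory list stands in for a Qdrant collection so that the filtering, thresholding, and top-k logic are visible.

```python
import math
from dataclasses import dataclass


@dataclass
class Point:
    vector: list[float]
    payload: dict  # assumed keys: "user_id", "file_id", "text"


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(points, query, user_id, file_id=None, top_k=5, score_threshold=0.0):
    """Return up to top_k (score, point) pairs for one user, best first."""
    hits = []
    for p in points:
        if p.payload.get("user_id") != user_id:
            continue  # per-user filter
        if file_id is not None and p.payload.get("file_id") != file_id:
            continue  # optional per-file filter
        score = cosine(query, p.vector)
        if score >= score_threshold:
            hits.append((score, p))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits[:top_k]


def build_prompt(question: str, hits) -> str:
    """Inject the retrieved chunk texts into a simple structured prompt."""
    context = "\n\n".join(p.payload["text"] for _, p in hits)
    return f"Context:\n{context}\n\nQuestion: {question}"
```

Filtering on `user_id` before scoring keeps one user's documents out of another user's context, which is the point of the per-user metadata filter.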

Subscription System

  • Free / Pro / Enterprise tiers
  • Per-month request and token quotas
  • Quota enforcement via dependency injection
  • Stripe-ready schema (customer_id, subscription_id columns)
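
A quota check of the kind described above might look like the sketch below. The limits in `PLAN_LIMITS` are invented for illustration (the README does not state the real quotas), and `Usage`, `QuotaExceeded`, and `check_quota` are hypothetical names.

```python
from dataclasses import dataclass

# Illustrative numbers only; the actual per-tier quotas are not documented here.
PLAN_LIMITS = {
    "free": {"requests": 100, "tokens": 50_000},
    "pro": {"requests": 5_000, "tokens": 2_000_000},
    "enterprise": {"requests": None, "tokens": None},  # None means unlimited
}


class QuotaExceeded(Exception):
    """Raised when the current billing period has no quota left."""


@dataclass
class Usage:
    requests: int = 0
    tokens: int = 0


def check_quota(tier: str, usage: Usage) -> None:
    """Raise QuotaExceeded if either monthly counter has hit its limit."""
    limits = PLAN_LIMITS[tier]
    for field in ("requests", "tokens"):
        limit = limits[field]
        if limit is not None and getattr(usage, field) >= limit:
            raise QuotaExceeded(f"{tier} plan {field} quota exhausted")
```

In a FastAPI app, a function like this would typically be called from a dependency attached to the AI routes with `Depends(...)`, so that handlers never run once the quota is exhausted.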

Background Processing (Celery)

  • embeddings queue: document processing & indexing
  • indexing queue: vector operations
  • cleanup queue: expired token purge, orphaned file cleanup
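
One way to map tasks onto these three queues is a glob-based route table. In the sketch below only the queue names come from this README; the `app.tasks.*` module paths are hypothetical, and `queue_for` is a small illustrative resolver rather than part of Celery.

```python
from fnmatch import fnmatch

# Module paths are assumptions; only the queue names are documented above.
task_routes = {
    "app.tasks.embeddings.*": {"queue": "embeddings"},
    "app.tasks.indexing.*": {"queue": "indexing"},
    "app.tasks.cleanup.*": {"queue": "cleanup"},
}


def queue_for(task_name: str, default: str = "celery") -> str:
    """Resolve a task name to a queue by matching it against the glob routes."""
    for pattern, route in task_routes.items():
        if fnmatch(task_name, pattern):
            return route["queue"]
    return default  # "celery" is Celery's default queue name
```

A dictionary of this shape can be assigned to Celery's `task_routes` setting, and a worker can be restricted to a single queue with the `-Q` option.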

Project Structure

app/
├── api/v1/          # Thin route handlers
│   ├── auth.py
│   ├── ai.py
│   ├── files.py
│   ├── stream.py
│   └── subscriptions.py
├── ai/              # OpenAI integration & RAG pipeline
│   ├── client.py
│   ├── completions.py
│   ├── embeddings.py
│   └── pipeline.py
├── core/            # Security, exceptions, logging
├── db/              # SQLAlchemy engine & session
├── models/          # ORM models (7 tables)
├── schemas/         # Pydantic v2 request/response schemas
├── repositories/    # Data access layer (no business logic)
├── services/        # Business logic layer
├── tasks/           # Celery workers
├── vector/          # Qdrant client, indexer, retriever
├── streaming/       # SSE helpers
├── middleware/      # Request logging, rate limiting
├── dependencies/    # FastAPI DI (auth, db, quota)
├── utils/           # File extraction, text chunking
└── tests/           # pytest async test suite

Quick Start

Using Docker (recommended)

cp .env.example .env
# Edit .env with your OPENAI_API_KEY and SECRET_KEY

docker compose up --build

The API will be available at http://localhost:8000.

Local Development

python -m venv venv
source venv/bin/activate          # Windows: venv\Scripts\activate
pip install -r requirements.txt

cp .env.example .env
# Fill in DATABASE_URL, REDIS_URL, OPENAI_API_KEY, SECRET_KEY

alembic upgrade head
uvicorn app.main:app --reload

API Reference

Authentication

POST /api/v1/auth/register    Register new user
POST /api/v1/auth/login       Obtain access + refresh tokens
POST /api/v1/auth/refresh     Rotate refresh token

Files

POST   /api/v1/files/upload   Upload document (PDF/DOCX/TXT/code)
GET    /api/v1/files          List user's files
DELETE /api/v1/files/{id}     Delete file + vectors

AI Endpoints

POST /api/v1/ai/chat              General AI chat
POST /api/v1/ai/document-chat     RAG chat over uploaded document
POST /api/v1/ai/resume-analyze    Resume analysis
POST /api/v1/ai/code-review       Code review
POST /api/v1/ai/meeting-summary   Meeting transcript summarizer

Streaming (SSE)

POST /api/v1/stream/chat              Streaming general chat
POST /api/v1/stream/document-chat     Streaming RAG document chat

SSE events: token, done, error
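
The three event types can be illustrated with a small formatter. The wire format follows the Server-Sent Events convention of `event:` and `data:` lines ending in a blank line; the JSON payload shapes (`{"token": ...}`, `{"detail": ...}`) and the helper names are assumptions, chosen to match the `data.token` access in the JavaScript example later in this README.

```python
import json


def format_sse(event: str, data: dict) -> str:
    """Serialize one Server-Sent Event: an event line, a data line, a blank line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def stream_tokens(tokens):
    """Yield one token event per piece of text, then a final done event."""
    try:
        for token in tokens:
            yield format_sse("token", {"token": token})
        yield format_sse("done", {})
    except Exception as exc:  # report failures to the client as an error event
        yield format_sse("error", {"detail": str(exc)})
```

A generator like `stream_tokens` is the kind of object that can be passed to FastAPI's `StreamingResponse` with `media_type="text/event-stream"`.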

Subscriptions

GET  /api/v1/subscriptions/plans    Available plans
GET  /api/v1/subscriptions/me       Current subscription
POST /api/v1/subscriptions/upgrade  Upgrade tier
GET  /api/v1/subscriptions/usage    Current period usage

Health

GET /health    Service health check

Environment Variables

| Variable | Description | Default |
| --- | --- | --- |
| DATABASE_URL | PostgreSQL async URL | required |
| REDIS_URL | Redis URL | required |
| OPENAI_API_KEY | OpenAI secret key | required |
| SECRET_KEY | JWT signing secret | required |
| QDRANT_URL | Qdrant HTTP URL | http://localhost:6333 |
| OPENAI_MODEL | Chat model | gpt-4o |
| OPENAI_EMBEDDING_MODEL | Embedding model | text-embedding-3-small |
| ACCESS_TOKEN_EXPIRE_MINUTES | Access token TTL | 30 |
| REFRESH_TOKEN_EXPIRE_DAYS | Refresh token TTL | 7 |
| MAX_FILE_SIZE | Max upload bytes | 10485760 (10 MB) |
| CHUNK_SIZE | Embedding chunk word count | 512 |
| CHUNK_OVERLAP | Chunk overlap words | 50 |
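
To show how the required and defaulted variables interact, here is a standard-library sketch of a settings loader. It is not the project's configuration code (which may well use Pydantic); `Settings` and `load_settings` are hypothetical names, and the defaults are copied from the table above.

```python
import os
from dataclasses import MISSING, dataclass, fields


@dataclass(frozen=True)
class Settings:
    # Required: no default, so loading fails when the variable is absent.
    database_url: str
    redis_url: str
    openai_api_key: str
    secret_key: str
    # Optional: defaults mirror the documented values.
    qdrant_url: str = "http://localhost:6333"
    openai_model: str = "gpt-4o"
    openai_embedding_model: str = "text-embedding-3-small"
    access_token_expire_minutes: int = 30
    refresh_token_expire_days: int = 7
    max_file_size: int = 10_485_760
    chunk_size: int = 512
    chunk_overlap: int = 50


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping of environment variables."""
    env = os.environ if env is None else env
    values = {}
    for f in fields(Settings):
        raw = env.get(f.name.upper())
        if raw is None:
            if f.default is MISSING:
                raise RuntimeError(f"missing required variable {f.name.upper()}")
            continue  # fall back to the documented default
        # Integer settings are recognised by the type of their default.
        values[f.name] = int(raw) if isinstance(f.default, int) else raw
    return Settings(**values)
```

Failing at startup on a missing required variable surfaces misconfiguration before the first request, not in the middle of one.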

Running Tests

# Requires a test PostgreSQL database: ai_saas_test
pytest -v

Database Migrations

# Generate migration after model changes
alembic revision --autogenerate -m "description"

# Apply migrations
alembic upgrade head

# Roll back one
alembic downgrade -1

Streaming Example (JavaScript)

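const response = await fetch('/api/v1/stream/chat', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${token}`
  },
  body: JSON.stringify({ message: 'Explain async/await in Python' })
});
if (!response.ok) throw new Error(`Request failed: ${response.status}`);

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';  // holds a partial line until the rest of it arrives
let answer = '';  // accumulated response text

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // stream: true keeps multi-byte characters split across chunks intact
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop();  // the last element may be an incomplete line
  for (const line of lines) {
    if (line.startsWith('data:')) {
      const data = JSON.parse(line.slice(5));
      if (data.token) answer += data.token;
    }
  }
}
console.log(answer);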
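
For Python clients, the same parsing logic can be sketched as a generator over raw byte chunks. `iter_sse_data` is a hypothetical helper; it assumes, like the JavaScript example, that every `data:` line carries a JSON object, and it leaves the HTTP request itself to whichever client library is in use.

```python
import codecs
import json


def iter_sse_data(chunks):
    """Yield the decoded JSON payload of every data: line in a byte stream."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    buffer = ""
    for chunk in chunks:
        # The incremental decoder tolerates characters split across chunks.
        buffer += decoder.decode(chunk)
        # Keep the trailing (possibly incomplete) line for the next chunk.
        *lines, buffer = buffer.split("\n")
        for line in lines:
            if line.startswith("data:"):
                yield json.loads(line[5:])
```

Joining the `token` field of each yielded payload reconstructs the full answer, regardless of where the network happened to split the stream.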
