🌩️ Baadal — Server Log Collector and Alerter

A lightweight, self-contained observability agent for monitoring Ubuntu servers with disk usage tracking, kernel monitoring, Docker container health, and webhook-based alerting.

✨ Features

📊 Disk Usage Monitoring — Track largest directories with configurable thresholds
🐳 Docker Log Monitoring — Filter container logs for errors, panics, OOM events
🔍 Kernel Monitoring — Monitor dmesg for critical kernel events
💓 Heartbeat — Periodic alive signals for health tracking
🔄 Event Deduplication — Reduce noise with smart event hashing
📈 Performance Stats — Track collector performance metrics
🔔 Webhook Alerts — Discord, Slack, and custom API notifications
🔒 Dead Man's Switch — Detect silent failures with receiver-side monitoring
📝 Log Rotation — Automatic log rotation with lumberjack
🔄 Hot Reload — Update config without restart (SIGHUP)
🏗️ Dual Mode — Run as collector (agent) or receiver (central server)

🚀 Quick Start

Download Pre-built Binary

Download the latest release for your platform from GitHub Releases:

# Linux amd64
wget https://github.com/YOUR_USERNAME/baadhal/releases/latest/download/baadal-linux-amd64.tar.gz

# Verify checksum before extracting
# (the matching .sha256 file from the same release must be in this directory)
sha256sum -c baadal-linux-amd64.tar.gz.sha256

tar -xzf baadal-linux-amd64.tar.gz
chmod +x baadal-linux-amd64

Build from Source

git clone https://github.com/YOUR_USERNAME/baadhal.git
cd baadhal
go mod download
go build -o baadal main.go

📋 Detailed Configuration

Baadal uses a YAML configuration file (config.yml) with comprehensive options for each monitoring component.

1. App Configuration

app:
  name: "baadal"              # Application name (used in logs)
  enabled: true                # Master switch - set to false to disable
  log_level: "info"           # Logging level: debug | info | warn | error
  
  heartbeat:
    enabled: true              # Enable periodic heartbeat events
    interval: "*/5 * * * *"   # Cron expression for heartbeat frequency
                               # Examples:
                               #   "*/5 * * * *"  - Every 5 minutes
                               #   "*/10 * * * *" - Every 10 minutes
                               #   "0 * * * *"    - Every hour
  
  deduplication:
    enabled: true              # Enable event deduplication
    window_seconds: 60         # Time window to check for duplicates
                               # Events with identical type+host+data 
                               # within this window are discarded

Heartbeat Details:

Sends periodic "alive" signals to receiver
Includes uptime in seconds
Helps detect collector crashes or network issues
Recommended: 5-10 minutes for production

Deduplication Details:

Uses MD5 hash of (type + host + data) for comparison
Prevents alert spam from repeated errors
Memory-efficient with automatic cleanup every 5 minutes
Recommended: 60-300 seconds depending on alert frequency

2. Transport Configuration

transport:
  mode: "remote"              # "remote" or "local"
  
  remote:                     # Used when mode = "remote"
    endpoint: "http://receiver-ip:5170/ingest"  # Receiver URL
                               # Use Tailscale/VPN IP for security
    
    auth:
      enabled: false           # Enable bearer token authentication
      token: "your-secret-token"  # Must match receiver token
    
    batch_size: 10            # Send after N events accumulate
    flush_interval: "5s"      # Or send after this time (whichever first)
    retry_attempts: 3         # Number of retries on failure
    retry_delay: "2s"         # Delay between retry attempts
  
  local:                      # Used when mode = "local"
    log_output: "/var/log/baadal/events.log"  # Local file path

Mode Selection:

remote: Recommended for production - sends to central receiver
local: For testing or standalone logging

Remote Transport Tips:

Use batch_size: 1 and flush_interval: "1s" for real-time alerting
Use batch_size: 50 and flush_interval: "30s" to reduce network traffic
Enable auth.enabled: true for production deployments

Endpoint Examples:

endpoint: "http://192.168.1.100:5170/ingest"      # Local network
endpoint: "http://100.x.x.x:5170/ingest"          # Tailscale
endpoint: "https://monitor.example.com/ingest"    # Public (use HTTPS + auth!)

3. Disk Monitoring

disk:
  enabled: true               # Enable disk usage monitoring
  paths:                      # Directories to scan
    - /                       # Root filesystem
    - /var                    # Common log location
    - /var/lib/docker         # Docker data
    - /home                   # User directories
    - /tmp                    # Temporary files
    - /opt                    # Optional software
  
  top_n: 5                    # Report top N largest directories
  max_depth: 3                # Maximum directory depth to scan
                               # Higher = more detail but slower
  
  schedule: "*/5 * * * *"     # Cron schedule for scans
                               # Examples:
                               #   "*/15 * * * *"  - Every 15 minutes
                               #   "0 */2 * * *"   - Every 2 hours
                               #   "0 3 * * *"     - Daily at 3 AM
  
  alert:
    enabled: true              # Enable alerting for disk usage
    condition: "size_gb > 20"  # Alert when directory exceeds 20 GB
                               # Can adjust threshold as needed
    webhooks:                  # Webhook names to fire (from webhooks section)
      - discord
      - custom-api

Path Selection Tips:

Start with top-level paths (/, /var, /home)
Add Docker path if running containers: /var/lib/docker
Add application-specific paths: /opt/myapp, /data
Avoid network mounts (slow scans)

Performance Tuning:

# Fast scan (less detail, every 30 min)
max_depth: 2
schedule: "*/30 * * * *"

# Detailed scan (more detail, hourly)
max_depth: 4
schedule: "0 * * * *"

# Daily deep scan
max_depth: 5
schedule: "0 2 * * *"  # 2 AM daily

Alert Threshold Examples:

condition: "size_gb > 10"    # Alert at 10 GB
condition: "size_gb > 50"    # Alert at 50 GB
condition: "size_gb > 100"   # Alert at 100 GB
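
Conditions such as "size_gb > 20" follow a "field operator value" shape. As an illustration of how such a string could be evaluated, here is a minimal Go sketch. The `evalNumeric` function and the set of operators it accepts are assumptions for the example; only ">" appears in the configuration above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// evalNumeric evaluates a condition of the form "<field> <op> <number>"
// against the given field values.
func evalNumeric(condition string, values map[string]float64) (bool, error) {
	parts := strings.Fields(condition)
	if len(parts) != 3 {
		return false, fmt.Errorf("expected \"field op value\", got %q", condition)
	}
	actual, ok := values[parts[0]]
	if !ok {
		return false, fmt.Errorf("unknown field %q", parts[0])
	}
	threshold, err := strconv.ParseFloat(parts[2], 64)
	if err != nil {
		return false, fmt.Errorf("bad threshold %q: %w", parts[2], err)
	}
	switch parts[1] {
	case ">":
		return actual > threshold, nil
	case ">=":
		return actual >= threshold, nil
	case "<":
		return actual < threshold, nil
	case "<=":
		return actual <= threshold, nil
	default:
		return false, fmt.Errorf("unsupported operator %q", parts[1])
	}
}

func main() {
	dir := map[string]float64{"size_gb": 45.3}
	for _, cond := range []string{"size_gb > 20", "size_gb > 50"} {
		fire, err := evalNumeric(cond, dir)
		fmt.Printf("%-14s alert=%v err=%v\n", cond, fire, err)
	}
}
```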

4. Dmesg/Kernel Monitoring

dmesg:
  enabled: true               # Enable kernel message monitoring
  schedule: "*/1 * * * *"     # Check every minute (recommended)
  
  filter_levels:              # Kernel log levels to monitor
    - err                     # Error conditions
    - crit                    # Critical conditions
    - alert                   # Action must be taken immediately
    - emerg                   # System is unusable
  
  # Available but not recommended for production:
  #   - warn     # Warning conditions (very noisy)
  #   - notice   # Normal but significant
  #   - info     # Informational
  #   - debug    # Debug-level messages
  
  alert:
    enabled: true              # Enable alerting
    condition: "level == crit" # Alert on critical messages
                               # Options: emerg, alert, crit, err
    webhooks:
      - discord

Filter Level Guide:

| Level | Severity | Typical Issues | Recommended |
|-------|----------|----------------|-------------|
| `emerg` | Emergency | Kernel panic, system crash | ✅ Yes |
| `alert` | Alert | Hardware failure, critical bug | ✅ Yes |
| `crit` | Critical | Hard disk errors, memory issues | ✅ Yes |
| `err` | Error | Driver errors, non-critical failures | ✅ Yes |
| `warn` | Warning | Deprecation notices, soft errors | ⚠️ Noisy |
| `notice` | Notice | Normal but significant | ❌ Too noisy |
| `info` | Info | General information | ❌ Too noisy |
| `debug` | Debug | Debug messages | ❌ Development only |

Alert Condition Examples:

condition: "level == crit"   # Only critical errors
condition: "level == alert"  # Only alert-level messages
condition: "level == emerg"  # Only emergency (kernel panic)

Best Practices:

Run every 1-2 minutes for timely kernel error detection
Filter to err, crit, alert, emerg only
Don't include warn - generates too many false positives

5. Docker Monitoring

docker:
  enabled: true               # Enable Docker log monitoring
  containers:                 # Which containers to monitor
    - "all"                   # Monitor all running containers
    # OR specify by name:
    # - "nginx"
    # - "postgres"
    # - "redis"
  
  tail_lines: 100             # Number of recent log lines to check
                               # Higher = more coverage but slower
  
  schedule: "*/2 * * * *"     # Check every 2 minutes
  
  filter_keywords:            # Keywords to search for in logs
    - "error"                 # Generic errors
    - "fatal"                 # Fatal errors
    - "panic"                 # Panics (Go runtime, kernel)
    - "OOM"                   # Out of memory
    - "killed"                # Process killed
    - "segfault"              # Segmentation fault
    - "exception"             # Exceptions (add if using Python/Java)
  
  alert:
    enabled: true
    condition: "keyword == fatal"  # Alert on "fatal" keyword matches
    webhooks:
      - discord
      - custom-api

Container Selection:

# Monitor all containers
containers: ["all"]

# Monitor specific containers
containers:
  - "nginx"
  - "postgres-primary"
  - "redis"

# Narrow by log content instead of name (use "all" and rely on filter_keywords)
containers: ["all"]
filter_keywords: ["error", "fatal"]

Keyword Selection by Stack:

Node.js/JavaScript:

filter_keywords:
  - "error"
  - "fatal"
  - "uncaughtException"
  - "unhandledRejection"
  - "ECONNREFUSED"
  - "ETIMEDOUT"

Python:

filter_keywords:
  - "error"
  - "fatal"
  - "exception"
  - "traceback"
  - "critical"

Java/Spring:

filter_keywords:
  - "error"
  - "exception"
  - "OutOfMemoryError"
  - "StackOverflowError"
  - "SQLException"

Go:

filter_keywords:
  - "error"
  - "fatal"
  - "panic"
  - "deadlock"

Database (Postgres/MySQL):

filter_keywords:
  - "error"
  - "fatal"
  - "panic"
  - "deadlock"
  - "connection refused"

Alert Condition Examples:

condition: "keyword == fatal"     # Only fatal errors
condition: "keyword == panic"     # Only panics
condition: "keyword == OOM"       # Only out of memory

Performance Tips:

# Frequent checks (every minute)
tail_lines: 50
schedule: "*/1 * * * *"

# Balanced (every 2 minutes)
tail_lines: 100
schedule: "*/2 * * * *"

# Less frequent but thorough (every 5 minutes)
tail_lines: 500
schedule: "*/5 * * * *"
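
Keyword filtering amounts to scanning each tailed log line for the configured strings. The Go sketch below shows one possible matcher. It assumes matching is case-insensitive (the example event later in this README matches "fatal" against a line starting with "Fatal") and that the first keyword in list order wins; neither detail is confirmed by the configuration reference, and `matchKeyword` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// matchKeyword returns the first keyword (in list order) that occurs in
// line, ignoring case. The boolean is false when nothing matches.
func matchKeyword(line string, keywords []string) (string, bool) {
	lower := strings.ToLower(line)
	for _, kw := range keywords {
		if strings.Contains(lower, strings.ToLower(kw)) {
			return kw, true
		}
	}
	return "", false
}

func main() {
	keywords := []string{"fatal", "panic", "OOM", "segfault"}
	lines := []string{
		"Fatal error: Cannot connect to database",
		"GET /health 200 3ms",
		"Out of memory: oom-kill process 4242",
	}
	for _, line := range lines {
		kw, ok := matchKeyword(line, keywords)
		fmt.Printf("matched=%v keyword=%q line=%q\n", ok, kw, line)
	}
}
```

Because this is plain substring matching, a broad keyword such as "error" will also match lines like "0 errors found"; narrower keywords keep alert volume down.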

6. Webhooks

Baadal supports multiple webhook destinations for alerts. Each webhook can have custom templates.

Discord Webhook

webhooks:
  - name: "discord"
    enabled: true
    url: "https://discord.com/api/webhooks/YOUR_WEBHOOK_ID/YOUR_WEBHOOK_TOKEN"
    method: POST
    headers:
      Content-Type: "application/json"
    payload_template: |
      {
        "content": "🚨 {{.Title}}",
        "embeds": [{
          "description": "{{.Message}}",
          "fields": [
            {"name": "Host",     "value": "{{.Hostname}}",  "inline": true},
            {"name": "Type",     "value": "{{.Type}}",      "inline": true},
            {"name": "Severity", "value": "{{.Severity}}",  "inline": true},
            {"name": "Time",     "value": "{{.Timestamp}}", "inline": true}
          ],
          "color": 16711680
        }]
      }

How to Get Discord Webhook URL:

Open Discord server → Server Settings → Integrations
Click "Webhooks" → "New Webhook"
Choose channel, copy webhook URL
Paste in config.yml

Discord Color Codes:

"color": 16711680   # Red (critical)
"color": 16776960   # Yellow (warning)
"color": 65280      # Green (info)
"color": 3447003    # Blue (info)

Slack Webhook

  - name: "slack"
    enabled: true
    url: "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"
    method: POST
    headers:
      Content-Type: "application/json"
    payload_template: |
      {
        "text": "🚨 *{{.Title}}*",
        "blocks": [
          {
            "type": "section",
            "text": {
              "type": "mrkdwn",
              "text": "*{{.Title}}*\n{{.Message}}"
            }
          },
          {
            "type": "section",
            "fields": [
              {"type": "mrkdwn", "text": "*Host:*\n{{.Hostname}}"},
              {"type": "mrkdwn", "text": "*Type:*\n{{.Type}}"},
              {"type": "mrkdwn", "text": "*Severity:*\n{{.Severity}}"},
              {"type": "mrkdwn", "text": "*Time:*\n{{.Timestamp}}"}
            ]
          }
        ]
      }

How to Get Slack Webhook URL:

Visit https://api.slack.com/apps
Create New App → "From scratch"
Enable "Incoming Webhooks"
Add New Webhook to Workspace
Copy webhook URL

Custom API Webhook

  - name: "custom-api"
    enabled: true
    url: "https://your-api.com/alerts"
    method: POST
    headers:
      Authorization: "Bearer YOUR_API_TOKEN"
      Content-Type: "application/json"
      X-Custom-Header: "baadal-alerts"
    payload_template: |
      {
        "alert": "{{.Title}}",
        "host": "{{.Hostname}}",
        "timestamp": "{{.Timestamp}}",
        "type": "{{.Type}}",
        "severity": "{{.Severity}}",
        "message": "{{.Message}}",
        "data": {{.Data}}
      }

PagerDuty Webhook

  - name: "pagerduty"
    enabled: true
    url: "https://events.pagerduty.com/v2/enqueue"
    method: POST
    headers:
      Content-Type: "application/json"
    payload_template: |
      {
        "routing_key": "YOUR_INTEGRATION_KEY",
        "event_action": "trigger",
        "payload": {
          "summary": "{{.Title}}",
          "severity": "critical",
          "source": "{{.Hostname}}",
          "custom_details": {
            "message": "{{.Message}}",
            "type": "{{.Type}}",
            "timestamp": "{{.Timestamp}}"
          }
        }
      }

Email via SendGrid/Mailgun

  - name: "email"
    enabled: true
    url: "https://api.sendgrid.com/v3/mail/send"
    method: POST
    headers:
      Authorization: "Bearer YOUR_SENDGRID_API_KEY"
      Content-Type: "application/json"
    payload_template: |
      {
        "personalizations": [{
          "to": [{"email": "alerts@example.com"}]
        }],
        "from": {"email": "baadal@example.com"},
        "subject": "{{.Title}}",
        "content": [{
          "type": "text/plain",
          "value": "{{.Message}}\n\nHost: {{.Hostname}}\nTime: {{.Timestamp}}"
        }]
      }

Template Variables:

{{.Title}} - Alert title
{{.Message}} - Alert message
{{.Hostname}} - Source hostname
{{.Host}} - Same as Hostname
{{.Timestamp}} - ISO 8601 timestamp (IST, UTC+05:30)
{{.Type}} - Event type (disk_usage, dmesg, docker_log, etc.)
{{.Severity}} - Alert severity (critical, warning, info)
{{.Data}} - Raw JSON event data
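
The {{.Field}} placeholders use the same syntax as Go's text/template package. The sketch below assumes that package is what renders payload_template (this README does not say so explicitly) and uses a hypothetical `Alert` struct with a subset of the variables listed above. It also checks that the rendered payload is valid JSON, since text/template does not escape quotes inside values.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"text/template"
)

// Alert holds the values substituted into a payload template.
// Hypothetical struct; field names mirror the template variables.
type Alert struct {
	Title, Message, Hostname, Type, Severity, Timestamp string
}

// render fills tmpl with the alert's fields and verifies the result is JSON.
func render(tmpl string, a Alert) (string, error) {
	t, err := template.New("payload").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, a); err != nil {
		return "", err
	}
	if !json.Valid(buf.Bytes()) {
		return "", errors.New("rendered payload is not valid JSON")
	}
	return buf.String(), nil
}

func main() {
	tmpl := `{"alert": "{{.Title}}", "host": "{{.Hostname}}", "severity": "{{.Severity}}"}`
	out, err := render(tmpl, Alert{
		Title:    "Disk usage alert",
		Hostname: "web-server-01",
		Severity: "critical",
	})
	fmt.Println(out, err)

	// A value containing a double quote produces invalid JSON with text/template.
	_, err = render(tmpl, Alert{Title: `log said "fatal"`})
	fmt.Println("quoted title:", err)
}
```

If log lines with quotes or newlines can end up in {{.Message}}, it is worth testing a template against such input before relying on it for alerts.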

7. Receiver Configuration

The receiver runs on your central monitoring server and accepts events from all collectors.

receiver:
  enabled: true               # Enable receiver mode
  port: 5170                  # Port to listen on
  
  auth:
    enabled: false            # Enable authentication
    token: "your-secret-token"  # Must match collector tokens
                               # IMPORTANT: Enable in production!
  
  log_output: "/var/log/baadal/events.log"  # Where to write events
  
  log_rotation:               # Automatic log rotation (via lumberjack)
    max_size_mb: 50           # Rotate after 50 MB
    max_backups: 3            # Keep 3 old log files
    max_age_days: 7           # Delete logs older than 7 days
    compress: true            # Gzip old log files
  
  dead_mans_switch:           # Detect missing collectors
    enabled: true
    timeout_minutes: 10       # Alert if no events for 10 minutes
    check_interval: "*/2 * * * *"  # Check every 2 minutes
    webhooks:
      - discord
  
  promtail:                   # Optional: Forward to Loki
    enabled: false
    endpoint: "http://localhost:9080/loki/api/v1/push"

Log Rotation Examples:

High-frequency monitoring (lots of events):

log_rotation:
  max_size_mb: 100      # Larger files
  max_backups: 7        # Keep more history
  max_age_days: 14      # 2 weeks retention
  compress: true

Low-frequency monitoring:

log_rotation:
  max_size_mb: 20       # Smaller files
  max_backups: 3        # Less history
  max_age_days: 7       # 1 week retention
  compress: true

Dead Man's Switch:

Monitors when each collector last sent events
Fires webhook alert if collector goes silent
Helps detect crashed collectors or network issues

Dead Man's Switch Examples:

# Tight monitoring (5 min timeout)
timeout_minutes: 5
check_interval: "*/1 * * * *"

# Relaxed monitoring (30 min timeout)
timeout_minutes: 30
check_interval: "*/10 * * * *"

# Daily check (for non-critical servers)
timeout_minutes: 1440  # 24 hours
check_interval: "0 * * * *"  # Hourly check

Authentication:

# Development (no auth)
auth:
  enabled: false

# Production (required!)
auth:
  enabled: true
  token: "use-a-long-random-string-here"
  # Generate token: openssl rand -base64 32

8. Node Identity

Configure how this server identifies itself in events:

node:
  hostname: ""                # Empty = auto-detect from OS
                               # Or set manually: "web-server-01"
  
  environment: "production"   # Environment label
                               # Options: production, staging, development, test
  
  tags:                       # Custom tags for filtering/grouping
    - "ubuntu"
    - "backend"
    - "api-server"
    - "us-east-1"

Hostname Examples:

hostname: ""                  # Auto-detect (recommended)
hostname: "web-server-01"     # Manual override
hostname: "db-primary"        # For databases
hostname: "worker-03"         # For worker nodes

Environment Best Practices:

environment: "production"     # Live production servers
environment: "staging"        # Staging/QA environment
environment: "development"    # Dev servers
environment: "test"           # CI/CD test runners

Tag Examples:

By Role:

tags: ["web-server", "nginx", "frontend"]
tags: ["database", "postgres", "primary"]
tags: ["worker", "celery", "background-jobs"]

By Location:

tags: ["aws", "us-east-1", "production"]
tags: ["on-premise", "datacenter-1"]
tags: ["cloud", "digitalocean", "sgp1"]

By Stack:

tags: ["nodejs", "express", "api"]
tags: ["python", "django", "web"]
tags: ["go", "microservice"]

🔧 Complete Configuration Example

Here's a production-oriented collector config.yml covering every section. Replace the placeholder endpoint, token, and webhook URL before deploying:

# ─────────────────────────────────────────────
#  Baadal — Production Configuration
# ─────────────────────────────────────────────

app:
  name: "baadal"
  enabled: true
  log_level: "info"
  
  heartbeat:
    enabled: true
    interval: "*/5 * * * *"
  
  deduplication:
    enabled: true
    window_seconds: 60

transport:
  mode: "remote"
  
  remote:
    endpoint: "http://100.64.1.100:5170/ingest"  # Replace with your receiver IP
    auth:
      enabled: true
      token: "your-generated-secret-token-here"  # Generate: openssl rand -base64 32
    batch_size: 10
    flush_interval: "5s"
    retry_attempts: 3
    retry_delay: "2s"

disk:
  enabled: true
  paths:
    - /
    - /var
    - /var/lib/docker
    - /home
    - /opt
  top_n: 5
  max_depth: 3
  schedule: "*/10 * * * *"
  alert:
    enabled: true
    condition: "size_gb > 50"
    webhooks:
      - discord

dmesg:
  enabled: true
  schedule: "*/1 * * * *"
  filter_levels:
    - err
    - crit
    - alert
    - emerg
  alert:
    enabled: true
    condition: "level == crit"
    webhooks:
      - discord

docker:
  enabled: true
  containers:
    - "all"
  tail_lines: 100
  schedule: "*/2 * * * *"
  filter_keywords:
    - "error"
    - "fatal"
    - "panic"
    - "OOM"
    - "killed"
    - "segfault"
  alert:
    enabled: true
    condition: "keyword == fatal"
    webhooks:
      - discord

webhooks:
  - name: "discord"
    enabled: true
    url: "https://discord.com/api/webhooks/YOUR_WEBHOOK_ID/YOUR_TOKEN"
    method: POST
    headers:
      Content-Type: "application/json"
    payload_template: |
      {
        "content": "🚨 {{.Title}}",
        "embeds": [{
          "description": "{{.Message}}",
          "fields": [
            {"name": "Host",     "value": "{{.Hostname}}",  "inline": true},
            {"name": "Type",     "value": "{{.Type}}",      "inline": true},
            {"name": "Severity", "value": "{{.Severity}}",  "inline": true},
            {"name": "Time",     "value": "{{.Timestamp}}", "inline": true}
          ],
          "color": 16711680
        }]
      }

receiver:
  enabled: false  # Set to true only on receiver server
  port: 5170
  auth:
    enabled: true
    token: "your-generated-secret-token-here"  # Must match collector token
  log_output: "/var/log/baadal/events.log"
  log_rotation:
    max_size_mb: 50
    max_backups: 3
    max_age_days: 7
    compress: true
  dead_mans_switch:
    enabled: true
    timeout_minutes: 10
    check_interval: "*/2 * * * *"
    webhooks:
      - discord

node:
  hostname: ""  # Auto-detect
  environment: "production"
  tags:
    - "ubuntu"
    - "web-server"
    - "backend"

🏃 Usage

Collector Mode (on monitored servers)

# Run directly
./baadal --mode=collector --config=config.yml

# Run in background
nohup ./baadal --mode=collector --config=config.yml > /dev/null 2>&1 &

# Check it's running
pgrep -a baadal

Receiver Mode (on central monitoring server)

# Run directly
./baadal --mode=receiver --config=config.yml

# Run in background
nohup ./baadal --mode=receiver --config=config.yml > /dev/null 2>&1 &

Systemd Installation (Recommended)

# On collector servers
sudo bash install.sh collector

# On receiver server
sudo bash install.sh receiver

# Service management
sudo systemctl start baadal
sudo systemctl enable baadal    # Start on boot
sudo systemctl status baadal
sudo journalctl -u baadal -f    # Follow logs

# Reload configuration without restart
sudo systemctl kill -s HUP baadal

# Restart service
sudo systemctl restart baadal

# Stop service
sudo systemctl stop baadal

Configuration Hot Reload

Baadal supports reloading configuration without restart:

# If running via systemd
sudo systemctl kill -s HUP baadal

# If running manually (get PID first)
pgrep -a baadal
kill -HUP <PID>

What gets reloaded:

✅ Schedule intervals
✅ Alert thresholds
✅ Webhook configurations
✅ Filter keywords
✅ Paths to monitor
❌ Mode (collector/receiver) - requires restart

🐳 Docker

Using Pre-built Image

# Pull from GitHub Container Registry
docker pull ghcr.io/YOUR_USERNAME/baadhal:latest

# Run collector
docker run -d \
  --name baadal-collector \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  -v "$(pwd)/config.yml":/app/config.yml:ro \
  -e MODE=collector \
  --restart unless-stopped \
  ghcr.io/YOUR_USERNAME/baadhal:latest

# Run receiver
docker run -d \
  --name baadal-receiver \
  -p 5170:5170 \
  -v "$(pwd)/config.yml":/app/config.yml:ro \
  -v baadal-logs:/var/log/baadal \
  -e MODE=receiver \
  --restart unless-stopped \
  ghcr.io/YOUR_USERNAME/baadhal:latest

# View logs
docker logs -f baadal-collector
docker logs -f baadal-receiver

# Reload config
docker kill -s HUP baadal-collector

Docker Compose

version: '3.8'

services:
  # Receiver (central monitoring server)
  baadal-receiver:
    image: ghcr.io/YOUR_USERNAME/baadhal:latest
    container_name: baadal-receiver
    ports:
      - "5170:5170"
    volumes:
      - ./config.yml:/app/config.yml:ro
      - baadal-logs:/var/log/baadal
    environment:
      - MODE=receiver
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "wget", "-q", "--spider", "http://localhost:5170/health"]
      interval: 30s
      timeout: 3s
      retries: 3

  # Collector (on same server or different server)
  baadal-collector:
    image: ghcr.io/YOUR_USERNAME/baadhal:latest
    container_name: baadal-collector
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./config.yml:/app/config.yml:ro
    environment:
      - MODE=collector
    restart: unless-stopped

volumes:
  baadal-logs:

📊 Event Types

| Type | Description | Data Fields | Triggers |
|------|-------------|-------------|----------|
| `disk_usage` | Directory size monitoring | `top_dirs`, `scan_root`, `total_scanned_gb` | When scanned |
| `dmesg` | Kernel message monitoring | `level`, `message`, `kernel_ts` | On new kernel messages |
| `docker_log` | Container log monitoring | `container`, `line`, `matched_keyword` | On keyword match |
| `heartbeat` | Periodic health signal | `status`, `uptime_seconds` | On schedule |
| `lifecycle` | Start/stop events | `event`, `mode`, `version`, `uptime_seconds` | On start/stop |
| `collector_stats` | Performance metrics | `disk_scan_ms`, `events_sent`, `events_deduped`, `cycle_total_ms` | Every 5 min |
| `dead_mans_switch` | Missing host alert | N/A (webhook only) | Receiver detects silence |

Example Event JSON

Disk Usage Event:

{
  "timestamp": "2026-03-06T18:45:00+05:30",
  "type": "disk_usage",
  "host": "web-server-01",
  "environment": "production",
  "tags": ["ubuntu", "backend"],
  "data": {
    "top_dirs": [
      {
        "path": "/var/lib/docker",
        "size_gb": 45.3,
        "size_mb": 46387,
        "rank": 1
      }
    ],
    "scan_root": "/var",
    "total_scanned_gb": 67.8
  },
  "alert_triggered": true,
  "alert_condition": "size_gb > 20"
}

Docker Log Event:

{
  "timestamp": "2026-03-06T18:50:12+05:30",
  "type": "docker_log",
  "host": "api-server-02",
  "environment": "production",
  "tags": ["nodejs", "api"],
  "data": {
    "container": "api-backend",
    "line": "Fatal error: Cannot connect to database",
    "matched_keyword": "fatal"
  },
  "alert_triggered": true,
  "alert_condition": "keyword == fatal"
}

Heartbeat Event:

{
  "timestamp": "2026-03-06T18:55:00+05:30",
  "type": "heartbeat",
  "host": "worker-01",
  "environment": "production",
  "tags": ["worker", "celery"],
  "data": {
    "status": "alive",
    "uptime_seconds": 86400
  },
  "alert_triggered": false
}
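
If you consume events.log from your own tooling, the JSON above maps naturally onto a Go struct. The sketch below is an assumption based only on the example events shown here: the `Event` and `HeartbeatData` types are hypothetical and may not match Baadal's internal definitions. The `data` field is kept as raw JSON because its shape depends on the event type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Event mirrors the common envelope seen in the example events.
type Event struct {
	Timestamp      string          `json:"timestamp"`
	Type           string          `json:"type"`
	Host           string          `json:"host"`
	Environment    string          `json:"environment"`
	Tags           []string        `json:"tags"`
	Data           json.RawMessage `json:"data"` // decoded per event type
	AlertTriggered bool            `json:"alert_triggered"`
	AlertCondition string          `json:"alert_condition,omitempty"`
}

// HeartbeatData is the payload of a "heartbeat" event.
type HeartbeatData struct {
	Status        string `json:"status"`
	UptimeSeconds int64  `json:"uptime_seconds"`
}

// parseHeartbeat decodes one JSON event and its heartbeat payload.
func parseHeartbeat(raw []byte) (Event, HeartbeatData, error) {
	var ev Event
	var hb HeartbeatData
	if err := json.Unmarshal(raw, &ev); err != nil {
		return ev, hb, err
	}
	if ev.Type != "heartbeat" {
		return ev, hb, fmt.Errorf("expected heartbeat, got %q", ev.Type)
	}
	if err := json.Unmarshal(ev.Data, &hb); err != nil {
		return ev, hb, err
	}
	return ev, hb, nil
}

func main() {
	raw := []byte(`{
	  "timestamp": "2026-03-06T18:55:00+05:30",
	  "type": "heartbeat",
	  "host": "worker-01",
	  "environment": "production",
	  "tags": ["worker", "celery"],
	  "data": {"status": "alive", "uptime_seconds": 86400},
	  "alert_triggered": false
	}`)

	ev, hb, err := parseHeartbeat(raw)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%s on %s: %s, up %d s\n", ev.Type, ev.Host, hb.Status, hb.UptimeSeconds)
}
```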

🔧 Architecture

┌─────────────────────────────────────────────────────────────────┐
│                      Monitored Servers                          │
│                                                                 │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐         │
│  │ Collector #1 │  │ Collector #2 │  │ Collector #N │         │
│  │              │  │              │  │              │         │
│  │ - Disk scan  │  │ - Disk scan  │  │ - Disk scan  │         │
│  │ - Dmesg      │  │ - Dmesg      │  │ - Dmesg      │         │
│  │ - Docker     │  │ - Docker     │  │ - Docker     │         │
│  │ - Heartbeat  │  │ - Heartbeat  │  │ - Heartbeat  │         │
│  │ - Dedup      │  │ - Dedup      │  │ - Dedup      │         │
│  └───────┬──────┘  └───────┬──────┘  └───────┬──────┘         │
│          │                 │                  │                │
└──────────┼─────────────────┼──────────────────┼────────────────┘
           │                 │                  │
           │   Batched       │   Batched        │   Batched
           │   Events        │   Events         │   Events
           │   (HTTP/JSON)   │   (HTTP/JSON)    │   (HTTP/JSON)
           │                 │                  │
           └─────────────────┼──────────────────┘
                             │
                             ▼
              ┌──────────────────────────────┐
              │    Central Receiver Server   │
              │                              │
              │  - HTTP endpoint (port 5170) │
              │  - Authentication            │
              │  - Log rotation (lumberjack) │
              │  - Dead man's switch         │
              │  - Webhook dispatcher        │
              └───────────┬──────────────────┘
                          │
                          ├─────────────┬──────────────┐
                          ▼             ▼              ▼
                   ┌──────────┐  ┌──────────┐  ┌──────────┐
                   │ Discord  │  │  Slack   │  │ Custom   │
                   │ Webhook  │  │ Webhook  │  │   API    │
                   └──────────┘  └──────────┘  └──────────┘

Data Flow

Collection: Collectors run scheduled jobs (cron)
Deduplication: Events checked against recent history
Batching: Events accumulated until batch_size or flush_interval
Transport: HTTP POST to receiver with optional auth
Logging: Receiver writes to rotated log file
Alerting: Matching conditions trigger webhooks
Dead Man's Switch: Receiver monitors for missing collectors

Performance Characteristics

Collector (per server):

CPU: < 1% average
Memory: ~15-20 MB
Disk I/O: Minimal (reads only during scans)
Network: < 1 KB/min average

Receiver:

CPU: < 1% average
Memory: ~20-30 MB + log buffer
Disk I/O: Sequential writes only
Network: Depends on number of collectors

🛠️ Development

Requirements

Go 1.24+
Docker (for container log monitoring)

Run Tests

go test ./...
go vet ./...
gofmt -s -l .

Contributing

Fork the repository
Create your feature branch (git checkout -b feature/amazing)
Commit your changes (git commit -am 'Add amazing feature')
Push to the branch (git push origin feature/amazing)
Open a Pull Request

📝 License

MIT License - see LICENSE file for details.

🙏 Acknowledgments

Built with:

cron — Cron scheduler
lumberjack — Log rotation
Docker SDK — Container monitoring
yaml.v3 — YAML parsing

📞 Support

Made with ☁️ by the Baadal Team
