comakingspace/ScannerPi

Pi-Scanner

A Raspberry Pi-based automated document scanning solution that uploads scanned documents directly to Nextcloud. This project provides an Ansible playbook to automatically configure a Raspberry Pi as a network-connected scanner with automatic upload capabilities.

Features

  • Automatic document scanning with SANE
  • OCR processing using Tesseract
  • Automatic upload to Nextcloud
  • Runs as system services for reliability
  • Minimal configuration needed after setup

Prerequisites

  • Raspberry Pi (any model) with Raspberry Pi OS installed
  • Scanner compatible with SANE (check SANE supported devices)
  • Nextcloud instance with write access to a specific folder
  • Ansible installed on your control machine
  • SSH access to the Raspberry Pi

Quick Start

  1. Clone this repository:

    git clone https://github.com/comakingspace/ScannerPi.git
    cd ScannerPi

  2. Set up SSH key access to your Raspberry Pi:

    ssh-copy-id pi@<your-pi-ip-address>

  3. Create your configuration file and fill in your values:

    cp Ansible/config.example.yml Ansible/config.yml
    $EDITOR Ansible/config.yml

    config.yml holds:

    • cloud_url — your Nextcloud server base URL (see "Finding your share token" below)
    • cloud_user — the share token, not a username (see below)
    • cloud_pass — the password set on the share
    • user, datafolder — optional overrides (default to pi and /home/pi/scan-data)

    It is gitignored because it contains credentials in plain text.

    Finding your share token

    Scans are uploaded to a public File Drop share rather than a personal account, so the "user" is actually the share token taken from the share link.

    1. In Nextcloud, create a folder and share it as a link with "Allow upload and editing" (a File Drop / upload-only share is fine).

    2. Set a password on the share.

    3. Copy the share link — it looks like:

      https://yourserver.de/s/HWyoGEkKRwBY2xK
      └──── cloud_url ────┘   └─── token ───┘
      
    4. Fill config.yml accordingly:

      Field       Value from the example link
      cloud_url   https://yourserver.de (before /s/)
      cloud_user  HWyoGEkKRwBY2xK (the token after /s/)
      cloud_pass  the password you set on the share

    Uploads use the public WebDAV endpoint <cloud_url>/public.php/dav/files/<token>/<filename>.
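
    The split described above can be sketched in plain shell. This is only an illustration, not part of the playbook: the function names are made up here, and it assumes the share link has the form <base>/s/<token> with nothing after the token.

```shell
#!/bin/sh
# Sketch: derive cloud_url, cloud_user and the upload URL from a share link.
# Function names are hypothetical; they do not exist in this repository.

share_base_url() {
    # Remove the last "/s/<token>" part, leaving the server base URL.
    printf '%s\n' "${1%/s/*}"
}

share_token() {
    # Keep what follows the last "/s/", minus an optional trailing slash.
    token="${1##*/s/}"
    printf '%s\n' "${token%/}"
}

upload_url() {
    # $1 = cloud_url, $2 = share token (cloud_user), $3 = file name
    printf '%s/public.php/dav/files/%s/%s\n' "$1" "$2" "$3"
}
```

    With the example link, share_base_url prints https://yourserver.de and share_token prints HWyoGEkKRwBY2xK. Passing both plus scan.pdf to upload_url prints https://yourserver.de/public.php/dav/files/HWyoGEkKRwBY2xK/scan.pdf.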

  4. Run the Ansible playbook:

    ansible-playbook -i <your-pi-ip-address>, -u pi Ansible/playbook_setup_scanner-Pi.yml

    To use a config file in another location:

    ansible-playbook -i <your-pi-ip-address>, -u pi \
      -e config_file=/path/to/config.yml Ansible/playbook_setup_scanner-Pi.yml
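
    Before running the playbook, it may help to confirm that the three required values from step 3 are actually set. The check below is a rough sketch, not something shipped with the repository: check_config is a hypothetical helper, and it assumes the keys appear as unindented top-level YAML entries on a single line each.

```shell
#!/bin/sh
# Sketch: verify that config.yml defines the required keys.
# Assumes "key: value" at the start of a line (top-level YAML keys).

check_config() {
    # $1 = path to config.yml
    status=0
    for key in cloud_url cloud_user cloud_pass; do
        # The key must be followed by a non-empty, non-comment value.
        if ! grep -Eq "^${key}:[[:space:]]*[^[:space:]#]" "$1"; then
            echo "missing or empty: $key" >&2
            status=1
        fi
    done
    return "$status"
}
```

    Calling check_config Ansible/config.yml returns 0 when all three keys have a value, and otherwise names each missing key on stderr and returns 1.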

What to Expect After Setup

  1. The Raspberry Pi will be configured with the hostname "ScannerPi"

  2. Two services will be running:

    • scand: Monitors for new documents and handles scanning
    • uploadd: Handles uploading scanned documents to Nextcloud

  3. Scanned documents will be:

    • Automatically deskewed and compressed to grayscale JPG
    • Converted to searchable PDFs (with OCR)
    • Uploaded to your specified Nextcloud folder

Troubleshooting

  1. Check service status:

    sudo systemctl status scand
    sudo systemctl status uploadd

  2. View logs:

    journalctl -u scand
    journalctl -u uploadd

  3. Common issues:

    • If scanning fails, ensure your scanner is properly connected and recognized by SANE
    • If uploads fail, verify your Nextcloud credentials and connectivity
    • Check permissions if files aren't being created or uploaded

Canon imageFORMULA scanner not detected (AutoStart / mass-storage mode)

Canon imageFORMULA scanners (e.g. P-208II) ship with an AutoStart / CaptureOnTouch feature that, when enabled, makes the scanner boot as a USB Mass Storage device (a virtual installer CD) instead of a scanner. In that state lsusb shows the device but scanimage -L finds nothing.

Diagnose by checking the USB product ID:

    lsusb | grep -i canon

  • 1083:1660 → AutoStart on, presenting as mass storage — SANE cannot use it
  • 1083:165f → AutoStart off, presenting as a scanner — works with the canon_dr backend
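
The same check can be scripted. The helper below is a small sketch, with canon_mode being a hypothetical name; it only knows the two product IDs listed above, so other imageFORMULA models may report different IDs.

```shell
#!/bin/sh
# Sketch: classify the scanner's USB mode from lsusb output.
# Only the two IDs from this README are recognised.

canon_mode() {
    # $1 = text produced by `lsusb`
    case "$1" in
        *1083:1660*) echo "mass-storage" ;;  # AutoStart on, SANE cannot use it
        *1083:165f*) echo "scanner" ;;       # AutoStart off, usable by canon_dr
        *)           echo "unknown" ;;       # not connected, or a different model
    esac
}
```

On the Pi this would be called as canon_mode "$(lsusb)".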

If you see the mass-storage ID, turn AutoStart off. The setting is stored in the scanner's firmware and is toggled with Canon's CaptureOnTouch utility on Windows; it cannot be changed from the Pi. Once AutoStart is off, the scanner re-enumerates as a normal scanner.

Security Note

The Nextcloud credentials are stored unencrypted on the Raspberry Pi. This is considered acceptable because:

  • The credentials only have access to a specific upload directory
  • The Raspberry Pi should be physically secured and on a trusted network
  • The credentials cannot be used to access other parts of your Nextcloud instance

Technical Details

Scanning Service (scand)

The scanning service runs continuously and handles document scanning and processing:

  • Uses scanadf for ADF (Automatic Document Feeder) scanning in duplex mode
  • Scans at 300 DPI in grayscale
  • Performs automatic deskewing (both software and roller-based)
  • Processing pipeline:
    1. Scans pages to PNG format
    2. Compresses to JPG using ImageMagick (85% quality, grayscale)
    3. Combines all pages into a single PDF using img2pdf
    4. Cleans up temporary PNG/JPG files
  • Implements systemd watchdog for service health monitoring
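
Steps 2 to 4 of the pipeline can be sketched as follows. This is an illustration under assumptions, not the contents of scand.sh: pages_to_pdf is a hypothetical function name, the scan step itself is left out, and the pages are assumed to be PNG files whose names sort in page order.

```shell
#!/bin/sh
# Sketch: compress scanned PNG pages to JPG, combine them into one PDF,
# then delete the temporary images. Requires ImageMagick and img2pdf.

pages_to_pdf() (
    # $1 = directory holding the PNG pages, $2 = output PDF path
    dir=$1
    pdf=$2
    set --                              # reuse "$@" as the list of JPG pages
    for png in "$dir"/*.png; do
        [ -e "$png" ] || continue       # the glob matched nothing
        jpg="${png%.png}.jpg"
        convert "$png" -colorspace Gray -quality 85 "$jpg" || exit 1
        set -- "$@" "$jpg"
    done
    [ "$#" -gt 0 ] || exit 1            # no pages to combine
    img2pdf "$@" -o "$pdf" || exit 1
    for jpg in "$@"; do                 # clean up temporary PNG/JPG files
        rm -f "$jpg" "${jpg%.jpg}.png"
    done
)
```

The function body runs in a subshell, so a failed conversion stops the run and leaves the source images in place for inspection.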

Upload Service (uploadd)

A separate service handles the upload process to Nextcloud:

  • Monitors the data folder for new PDF files
  • Uploads using Nextcloud's WebDAV interface
  • Automatically removes successfully uploaded files
  • Checks for new files every 60 seconds
  • Uses systemd watchdog for service health monitoring
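
One pass of that behaviour might look like the sketch below. It is not the actual uploadd.sh: upload_pending and the CLOUD_URL, CLOUD_USER and CLOUD_PASS variable names are assumptions made for illustration, and file names are assumed to need no URL encoding.

```shell
#!/bin/sh
# Sketch: upload every PDF in a folder via WebDAV and delete each file
# only after curl reports success.
# CLOUD_USER is the share token, CLOUD_PASS the share password.

upload_pending() {
    # $1 = data folder to scan for PDFs
    for pdf in "$1"/*.pdf; do
        [ -e "$pdf" ] || continue       # the glob matched nothing
        url="$CLOUD_URL/public.php/dav/files/$CLOUD_USER/$(basename "$pdf")"
        # --fail makes curl return non-zero on HTTP errors (4xx/5xx).
        if curl --fail --silent --show-error \
                -u "$CLOUD_USER:$CLOUD_PASS" -T "$pdf" "$url"; then
            rm -f "$pdf"
        fi
    done
}

# A service loop could then be:
#   while :; do upload_pending /home/pi/scan-data; sleep 60; done
```

Files that fail to upload stay in the folder and are retried on the next pass.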

File Locations

  • Scanned documents: {{ datafolder }} (default: /home/pi/scan-data)
  • Service scripts: /home/pi/scand.sh and /home/pi/uploadd.sh
  • Service definitions: /etc/systemd/system/scand.service and /etc/systemd/system/uploadd.service

Customization

Advanced users can modify the scanning parameters by editing /home/pi/scand.sh:

  • Resolution (default: 300 DPI)
  • Page height limit (default: 500)
  • Image compression quality (default: 85%)
  • Scanner-specific options via scanadf

Contributing

Feel free to open issues or submit pull requests if you have suggestions for improvements.

License

This project is licensed under the MIT License - see the LICENSE file for details.
