# Albumen — Design Document

## Overview

Albumen is a self-hosted photo and media album. The guiding principle is simplicity: the filesystem *is* the database. Every directory under `MEDIA_ROOT` is an album; every image, video, or audio file inside it is a media item. The only metadata that can't be derived from the filesystem is stored in a small `album.json` sidecar file that lives alongside the media in each directory.

There is no import step, no database to migrate, no ORM. Drop files into a directory, run `update.rb`, and they are immediately browseable.

---

## System Architecture

```
Browser / mobile
      │  HTTPS
      ▼
Apache reverse proxy  (192.168.10.1 / albumen.jots.org)
      │  HTTP, forwards X-Forwarded-Proto
      ▼
Puma application server  (127.0.0.1:4567)
      │  Rack / Sinatra
      ▼
app.rb  ──reads──►  /var/albumen/               (media files + album.json + faces.json sidecars)
        ──reads──►  /opt/albumen/cache/thumbs/  (generated thumbnails)
        ──reads──►  /opt/albumen/config.yml     (password hash + session secret)

face_daemon.rb  ──reads──►  /var/albumen/                (image files)
                ──writes─►  /var/albumen/**/faces.json   (per-directory face sidecar)
```

### Process model

The web app runs as the `albumen` system user under systemd (`config/albumen.service`). Puma is configured for 1 worker with 4–8 threads (`config/puma.rb`). Logs go to `/opt/albumen/log/`. On crash, systemd restarts the process after 5 seconds.

The face detection daemon (`config/face_daemon.service`) runs as a separate systemd service under the same `albumen` user. It polls `MEDIA_ROOT` every `poll_interval` seconds (default 300), processes any images not yet in a `faces.json` sidecar, and writes results atomically. It never touches `album.json`, so there is no write contention with `update.rb`. All process output is written to the shared log at `/opt/albumen/log/albumen.log`.

### Reverse proxy

The proxy (Apache or nginx; a sample nginx config is in `config/nginx-albumen.conf`) terminates HTTPS and forwards plain HTTP to Puma on port 4567.
It sets `X-Forwarded-Proto: https` so Sinatra's redirect helpers produce correct `https://` URLs and `Rack::Protection::HttpOrigin` sees the right scheme. `client_max_body_size` is set to unlimited because large video files may be uploaded via rsync.

---

## Directory Layout

```
/opt/albumen/                  ← application root (code + config)
  app.rb
  config.ru
  Gemfile / Gemfile.lock
  config/
    albumen.service            ← systemd unit file for the web app
    face_daemon.service        ← systemd unit file for the face detection daemon
    puma.rb                    ← Puma config
    nginx-albumen.conf         ← sample reverse-proxy config
  views/
    layout.erb                 ← site chrome (header, nav)
    album.erb                  ← browse page (grid + lightbox)
    slideshow.erb              ← full-screen slideshow page
    admin/
      login.erb
      album.erb                ← per-album edit form
  public/
    css/style.css
    js/album.js                ← lightbox logic
    js/slideshow.js            ← slideshow logic
    img/audio.svg              ← placeholder thumbnail for audio files
  scripts/
    update.rb                  ← post-upload scan/enrich script
    face_daemon.rb             ← face detection daemon (polls for new images, writes faces.json)
    faces.py                   ← dlib CNN face detection helper called by the daemon
    set_password.rb            ← PBKDF2-SHA256 password setter
  cache/thumbs/                ← generated thumbnail cache (mirrored path structure)
  tmp/                         ← Puma pid / state files
  log/                         ← Puma stdout / stderr logs
  config.yml                   ← runtime secrets (not in git)

/var/albumen/                  ← media root (owned by albumen user)
  album.json                   ← root-level sidecar (optional)
  SomeAlbum/
    album.json                 ← per-album metadata sidecar (owned by update.rb)
    faces.json                 ← per-album face data sidecar (owned by face_daemon.rb)
    photo1.jpg
    photo2.jpg
    SubAlbum/
      album.json
      faces.json
      photo3.jpg
```

---

## Dependencies

### Ruby gems

| Gem | Purpose |
|-----|---------|
| `sinatra ~> 4.0` | HTTP routing and ERB rendering |
| `puma ~> 6.4` | Multi-threaded Rack application server |
| `mini_magick ~> 4.12` | ImageMagick wrapper: thumbnail generation, EXIF-aware auto-orient |
| `mini_exiftool ~> 2.10` | Reads EXIF `DateTimeOriginal` / `CreateDate` for chronological sorting |
| `rack-session ~> 2.0` | Cookie-based session support (Sinatra 4 requires it as a separate gem) |

Password hashing uses `OpenSSL::PKCS5.pbkdf2_hmac` from Ruby's standard library (100,000 iterations, SHA-256, 32-byte output). No native gem extension is required.

### System tools

| Tool | Purpose |
|------|---------|
| **ImageMagick** | Backing tool for MiniMagick; must be installed on the server |
| **ExifTool** | Backing tool for MiniExiftool; must be installed on the server |
| **ffmpeg** | Video thumbnail extraction (frame at 2 s) and duration probing via `ffprobe` |

### Optional: facial recognition

| Component | Purpose |
|-----------|---------|
| **Python 3** | Runtime for `scripts/faces.py` |
| **face_recognition** (PyPI) | dlib-backed face detection and 128-D embedding extraction |
| `/opt/albumen/venv/` | Python virtual environment isolating the dependency |

Install (one-time, takes ~30 min to compile dlib on CPU):

```bash
apt install python3-pip python3-dev python3-venv cmake build-essential libopenblas-dev liblapack-dev
python3 -m venv /opt/albumen/venv
/opt/albumen/venv/bin/pip install face_recognition
```

Enable by adding to `config.yml`:

```yaml
faces:
  enabled: true
```

When disabled (or when the venv doesn't exist), `update.rb` simply skips face detection and the rest of the app is unaffected.

---

## Data Model — `album.json`

Each directory may contain an `album.json`. If none exists, defaults are used. The file is written atomically (write to a `.tmp` file, then `File.rename`) so a crash mid-write never corrupts existing data.
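The write-then-rename pattern can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper name `write_album_json` and the explicit `fsync` are assumptions added here, and the rename is only atomic when the `.tmp` file and the target live on the same filesystem.

```ruby
require "json"
require "tmpdir"

# Hypothetical helper (not taken from app.rb or update.rb):
# writes album metadata so readers see either the old or the new file,
# never a half-written one.
def write_album_json(dir, data)
  final_path = File.join(dir, "album.json")
  tmp_path   = "#{final_path}.tmp"

  File.open(tmp_path, "w") do |f|
    f.write(JSON.pretty_generate(data))
    f.flush
    f.fsync # assumption: flush to disk before the rename exposes the file
  end

  # rename replaces the target in one step on the same filesystem
  File.rename(tmp_path, final_path)
  final_path
end

# Usage: write into a throwaway directory and read the result back.
Dir.mktmpdir do |dir|
  write_album_json(dir, { "title" => "Hawaii 2004", "visible" => true })
  puts File.read(File.join(dir, "album.json"))
end
```

Because the temporary file sits next to the target, a crash before the rename leaves at most a stray `album.json.tmp` and the previous `album.json` untouched.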
```json
{
  "title": "Hawaii 2004",
  "description": "Optional free-text shown below the album title.",
  "cover": "dscn0929.jpg",
  "sort_reverse": false,
  "visible": true,
  "files": {
    "dscn0929.jpg": {
      "title": "Arrival at Kona",
      "caption": "Ken at the airport, exhausted.",
      "visible": true,
      "taken_at": "2004-06-15T14:23:00",
      "width": 2048,
      "height": 1536
    }
  }
}
```

**Top-level fields:**

| Field | Default | Meaning |
|-------|---------|---------|
| `title` | directory name | Display name for the album |
| `description` | `null` | Optional paragraph shown under the title |
| `cover` | `null` (auto) | Filename of the cover image, or `"__random__"` |
| `sort_reverse` | `false` | Reverses the order of sub-album cards |
| `visible` | `true` | If `false`, hidden from non-admin visitors |

**Per-file fields** (inside `"files"`):

| Field | Default | Meaning |
|-------|---------|---------|
| `title` | filename | Display name used in the lightbox caption bar |
| `caption` | `null` | Optional descriptive text shown under the title |
| `visible` | `true` | If `false`, hidden from non-admin visitors |
| `taken_at` | `null` | ISO 8601 timestamp from EXIF; used for chronological sorting |
| `width` / `height` | `null` | Pixel dimensions recorded by `update.rb` |
| `exif_absent` | `null` | Set to `true` by `update.rb` when exiftool found no metadata; skips re-extraction on future rescans |

When `taken_at` is present on *any* file in an album, the entire album is sorted chronologically. Albums with no `taken_at` data stay in filename order.

### Cover image selection

`album_cover(dir, data)` resolves what image to display as a sub-album's tile:

1. If `cover` is a specific filename and that file exists → use it.
2. If `cover` is `"__random__"` → pick a random file from `cover_candidates`.
3. Otherwise (no cover set, or file missing) → use the first file from `cover_candidates`.
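The three rules above could look roughly like this in Ruby. It is a sketch under stated assumptions rather than the code in `app.rb`: the `IMAGE_EXTS` list, the sorted ordering of candidates, and the choice to return paths relative to the album directory are all illustrative guesses, and `cover_candidates` is assumed to search sub-directories recursively.

```ruby
require "tmpdir"
require "fileutils"

# Assumption: which extensions count as usable cover images.
IMAGE_EXTS = %w[.jpg .jpeg .png .gif .webp].freeze

# Hypothetical: every image below dir (recursively), relative to dir, sorted.
def cover_candidates(dir)
  Dir.glob("**/*", base: dir)
     .select { |rel| File.file?(File.join(dir, rel)) }
     .select { |rel| IMAGE_EXTS.include?(File.extname(rel).downcase) }
     .sort
end

# Applies the three rules in order; returns a relative path or nil.
def album_cover(dir, data)
  cover = data["cover"]

  # Rule 1: an explicit filename that still exists wins.
  if cover && cover != "__random__" && File.file?(File.join(dir, cover))
    return cover
  end

  candidates = cover_candidates(dir)

  # Rule 2: "__random__" picks any candidate.
  return candidates.sample if cover == "__random__"

  # Rule 3: no cover set, or the named file is missing.
  candidates.first
end

# Usage: an album that only contains a sub-album still yields a cover.
Dir.mktmpdir do |dir|
  FileUtils.mkdir_p(File.join(dir, "SubAlbum"))
  FileUtils.touch(File.join(dir, "SubAlbum", "photo3.jpg"))
  p album_cover(dir, {})
  p album_cover(dir, { "cover" => "missing.jpg" })
end
```

Falling through to rule 3 when the named file is missing means a deleted cover photo degrades to an automatic cover instead of a broken tile.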
`cover_candidates(dir)` walks the entire directory tree recursively, so albums that contain only sub-albums (no top-level photos) still produce a cover.

---

## HTTP Routes

### Public

| Method | Path | Action |
|--------|------|--------|
| `GET` | `/` | Redirect → `/browse/` |
| `GET` | `/browse/` | Root album grid |
| `GET` | `/browse/*` | Sub-album grid at that path |
| `GET` | `/thumb/*` | Serve (or generate) a 300×300 JPEG thumbnail |
| `GET` | `/media/*` | Serve the original media file |
| `GET` | `/slideshow/` | Full-screen slideshow for the root |
| `GET` | `/slideshow/*` | Full-screen slideshow for a sub-album |

### Admin

| Method | Path | Action |
|--------|------|--------|
| `GET` | `/admin` | Redirect to edit or login |
| `GET` | `/admin/login` | Login form |
| `POST` | `/admin/login` | Authenticate; set session; redirect |
| `GET` | `/admin/logout` | Clear session; redirect to `/` |
| `GET` | `/admin/edit/` | Edit root album |
| `GET` | `/admin/edit/*` | Edit a specific album |
| `POST` | `/admin/edit/` | Save root album edits |
| `POST` | `/admin/edit/*` | Save album edits |

---

## Request Flows

### Browsing an album (`GET /browse/SomeAlbum`)

1. `resolve_dir("SomeAlbum")` expands the path against `MEDIA_ROOT` and verifies it doesn't escape that root (path traversal guard).
2. `load_album(dir)` reads `album.json` (or returns defaults).
3. `halt 403` if the album is marked invisible and the visitor isn't an admin.
4. `child_albums(dir, data)` scans immediate subdirectories, loads each sub-album's `album.json`, calls `album_cover` for each, and builds the grid data. Hidden sub-albums are excluded for non-admins.
5. `album_files(dir, data)` scans media files, merges per-file metadata, filters hidden items, and sorts by `taken_at` if available.
6. The `ENTRIES` array (JSON-serialized) is embedded directly in the HTML so `album.js` can operate without any further server round-trips.
7. Sinatra renders `views/album.erb` wrapped in `views/layout.erb`.
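The path traversal guard in step 1 might be implemented along these lines. This is an illustrative sketch, not the real `resolve_dir`: the two-argument signature (taking the media root explicitly) and the `nil` return for rejected paths are assumptions made so the example is self-contained.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical guard: returns the absolute album directory, or nil when the
# request escapes the media root or does not name a directory.
def resolve_dir(media_root, rel)
  root = File.realpath(media_root)

  # Join first so a leading "/" or "~" in rel is not treated specially,
  # then collapse any "." and ".." segments.
  path = File.expand_path(File.join(root, rel.to_s))

  # Comparing against root + "/" also rejects siblings such as "/var/albumen-evil".
  inside = path == root || path.start_with?(root + File::SEPARATOR)
  return nil unless inside
  return nil unless File.directory?(path)

  path
end

# Usage with a temporary media root.
Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, "SomeAlbum"))
  p resolve_dir(root, "SomeAlbum")           # absolute path inside the root
  p resolve_dir(root, "SomeAlbum/../../etc") # nil: escapes the root
end
```

This sketch checks the lexical path only; a stricter variant could also call `File.realpath` on the result to reject symlinks that point outside the root.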
### Serving a thumbnail (`GET /thumb/SomeAlbum/photo.jpg`)

1. Path traversal check via `resolve_file`.
2. Audio files return `public/img/audio.svg` immediately (no generation needed).
3. Visibility check: if `album.json` marks the file hidden and visitor isn't admin → `403`.
4. Check the thumbnail cache at `/opt/albumen/cache/thumbs/SomeAlbum/photo.jpg.th.jpg`.
5. Cache miss → `generate_thumb`:
   - **Image:** MiniMagick auto-orients (reads EXIF rotation), then thumbnail-scales to 300×300 with center-crop to guarantee a square.
   - **Video:** `ffmpeg` seeks to 2 seconds, extracts one frame, and scales/crops to 300×300.
6. Serve the cached JPEG.

Thumbnails are never regenerated once created. To force regeneration, delete the cache file (or the entire `cache/thumbs/` tree; it is safe to wipe at any time).

### Opening a photo in the lightbox (client-side)

The album page embeds the full `ENTRIES` array in a `