| -rw-r--r-- | DESIGN.md | 398 |
1 files changed, 398 insertions, 0 deletions
diff --git a/DESIGN.md b/DESIGN.md
new file mode 100644
index 0000000..a4d9108
--- /dev/null
+++ b/DESIGN.md
@@ -0,0 +1,398 @@
+# Albumen — Design Document
+
+## Overview
+
+Albumen is a self-hosted photo and media album. The guiding principle is
+simplicity: the filesystem *is* the database. Every directory under
+`MEDIA_ROOT` is an album; every image, video, or audio file inside it is
+a media item. The only metadata that can't be derived from the filesystem
+is stored in a small `album.json` sidecar file that lives alongside the
+media in each directory.
+
+There is no import step, no database to migrate, no ORM. Drop files into
+a directory, run `update.rb`, and they are immediately browseable.
+
+---
+
+## System Architecture
+
+```
+Browser / mobile
+        │  HTTPS
+        ▼
+Apache reverse proxy  (192.168.10.1 / albumen.jots.org)
+        │  HTTP, forwards X-Forwarded-Proto
+        ▼
+Puma application server  (127.0.0.1:4567)
+        │  Rack / Sinatra
+        ▼
+app.rb  ──reads──►  /var/albumen/               (media files + album.json sidecars)
+        ──reads──►  /opt/albumen/cache/thumbs/  (generated thumbnails)
+        ──reads──►  /opt/albumen/config.yml     (password hash + session secret)
+```
+
+### Process model
+
+The app runs as the `albumen` system user under systemd
+(`config/albumen.service`). Puma is configured for 1 worker with 4–8
+threads (`config/puma.rb`). Logs go to `/opt/albumen/log/`. On crash,
+systemd restarts the process after 5 seconds.
+
+### Reverse proxy
+
+The proxy (Apache or nginx — a sample nginx config is in
+`config/nginx-albumen.conf`) terminates HTTPS and forwards plain HTTP to
+Puma on port 4567. It sets `X-Forwarded-Proto: https` so Sinatra's
+redirect helpers produce correct `https://` URLs and
+`Rack::Protection::HttpOrigin` sees the right scheme. `client_max_body_size`
+is set to unlimited because large video files may be uploaded via rsync.
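To illustrate why the proxy has to forward `X-Forwarded-Proto`, here is a minimal sketch in plain Ruby. It is not the actual Rack or Sinatra code; `external_scheme` is a hypothetical helper that only mimics the general idea of preferring the forwarded scheme over the scheme of the local Puma connection.

```ruby
# Hypothetical helper, for illustration only: decide which scheme the
# visitor used, given a Rack-style env hash.
def external_scheme(env)
  # A proxy chain may send a comma-separated list; take the first entry.
  forwarded = env["HTTP_X_FORWARDED_PROTO"].to_s.split(",").first.to_s.strip.downcase
  return forwarded unless forwarded.empty?

  # No proxy header: fall back to the scheme of the direct connection.
  env.fetch("rack.url_scheme", "http")
end

behind_proxy = { "HTTP_X_FORWARDED_PROTO" => "https", "rack.url_scheme" => "http" }
direct       = { "rack.url_scheme" => "http" }

puts external_scheme(behind_proxy)
puts external_scheme(direct)
```

Without the header, the app would only see the plain HTTP hop from the proxy and could build `http://` redirect URLs for visitors who arrived over HTTPS.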
+
+---
+
+## Directory Layout
+
+```
+/opt/albumen/                  ← application root (code + config)
+  app.rb
+  config.ru
+  Gemfile / Gemfile.lock
+  config/
+    albumen.service            ← systemd unit file
+    puma.rb                    ← Puma config
+    nginx-albumen.conf         ← sample reverse-proxy config
+  views/
+    layout.erb                 ← site chrome (header, nav)
+    album.erb                  ← browse page (grid + lightbox)
+    slideshow.erb              ← full-screen slideshow page
+    admin/
+      login.erb
+      album.erb                ← per-album edit form
+  public/
+    css/style.css
+    js/album.js                ← lightbox logic
+    js/slideshow.js            ← slideshow logic
+    img/audio.svg              ← placeholder thumbnail for audio files
+  scripts/
+    update.rb                  ← post-upload scan/enrich script
+    set_password.rb            ← bcrypt password setter
+  cache/thumbs/                ← generated thumbnail cache (mirrored path structure)
+  tmp/                         ← Puma pid / state files
+  log/                         ← Puma stdout / stderr logs
+  config.yml                   ← runtime secrets (not in git)
+
+/var/albumen/                  ← media root (owned by albumen user)
+  album.json                   ← root-level sidecar (optional)
+  SomeAlbum/
+    album.json                 ← per-album metadata sidecar
+    photo1.jpg
+    photo2.jpg
+    SubAlbum/
+      album.json
+      photo3.jpg
+```
+
+---
+
+## Dependencies
+
+### Ruby gems
+
+| Gem | Purpose |
+|-----|---------|
+| `sinatra ~> 4.0` | HTTP routing and ERB rendering |
+| `puma ~> 6.4` | Multi-threaded Rack application server |
+| `mini_magick ~> 4.12` | ImageMagick wrapper — thumbnail generation, EXIF-aware auto-orient |
+| `mini_exiftool ~> 2.10` | Reads EXIF `DateTimeOriginal` / `CreateDate` for chronological sorting |
+| `bcrypt ~> 3.1` | Password hashing for the single admin account |
+| `rack-session ~> 2.0` | Cookie-based session support (required by Sinatra 4 separately) |
+
+### System tools
+
+| Tool | Purpose |
+|------|---------|
+| **ImageMagick** | Backing tool for MiniMagick — must be installed on the server |
+| **ExifTool** | Backing tool for MiniExiftool — must be installed on the server |
+| **ffmpeg** | Video thumbnail extraction (frame at 2 s) and duration probing via `ffprobe` |
+
+---
+
+## Data Model — `album.json`
+
+Each directory may contain an `album.json`. If none exists, defaults are
+used. The file is written atomically (write to a `.tmp` file, then
+`File.rename`) so a crash mid-write never corrupts existing data.
+
+```json
+{
+  "title": "Hawaii 2004",
+  "description": "Optional free-text shown below the album title.",
+  "cover": "dscn0929.jpg",
+  "sort_reverse": false,
+  "visible": true,
+  "files": {
+    "dscn0929.jpg": {
+      "title": "Arrival at Kona",
+      "caption": "Ken at the airport, exhausted.",
+      "visible": true,
+      "taken_at": "2004-06-15T14:23:00",
+      "width": 2048,
+      "height": 1536
+    }
+  }
+}
+```
+
+**Top-level fields:**
+
+| Field | Default | Meaning |
+|-------|---------|---------|
+| `title` | directory name | Display name for the album |
+| `description` | `null` | Optional paragraph shown under the title |
+| `cover` | `null` (auto) | Filename of the cover image, or `"__random__"` |
+| `sort_reverse` | `false` | Reverses the order of sub-album cards |
+| `visible` | `true` | If `false`, hidden from non-admin visitors |
+
+**Per-file fields** (inside `"files"`):
+
+| Field | Default | Meaning |
+|-------|---------|---------|
+| `title` | filename | Display name used in the lightbox caption bar |
+| `caption` | `null` | Optional descriptive text shown under the title |
+| `visible` | `true` | If `false`, hidden from non-admin visitors |
+| `taken_at` | `null` | ISO 8601 timestamp from EXIF; used for chronological sorting |
+| `width` / `height` | `null` | Pixel dimensions recorded by `update.rb` |
+
+When `taken_at` is present on *any* file in an album, the entire album is
+sorted chronologically. Albums with no `taken_at` data stay in filename
+order.
+
+### Cover image selection
+
+`album_cover(dir, data)` resolves what image to display as a sub-album's
+tile:
+
+1. If `cover` is a specific filename and that file exists → use it.
+2. If `cover` is `"__random__"` → pick a random file from `cover_candidates`.
+3. Otherwise (no cover set, or file missing) → use the first file from
+   `cover_candidates`.
+
+`cover_candidates(dir)` walks the entire directory tree recursively, so
+albums that contain only sub-albums (no top-level photos) still produce a
+cover.
+
+---
+
+## HTTP Routes
+
+### Public
+
+| Method | Path | Action |
+|--------|------|--------|
+| `GET` | `/` | Redirect → `/browse/` |
+| `GET` | `/browse/` | Root album grid |
+| `GET` | `/browse/*` | Sub-album grid at that path |
+| `GET` | `/thumb/*` | Serve (or generate) a 300×300 JPEG thumbnail |
+| `GET` | `/media/*` | Serve the original media file |
+| `GET` | `/slideshow/` | Full-screen slideshow for the root |
+| `GET` | `/slideshow/*` | Full-screen slideshow for a sub-album |
+
+### Admin
+
+| Method | Path | Action |
+|--------|------|--------|
+| `GET` | `/admin` | Redirect to edit or login |
+| `GET` | `/admin/login` | Login form |
+| `POST` | `/admin/login` | Authenticate; set session; redirect |
+| `GET` | `/admin/logout` | Clear session; redirect to `/` |
+| `GET` | `/admin/edit/` | Edit root album |
+| `GET` | `/admin/edit/*` | Edit a specific album |
+| `POST` | `/admin/edit/` | Save root album edits |
+| `POST` | `/admin/edit/*` | Save album edits |
+
+---
+
+## Request Flows
+
+### Browsing an album (`GET /browse/SomeAlbum`)
+
+1. `resolve_dir("SomeAlbum")` expands the path against `MEDIA_ROOT` and
+   verifies it doesn't escape that root (path traversal guard).
+2. `load_album(dir)` reads `album.json` (or returns defaults).
+3. `halt 403` if the album is marked invisible and the visitor isn't an admin.
+4. `child_albums(dir, data)` scans immediate subdirectories, loads each
+   sub-album's `album.json`, calls `album_cover` for each, and builds the
+   grid data. Hidden sub-albums are excluded for non-admins.
+5. `album_files(dir, data)` scans media files, merges per-file metadata,
+   filters hidden items, and sorts by `taken_at` if available.
+6. The `ENTRIES` array (JSON-serialized) is embedded directly in the HTML
+   so `album.js` can operate without any further server round-trips.
+7. Sinatra renders `views/album.erb` wrapped in `views/layout.erb`.
+
+### Serving a thumbnail (`GET /thumb/SomeAlbum/photo.jpg`)
+
+1. Path traversal check via `resolve_file`.
+2. Audio files return `public/img/audio.svg` immediately (no generation needed).
+3. Visibility check: if `album.json` marks the file hidden and visitor isn't
+   admin → `403`.
+4. Check the thumbnail cache at
+   `/opt/albumen/cache/thumbs/SomeAlbum/photo.jpg.th.jpg`.
+5. Cache miss → `generate_thumb`:
+   - **Image:** MiniMagick auto-orients (reads EXIF rotation), then
+     thumbnail-scales to 300×300 with center-crop to guarantee a square.
+   - **Video:** `ffmpeg` seeks to 2 seconds, extracts one frame, and
+     scales/crops to 300×300.
+6. Serve the cached JPEG.
+
+Thumbnails are never regenerated once created. To force regeneration,
+delete the cache file (or the entire `cache/thumbs/` tree — it is safe to
+wipe at any time).
+
+### Opening a photo in the lightbox (client-side)
+
+The album page embeds the full `ENTRIES` array in a `<script>` block, so
+no server call is needed to open or navigate the lightbox.
+
+1. Click on a card → `openLightbox(i)` → `renderLightbox()`.
+2. All three media elements (`#lb-img`, `#lb-video`, `#lb-audio`) get
+   `.hidden` (i.e. `display:none !important`).
+3. The appropriate element for the media type has `.hidden` removed and its
+   `src` set. For video and audio, playback starts immediately.
+4. The URL hash is updated to `#photo=filename` so the link is shareable.
+   On page load, if that hash is present, the matching photo is opened
+   automatically.
+5. Navigation: ← → arrow keys, on-screen buttons, or touch swipe (>50 px).
+6. Closing: `Escape`, the ✕ button, or clicking outside the media.
+   Video/audio is paused; the hash is stripped from the URL.
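The lightbox depends on the server having embedded `ENTRIES` in the page. A minimal Ruby sketch of that server-side step follows; the helper name `entries_script_tag` and the entry keys (`name`, `type`, `title`) are assumptions for illustration, not taken from `app.rb`.

```ruby
require "json"

# Hypothetical helper: serialize album entries for inline use by album.js.
def entries_script_tag(entries)
  json = JSON.generate(entries)
  # Escape "</" so a title containing "</script>" cannot close the block
  # early. "\/" is a valid JSON escape for "/", so the data is unchanged.
  safe = json.gsub("</") { "<\\/" }
  "<script>const ENTRIES = #{safe};</script>"
end

# Example entries; the keys are illustrative, not the app's real schema.
entries = [
  { name: "photo1.jpg", type: "image", title: "Arrival at Kona" },
  { name: "clip.mp4",   type: "video", title: "</script> in a title" }
]

puts entries_script_tag(entries)
```

Escaping matters here because titles and captions are admin-entered free text that ends up inside an inline script.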
+
+### Slideshow (`GET /slideshow/SomeAlbum`)
+
+The slideshow page is a separate full-screen layout. `SS_ENTRIES` (images
+and videos only — audio is excluded in `slideshow_view`) is embedded in the
+page. `slideshow.js`:
+
+- Preloads the next image before cross-fading so transitions are smooth.
+  Videos can't be preloaded; they fade out first, then swap.
+- Cross-fade is a CSS opacity transition (500 ms); JS adds/removes
+  `.fading` and uses a matching `setTimeout`.
+- An `<input type="number">` lets the visitor set the interval (default 4 s,
+  range 1–60 s); it is read live on each advance so changes take effect
+  without restarting.
+- Controls: Prev / Pause-Play / Next buttons, ← → arrow keys, spacebar,
+  touch swipe.
+
+### Admin authentication
+
+A single bcrypt-hashed password is stored in `config.yml`
+(`admin_password_hash`). `POST /admin/login` compares the submitted
+password against the hash using `BCrypt::Password#==`. On success,
+`session[:admin] = true` is set in an encrypted cookie. All admin routes
+call `require_admin!` which halts with the login form on failure.
+
+The `return_to` parameter on the login form lets the visitor be redirected
+back to the page they were trying to reach.
+
+### Saving album edits (`POST /admin/edit/SomeAlbum`)
+
+1. `load_album(dir)` reads current state.
+2. Top-level fields (title, description, cover, sort_reverse, visible) are
+   updated from form params. `blank_to_nil` converts empty strings to `nil`
+   so omitted optional fields don't get stored as `""`.
+3. Per-file fields (title, caption, visible) are updated by iterating the
+   `file_title[name]` / `file_caption[name]` / `file_visible[name]` param
+   hashes.
+4. `atomic_write` writes the updated JSON to a `.tmp` file and renames it
+   into place.
+5. Redirect back to the same edit page (PRG pattern — prevents double-submit
+   on reload).
+
+---
+
+## The `update.rb` Script
+
+Run this after copying new media files onto the server. It is safe to
+re-run at any time — all operations are idempotent.
+
+```bash
+ruby /opt/albumen/scripts/update.rb [optional/subdir]
+```
+
+**What it does, per directory:**
+
+1. Reads the existing `album.json` (or starts from defaults).
+2. Removes stale `files` entries for deleted files.
+3. For each media file:
+   - **Images:** reads EXIF `DateTimeOriginal` (or `CreateDate`) and stores
+     it as `taken_at`; reads pixel dimensions. Both are skipped if already
+     recorded.
+   - **Videos:** runs `ffprobe` to record duration. Skipped if already
+     recorded.
+   - **All non-audio:** generates a thumbnail if one doesn't already exist.
+4. Re-reads `album.json` immediately before writing (to catch any admin
+   saves that happened concurrently) and preserves all admin-controlled
+   fields (`title`, `description`, `cover`, `sort_reverse`, `visible`,
+   per-file `title`/`caption`/`visible`).
+5. Writes the updated JSON atomically.
+
+**Ownership:** When run as root (the typical case after an rsync), the
+script calls `FileUtils.chown_R` to transfer ownership of the media tree
+to the `albumen` user so the web app can read the files.
+
+---
+
+## Security
+
+**Path traversal:** `resolve_dir` and `resolve_file` call `File.expand_path`
+to canonicalise the path (resolving any `..` components) and then assert
+that the result starts with `MEDIA_ROOT + "/"`. Any request that would
+escape the media root gets a `404`.
+
+**Visibility enforcement:** The `/thumb/*` and `/media/*` routes check
+`album.json` visibility for every request. A file marked `visible: false`
+returns `403` unless the session has admin privileges. The browse routes
+filter hidden albums and files from the data sent to the browser.
+
+**Admin password:** Stored as a bcrypt hash in `config.yml` (not in git).
+The plaintext password is never persisted. Sessions use an encrypted cookie
+with a random secret also stored in `config.yml`.
+
+**HTTPS:** Terminated at the reverse proxy. The app itself only listens on
+`127.0.0.1:4567` and is not reachable directly from the network.
+
+---
+
+## Deployment
+
+### After code changes
+
+```bash
+scp -r /home/ken/albumen/. root@192.168.10.245:/opt/albumen/
+ssh root@192.168.10.245 'chown -R albumen:albumen /opt/albumen && systemctl restart albumen'
+```
+
+### After uploading new photos
+
+```bash
+# On the server (run as root):
+ruby /opt/albumen/scripts/update.rb [optional-subdir]
+```
+
+The script fixes ownership automatically when run as root, so it can be
+called immediately after an rsync without a separate chown step.
+
+### Thumbnail cache
+
+`/opt/albumen/cache/thumbs/` mirrors the `MEDIA_ROOT` path structure.
+The cache is entirely derived data — it can be deleted at any time and will
+be rebuilt on demand as visitors browse. To pre-warm the cache after a bulk
+upload, run `update.rb` (it generates thumbnails as part of its scan).
+
+### Configuration (`config.yml`)
+
+```yaml
+admin_password_hash: "$2a$12$..." # set with scripts/set_password.rb
+session_secret: "random-hex-string" # set once at install time
+```
+
+This file is not tracked in git. All three paths (`MEDIA_ROOT`,
+`CACHE_ROOT`, `CONFIG_PATH`) can be overridden with environment variables
+for local development.
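The configuration notes above can be sketched in Ruby as follows. This is a hedged illustration, not the code in `app.rb`: it assumes the environment variables carry the same names as the constants, takes the defaults from the paths listed in this document, and uses a hypothetical `load_config` helper.

```ruby
require "yaml"
require "tempfile"

# Defaults are the paths from this document. The environment variable
# names are an assumption (same as the constant names).
MEDIA_ROOT  = ENV.fetch("MEDIA_ROOT",  "/var/albumen")
CACHE_ROOT  = ENV.fetch("CACHE_ROOT",  "/opt/albumen/cache/thumbs")
CONFIG_PATH = ENV.fetch("CONFIG_PATH", "/opt/albumen/config.yml")

# Hypothetical loader: read config.yml and fail early if a key is missing.
def load_config(path)
  config = YAML.safe_load(File.read(path)) || {}
  %w[admin_password_hash session_secret].each do |key|
    raise KeyError, "#{key} missing from #{path}" if config[key].to_s.empty?
  end
  config
end

# Demo against a throwaway file so the sketch runs on any machine.
Tempfile.create(["config", ".yml"]) do |f|
  f.write(%(admin_password_hash: "$2a$12$example"\nsession_secret: "random-hex-string"\n))
  f.flush
  puts load_config(f.path).keys.sort.inspect
end
```

For local development, something like `MEDIA_ROOT=./media CONFIG_PATH=./config.yml` in the environment would then point the app at a working copy, assuming those variable names.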
