Separate face detection into standalone daemon

- Strip all face code from update.rb; add shared log helper writing to
  /opt/albumen/log/albumen.log with [update] prefix. update.rb now owns
  only album.json; face_daemon.rb owns faces.json.
- New scripts/face_daemon.rb: polls MEDIA_ROOT for unprocessed images,
  calls faces.py in batches, writes per-directory faces.json sidecars
  atomically. Graceful SIGTERM/SIGINT shutdown between directories.
- New config/face_daemon.service: systemd unit running as albumen user,
  Restart=on-failure, logs via SyslogIdentifier=albumen-faces.
- app.rb: add FACES_ENABLED constant; load_faces() helper reads
  faces.json; album_files() merges face data into each entry as :faces
  field.
- Update README.md and DESIGN.md to document the new daemon architecture,
  faces.json schema, and service management commands.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
author: Ken D'Ambrosio <ken@jots.org> 2026-06-08 18:36:07 +0000
committer: Ken D'Ambrosio <ken@jots.org> 2026-06-08 18:36:07 +0000
commit: 625b3d5176f2c274e91fcf28bda8e45cc0477722 (patch)
tree: 6ca16ad6f4a830b65dcddbd78ad7e7a2f1655682 /DESIGN.md
parent: ecc872a1fd43c0863e3171a1faf533adc3e3a4c5 (diff)
1 files changed, 108 insertions, 40 deletions
diff --git a/DESIGN.md b/DESIGN.md
index a7f368c..79399f6 100644
--- a/DESIGN.md
+++ b/DESIGN.md
@@ -26,18 +26,29 @@ Apache reverse proxy  (192.168.10.1 / albumen.jots.org)
 Puma application server  (127.0.0.1:4567)
       │  Rack / Sinatra
       ▼
-app.rb  ──reads──►  /var/albumen/   (media files + album.json sidecars)
-        ──reads──►  /opt/albumen/cache/thumbs/   (generated thumbnails)
-        ──reads──►  /opt/albumen/config.yml       (password hash + session secret)
+app.rb        ──reads──►  /var/albumen/   (media files + album.json + faces.json sidecars)
+              ──reads──►  /opt/albumen/cache/thumbs/   (generated thumbnails)
+              ──reads──►  /opt/albumen/config.yml       (password hash + session secret)
+
+face_daemon.rb  ──reads──►  /var/albumen/   (image files)
+                ──writes─►  /var/albumen/**/faces.json  (per-directory face sidecar)
 ```
 
 ### Process model
 
-The app runs as the `albumen` system user under systemd
+The web app runs as the `albumen` system user under systemd
 (`config/albumen.service`). Puma is configured for 1 worker with 4–8
 threads (`config/puma.rb`). Logs go to `/opt/albumen/log/`. On crash,
 systemd restarts the process after 5 seconds.
 
+The face detection daemon (`config/face_daemon.service`) runs as a
+separate systemd service under the same `albumen` user. It polls
+`MEDIA_ROOT` every `poll_interval` seconds (default 300), processes
+any images not yet in a `faces.json` sidecar, and writes results
+atomically. It never touches `album.json`, so there is no write
+contention with `update.rb`. All process output is written to the
+shared log at `/opt/albumen/log/albumen.log`.
+
 ### Reverse proxy
 
 The proxy (Apache or nginx — a sample nginx config is in
@@ -57,7 +68,8 @@ is set to unlimited because large video files may be uploaded via rsync.
   config.ru
   Gemfile / Gemfile.lock
   config/
-    albumen.service     ← systemd unit file
+    albumen.service     ← systemd unit file for the web app
+    face_daemon.service ← systemd unit file for the face detection daemon
     puma.rb             ← Puma config
     nginx-albumen.conf  ← sample reverse-proxy config
   views/
@@ -74,6 +86,8 @@ is set to unlimited because large video files may be uploaded via rsync.
     img/audio.svg       ← placeholder thumbnail for audio files
   scripts/
     update.rb           ← post-upload scan/enrich script
+    face_daemon.rb      ← face detection daemon (polls for new images, writes faces.json)
+    faces.py            ← dlib CNN face detection helper called by the daemon
     set_password.rb     ← PBKDF2-SHA256 password setter
   cache/thumbs/         ← generated thumbnail cache (mirrored path structure)
   tmp/                  ← Puma pid / state files
@@ -83,11 +97,13 @@ is set to unlimited because large video files may be uploaded via rsync.
 /var/albumen/           ← media root (owned by albumen user)
   album.json            ← root-level sidecar (optional)
   SomeAlbum/
-    album.json          ← per-album metadata sidecar
+    album.json          ← per-album metadata sidecar (owned by update.rb)
+    faces.json          ← per-album face data sidecar (owned by face_daemon.rb)
     photo1.jpg
     photo2.jpg
     SubAlbum/
       album.json
+      faces.json
       photo3.jpg
 ```
 
@@ -190,7 +206,6 @@ used. The file is written atomically (write to a `.tmp` file, then
 | `taken_at` | `null` | ISO 8601 timestamp from EXIF; used for chronological sorting |
 | `width` / `height` | `null` | Pixel dimensions recorded by `update.rb` |
 | `exif_absent` | `null` | Set to `true` by `update.rb` when exiftool found no metadata; skips re-extraction on future rescans |
-| `faces` | `null` | Set by `update.rb` when `faces.enabled`; array of `{"box": [top,right,bottom,left], "encoding": [128 floats]}` per detected face; `[]` means processed with no faces found; `null` means not yet processed |
 
 When `taken_at` is present on *any* file in an album, the entire album is
 sorted chronologically. Albums with no `taken_at` data stay in filename
@@ -417,57 +432,110 @@ to the `albumen` user so the web app can read the files.
 Enabled by setting `faces.enabled: true` in `config.yml`. When disabled,
 no Python is invoked and no face data is stored or displayed.
 
+### Architecture: separate daemon
+
+Face detection runs in a completely separate process (`scripts/face_daemon.rb`,
+managed by `config/face_daemon.service`) and is entirely decoupled from
+`update.rb`. This design keeps the two operations from conflicting:
+
+- **`update.rb`** owns `album.json` in each directory. It indexes media,
+  extracts EXIF data, and generates thumbnails as fast as possible so
+  newly uploaded photos are browseable immediately.
+- **`face_daemon.rb`** owns `faces.json` in each directory. It runs
+  continuously in the background, processing images that have not yet
+  been scanned for faces. Because the two processes never write the same
+  file, no file locking is needed and there is no write contention.
+
+### Data model — `faces.json`
+
+Each directory gets a `faces.json` sidecar written by the daemon:
+
+```json
+{
+  "photo1.jpg": [
+    {"box": [top, right, bottom, left], "encoding": [128 floats]},
+    ...
+  ],
+  "photo2.jpg": [],
+  "photo3.jpg": null
+}
+```
+
+| Value | Meaning |
+|-------|---------|
+| key absent | not yet processed |
+| `null` | error during last detection attempt; will retry |
+| `[]` | processed successfully, no faces found |
+| `[{box, encoding}, ...]` | one entry per detected face |
+
+`app.rb` reads `faces.json` via `load_faces(dir)` and merges face data
+into each entry's `:faces` field. The field is `nil` (absent key in
+`faces.json`) when the daemon hasn't processed the image yet.
+
 ### Detection pipeline
 
-`update.rb` calls `enrich_faces` for each image file where `meta['faces']`
-is `nil` (not yet processed). It shells out to `scripts/faces.py`, which:
+The daemon shells out to `scripts/faces.py`, which uses the **CNN model**
+(`model="cnn"`) for higher accuracy than HOG. The CNN model also detects:
+- Faces at angles up to ~45° profile
+- Small faces in group photos
+- Faces in non-ideal lighting
 
-1. Loads the image with `face_recognition.load_image_file` (handles JPEG,
-   PNG, HEIC, etc. via Pillow).
-2. Runs `face_locations(model="hog")` — the HOG model is fast on CPU and
-   accurate for frontal/near-frontal faces. (The CNN model is more accurate
-   but requires a GPU to be practical.)
-3. For each detected location, calls `face_encodings` to produce a
-   128-dimensional L2-normalised embedding vector.
-4. Prints a JSON array to stdout; on any error prints `[]` so `update.rb`
-   always gets valid JSON.
+Trade-off: CNN is ~10–30× slower than HOG on CPU. The daemon compensates
+with `--workers N` (default 20) — dlib releases the Python GIL during
+C++ inference, so threads achieve genuine CPU parallelism.
 
-The result is stored as `meta['faces']` in `album.json`. An empty array
-`[]` means "processed, no faces found" and prevents re-processing. A `null`
-value means "not yet processed."
+`faces.py` accepts a batch of image paths and returns a JSON dict mapping
+each path to its result list. A `null` value for a path means detection
+failed (file unreadable or corrupt); the daemon records that entry as
+`null` in `faces.json` so it is retried on the next pass.
 
 Encodings are stored in full (128 floats each) to allow re-clustering
 without reprocessing all images.
 
-### Clustering and people management (planned)
+### Daemon operation
 
-A second pass (`scripts/cluster_faces.rb`) will:
+```
+loop:
+  for each directory in MEDIA_ROOT (recursive):
+    pending = images whose name is absent from faces.json (or null)
+    if pending not empty:
+      call faces.py --workers 20 <pending paths>
+      merge results into faces.json (atomic write)
+  sleep POLL_INTERVAL seconds (default 300, in 1-second increments for prompt shutdown)
+```
 
-1. Walk all `album.json` files and collect every `{encoding, source_file,
-   box}` tuple.
-2. Cluster them with a threshold distance (~0.6 in L2 space, empirically
-   good for dlib encodings).
-3. Write `/var/albumen/people.json` — a map of stable UUIDs → cluster
-   metadata (name, representative encoding, member list).
+SIGTERM / SIGINT trigger graceful shutdown between directories.
 
-The admin `/admin/people` UI will let you:
-- Name unidentified clusters ("Who is this?").
-- Merge two clusters that are the same person.
-- Remove a photo from a cluster (false positive).
+### Configuration
 
-Public `/people` and `/people/:id` routes will let any visitor browse by
-person.
+```yaml
+faces:
+  enabled: true
+  workers: 20         # threads passed to faces.py
+  poll_interval: 300  # seconds between full-tree sweeps
+```
 
 ### Performance notes
 
-- HOG face detection: ~0.5–2 s per image on a single CPU core.
-- A library of 10,000 images takes ~3–6 hours to index fully, but the
-  sentinel-based skip means subsequent `update.rb` runs only process new
-  photos.
-- Encodings stored in `album.json` are ~3.5 KB per face. A library with
+- CNN face detection with 20 workers: ~4–6 images/minute on a 64-core CPU.
+- A library of ~20,000 photos takes roughly 2.5–3.5 days on initial pass.
+- Subsequent daemon passes only process new photos.
+- Encodings stored in `faces.json` are ~3.5 KB per face. A library with
   an average of 2 faces per photo adds ~70 MB of JSON across 10,000 photos
   — negligible.
 
+### Clustering and people management (planned)
+
+A second pass (`scripts/cluster_faces.rb`) will:
+
+1. Walk all `faces.json` files and collect every `{encoding, source_file, box}` tuple.
+2. Cluster them with a threshold distance (~0.6 in L2 space, empirically good for dlib encodings).
+3. Write `/var/albumen/people.json` — a map of stable UUIDs → cluster metadata.
+
+The admin `/admin/people` UI will let you name clusters, merge duplicates,
+and remove false positives. Public `/people` routes will allow browsing by
+person.
+
 ---
 
 ## Security
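
The `faces.json` states and the daemon loop described in the patch can be illustrated with a minimal Ruby sketch. It is not the code from `scripts/face_daemon.rb`: the helper names (`read_faces`, `pending_images`, `write_faces_atomically`, `interruptible_sleep`) and the `IMAGE_EXTS` list are hypothetical, and the write-to-`.tmp`-then-rename step is an assumption borrowed from how DESIGN.md describes `album.json` writes.

```ruby
require 'json'
require 'tmpdir'

# Assumption: the real daemon may recognise a different set of extensions.
IMAGE_EXTS = %w[.jpg .jpeg .png].freeze

# Read a directory's faces.json; a missing or unparsable file is treated as empty.
def read_faces(dir)
  path = File.join(dir, 'faces.json')
  return {} unless File.exist?(path)

  JSON.parse(File.read(path))
rescue JSON::ParserError
  {}
end

# An image is pending when its key is absent (never processed) or maps to
# null (last attempt failed). An empty array means "processed, no faces".
def pending_images(dir)
  faces = read_faces(dir)
  Dir.children(dir)
     .select { |name| IMAGE_EXTS.include?(File.extname(name).downcase) }
     .select { |name| faces[name].nil? }
     .sort
end

# Merge new results and replace faces.json in one step: write a sibling
# temp file, then rename it over the target (atomic on the same filesystem).
def write_faces_atomically(dir, results)
  merged = read_faces(dir).merge(results)
  final  = File.join(dir, 'faces.json')
  tmp    = "#{final}.tmp"
  File.write(tmp, JSON.generate(merged))
  File.rename(tmp, final)
  merged
end

# Sleep in 1-second steps so a shutdown request is noticed promptly.
# stop_requested is any callable returning true once shutdown was asked for.
def interruptible_sleep(seconds, stop_requested)
  seconds.times do
    break if stop_requested.call

    sleep 1
  end
end
```

In a daemon built along these lines, a `Signal.trap('TERM')` handler would typically only set a flag that `stop_requested` reads, so the process finishes the current directory before exiting.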