Separate face detection into standalone daemon

- Strip all face code from update.rb; add shared log helper writing to /opt/albumen/log/albumen.log with [update] prefix. update.rb now owns only album.json; face_daemon.rb owns faces.json.
- New scripts/face_daemon.rb: polls MEDIA_ROOT for unprocessed images, calls faces.py in batches, writes per-directory faces.json sidecars atomically. Graceful SIGTERM/SIGINT shutdown between directories.
- New config/face_daemon.service: systemd unit running as albumen user, Restart=on-failure, logs via SyslogIdentifier=albumen-faces.
- app.rb: add FACES_ENABLED constant; load_faces() helper reads faces.json; album_files() merges face data into each entry as :faces field.
- Update README.md and DESIGN.md to document the new daemon architecture, faces.json schema, and service management commands.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
author: Ken D'Ambrosio <ken@jots.org> 2026-06-08 18:36:07 +0000
committer: Ken D'Ambrosio <ken@jots.org> 2026-06-08 18:36:07 +0000
commit: 625b3d5176f2c274e91fcf28bda8e45cc0477722 (patch)
tree: 6ca16ad6f4a830b65dcddbd78ad7e7a2f1655682
parent: ecc872a1fd43c0863e3171a1faf533adc3e3a4c5 (diff)
6 files changed, 355 insertions, 126 deletions
diff --git a/DESIGN.md b/DESIGN.md
index a7f368c..79399f6 100644
--- a/DESIGN.md
+++ b/DESIGN.md
@@ -26,18 +26,29 @@ Apache reverse proxy  (192.168.10.1 / albumen.jots.org)
 Puma application server  (127.0.0.1:4567)
       │  Rack / Sinatra
       ▼
-app.rb  ──reads──►  /var/albumen/   (media files + album.json sidecars)
-        ──reads──►  /opt/albumen/cache/thumbs/   (generated thumbnails)
-        ──reads──►  /opt/albumen/config.yml       (password hash + session secret)
+app.rb        ──reads──►  /var/albumen/   (media files + album.json + faces.json sidecars)
+              ──reads──►  /opt/albumen/cache/thumbs/   (generated thumbnails)
+              ──reads──►  /opt/albumen/config.yml       (password hash + session secret)
+
+face_daemon.rb  ──reads──►  /var/albumen/   (image files)
+                ──writes─►  /var/albumen/**/faces.json  (per-directory face sidecar)
 ```
 
 ### Process model
 
-The app runs as the `albumen` system user under systemd
+The web app runs as the `albumen` system user under systemd
 (`config/albumen.service`). Puma is configured for 1 worker with 4–8
 threads (`config/puma.rb`). Logs go to `/opt/albumen/log/`. On crash,
 systemd restarts the process after 5 seconds.
 
+The face detection daemon (`config/face_daemon.service`) runs as a
+separate systemd service under the same `albumen` user. It polls
+`MEDIA_ROOT` every `poll_interval` seconds (default 300), processes
+any images not yet in a `faces.json` sidecar, and writes results
+atomically. It never touches `album.json`, so there is no write
+contention with `update.rb`. Both the daemon and `update.rb` append to the
+shared log at `/opt/albumen/log/albumen.log`.
+
 ### Reverse proxy
 
 The proxy (Apache or nginx — a sample nginx config is in
@@ -57,7 +68,8 @@ is set to unlimited because large video files may be uploaded via rsync.
   config.ru
   Gemfile / Gemfile.lock
   config/
-    albumen.service     ← systemd unit file
+    albumen.service     ← systemd unit file for the web app
+    face_daemon.service ← systemd unit file for the face detection daemon
     puma.rb             ← Puma config
     nginx-albumen.conf  ← sample reverse-proxy config
   views/
@@ -74,6 +86,8 @@ is set to unlimited because large video files may be uploaded via rsync.
     img/audio.svg       ← placeholder thumbnail for audio files
   scripts/
     update.rb           ← post-upload scan/enrich script
+    face_daemon.rb      ← face detection daemon (polls for new images, writes faces.json)
+    faces.py            ← dlib CNN face detection helper called by the daemon
     set_password.rb     ← PBKDF2-SHA256 password setter
   cache/thumbs/         ← generated thumbnail cache (mirrored path structure)
   tmp/                  ← Puma pid / state files
@@ -83,11 +97,13 @@ is set to unlimited because large video files may be uploaded via rsync.
 /var/albumen/           ← media root (owned by albumen user)
   album.json            ← root-level sidecar (optional)
   SomeAlbum/
-    album.json          ← per-album metadata sidecar
+    album.json          ← per-album metadata sidecar (owned by update.rb)
+    faces.json          ← per-album face data sidecar (owned by face_daemon.rb)
     photo1.jpg
     photo2.jpg
     SubAlbum/
       album.json
+      faces.json
       photo3.jpg
 ```
 
@@ -190,7 +206,6 @@ used. The file is written atomically (write to a `.tmp` file, then
 | `taken_at` | `null` | ISO 8601 timestamp from EXIF; used for chronological sorting |
 | `width` / `height` | `null` | Pixel dimensions recorded by `update.rb` |
 | `exif_absent` | `null` | Set to `true` by `update.rb` when exiftool found no metadata; skips re-extraction on future rescans |
-| `faces` | `null` | Set by `update.rb` when `faces.enabled`; array of `{"box": [top,right,bottom,left], "encoding": [128 floats]}` per detected face; `[]` means processed with no faces found; `null` means not yet processed |
 
 When `taken_at` is present on *any* file in an album, the entire album is
 sorted chronologically. Albums with no `taken_at` data stay in filename
@@ -417,57 +432,110 @@ to the `albumen` user so the web app can read the files.
 Enabled by setting `faces.enabled: true` in `config.yml`. When disabled,
 no Python is invoked and no face data is stored or displayed.
 
+### Architecture: separate daemon
+
+Face detection runs in a completely separate process (`scripts/face_daemon.rb`,
+managed by `config/face_daemon.service`) and is entirely decoupled from
+`update.rb`. This design keeps the two operations from conflicting:
+
+- **`update.rb`** owns `album.json` in each directory. It indexes media,
+  extracts EXIF data, and generates thumbnails as fast as possible so
+  newly uploaded photos are browseable immediately.
+- **`face_daemon.rb`** owns `faces.json` in each directory. It runs
+  continuously in the background, processing images it has not yet
+  scanned for faces. There is no file locking or write contention between
+  the two processes.
+
+### Data model — `faces.json`
+
+Each directory gets a `faces.json` sidecar written by the daemon:
+
+```json
+{
+  "photo1.jpg": [
+    {"box": [top, right, bottom, left], "encoding": [128 floats]},
+    ...
+  ],
+  "photo2.jpg": [],
+  "photo3.jpg": null
+}
+```
+
+| Value | Meaning |
+|-------|---------|
+| key absent | not yet processed |
+| `null` | error during last detection attempt; will retry |
+| `[]` | processed successfully, no faces found |
+| `[{box, encoding}, ...]` | one entry per detected face |
+
+`app.rb` reads `faces.json` via `load_faces(dir)` and merges face data
+into each entry's `:faces` field. The field is `nil` when the key is absent
+(not yet processed) or the stored value is `null` (last attempt failed).
+
 ### Detection pipeline
 
-`update.rb` calls `enrich_faces` for each image file where `meta['faces']`
-is `nil` (not yet processed). It shells out to `scripts/faces.py`, which:
+The daemon shells out to `scripts/faces.py`, which uses the **CNN model**
+(`model="cnn"`) for higher accuracy. The CNN model detects:
+- Faces at angles up to ~45° profile
+- Small faces in group photos
+- Faces in non-ideal lighting
 
-1. Loads the image with `face_recognition.load_image_file` (handles JPEG,
-   PNG, HEIC, etc. via Pillow).
-2. Runs `face_locations(model="hog")` — the HOG model is fast on CPU and
-   accurate for frontal/near-frontal faces. (The CNN model is more accurate
-   but requires a GPU to be practical.)
-3. For each detected location, calls `face_encodings` to produce a
-   128-dimensional L2-normalised embedding vector.
-4. Prints a JSON array to stdout; on any error prints `[]` so `update.rb`
-   always gets valid JSON.
+Trade-off: CNN is ~10–30× slower than HOG on CPU. The daemon compensates
+with `--workers N` (default 20) — dlib releases the Python GIL during
+C++ inference, so threads achieve genuine CPU parallelism.
 
-The result is stored as `meta['faces']` in `album.json`. An empty array
-`[]` means "processed, no faces found" and prevents re-processing. A `null`
-value means "not yet processed."
+`faces.py` accepts a batch of image paths and returns a JSON dict mapping
+each path to its result list. Null for a path means detection failed
+(file unreadable or corrupt); the daemon marks that entry `null` in
+`faces.json` so it is retried on the next pass.
 
 Encodings are stored in full (128 floats each) to allow re-clustering
 without reprocessing all images.
 
-### Clustering and people management (planned)
+### Daemon operation
 
-A second pass (`scripts/cluster_faces.rb`) will:
+```
+loop:
+  for each directory in MEDIA_ROOT (recursive):
+    pending = images whose name is absent from faces.json (or null)
+    if pending not empty:
+      call faces.py --workers 20 <pending paths>
+      merge results into faces.json (atomic write)
+  sleep POLL_INTERVAL seconds (default 300, in 1-second increments for prompt shutdown)
+```
 
-1. Walk all `album.json` files and collect every `{encoding, source_file,
-   box}` tuple.
-2. Cluster them with a threshold distance (~0.6 in L2 space, empirically
-   good for dlib encodings).
-3. Write `/var/albumen/people.json` — a map of stable UUIDs → cluster
-   metadata (name, representative encoding, member list).
+SIGTERM / SIGINT trigger graceful shutdown between directories.
 
-The admin `/admin/people` UI will let you:
-- Name unidentified clusters ("Who is this?").
-- Merge two clusters that are the same person.
-- Remove a photo from a cluster (false positive).
+### Configuration
 
-Public `/people` and `/people/:id` routes will let any visitor browse by
-person.
+```yaml
+faces:
+  enabled: true
+  workers: 20         # threads passed to faces.py
+  poll_interval: 300  # seconds between full-tree sweeps
+```
 
 ### Performance notes
 
-- HOG face detection: ~0.5–2 s per image on a single CPU core.
-- A library of 10,000 images takes ~3–6 hours to index fully, but the
-  sentinel-based skip means subsequent `update.rb` runs only process new
-  photos.
-- Encodings stored in `album.json` are ~3.5 KB per face. A library with
+- CNN face detection with 20 workers: ~4–6 images/minute on a 64-core CPU.
+- A library of ~20,000 photos takes roughly 2.5–3.5 days on initial pass.
+- Subsequent daemon passes only process new photos.
+- Encodings stored in `faces.json` are ~3.5 KB per face. A library with
   an average of 2 faces per photo adds ~70 MB of JSON across 10,000 photos
   — negligible.
 
+### Clustering and people management (planned)
+
+A second pass (`scripts/cluster_faces.rb`) will:
+
+1. Walk all `faces.json` files and collect every `{encoding, source_file, box}` tuple.
+2. Cluster them with a threshold distance (~0.6 in L2 space, empirically good for dlib encodings).
+3. Write `/var/albumen/people.json` — a map of stable UUIDs → cluster metadata.
+
+The admin `/admin/people` UI will let you name clusters, merge duplicates,
+and remove false positives. Public `/people` routes will allow browsing by
+person.
+
 ---
 
 ## Security
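Review note on the planned clustering pass: the DESIGN.md text in this diff gives only a distance threshold (~0.6 in L2 space) and does not name an algorithm. As a minimal sketch, assuming a greedy single-pass assignment against each cluster's first member (an illustrative choice, not the planned `cluster_faces.rb`), the threshold test could look like the following. `l2_distance` and `greedy_cluster` are hypothetical names, and the 3-float vectors stand in for real 128-float encodings.

```ruby
# Euclidean (L2) distance between two equal-length encodings.
def l2_distance(a, b)
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

# Greedy clustering: put each encoding in the first cluster whose
# representative (its first member) is within `threshold`; otherwise
# start a new cluster. Returns an array of arrays of indices.
def greedy_cluster(encodings, threshold: 0.6)
  clusters = []
  encodings.each_with_index do |enc, i|
    home = clusters.find { |members| l2_distance(encodings[members.first], enc) <= threshold }
    home ? home << i : clusters << [i]
  end
  clusters
end

# Toy data: the first two vectors are close, the third is far away.
toy = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [1.0, 1.0, 1.0]]
p greedy_cluster(toy)
```

A greedy pass depends on input order, so a real implementation might prefer a method that does not, but the distance test itself would be the same.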
diff --git a/README.md b/README.md
index 8167c0b..c7a66fd 100644
--- a/README.md
+++ b/README.md
@@ -36,8 +36,9 @@ back end, plain HTML/CSS/JS front end.  Live at **https://albumen.jots.org**.
   **Force rescan all** checkbox bypasses the sentinel and rescans every directory
 
 ### Facial recognition (opt-in)
-- Detects faces in photos and stores 128-D embeddings alongside each image
-- Powered by [face_recognition](https://github.com/ageitgey/face_recognition) (dlib/HOG, CPU-only)
+- Detects faces in photos and stores 128-D embeddings in per-directory `faces.json` sidecar files
+- Powered by [face_recognition](https://github.com/ageitgey/face_recognition) (dlib CNN model, CPU-only)
+- Runs as a background daemon (`face_daemon.service`), completely decoupled from the update script
 - People management and browse-by-person UI in progress
 
 ### Media support
@@ -176,7 +177,9 @@ The PBKDF2-SHA256 hash is stored in `/opt/albumen/config.yml` (readable only by
 
 ## Facial recognition setup
 
-Face detection is opt-in. Install once, then enable in `config.yml`.
+Face detection is opt-in and runs as a **separate background daemon** so it
+never slows down the update script. Photos are indexed and browseable
+immediately after upload; faces are detected asynchronously in the background.
 
 ### 1. Install Python dependencies (server, ~30 min first time)
 
@@ -193,31 +196,47 @@ Add to `/opt/albumen/config.yml`:
 ```yaml
 faces:
   enabled: true
+  workers: 20         # parallel threads (set to ~nproc/3 to leave headroom)
+  poll_interval: 300  # seconds between full sweeps
 ```
 
-### 3. Run the update script
+### 3. Install and start the daemon
 
 ```bash
-ruby /opt/albumen/scripts/update.rb
+cp /opt/albumen/config/face_daemon.service /etc/systemd/system/
+systemctl daemon-reload
+systemctl enable face_daemon
+systemctl start face_daemon
 ```
 
-The update script will now detect faces in images and store bounding boxes and
-embeddings in each album's `album.json`. This is a one-time cost per image;
-subsequent runs skip already-processed photos.
+The daemon polls the media tree every `poll_interval` seconds, processes any
+images not yet in a `faces.json` sidecar, and logs to
+`/opt/albumen/log/albumen.log` with a `[faces]` prefix.
+
+Initial detection of a large library (~20,000 photos with CNN model on a
+64-core CPU) takes roughly 2.5–3.5 days. Only new photos are processed on
+subsequent passes.
 
 ---
 
 ## Service management
 
 ```bash
+# Web app
 systemctl status albumen          # is it running?
 systemctl restart albumen         # restart (e.g. after editing app.rb)
 journalctl -u albumen -f          # live service logs
-tail -f /opt/albumen/log/puma.stdout.log   # Puma access log
-tail -f /opt/albumen/log/puma.stderr.log   # Puma error log
+
+# Face detection daemon
+systemctl status face_daemon      # is it running?
+systemctl restart face_daemon     # restart the daemon
+journalctl -u face_daemon -f      # live daemon logs (systemd journal)
+
+# Shared activity log (both update.rb and face_daemon.rb write here)
+tail -f /opt/albumen/log/albumen.log
 ```
 
-The service runs as the `albumen` user.  App code lives in `/opt/albumen/`.
+Both services run as the `albumen` user.  App code lives in `/opt/albumen/`.
 
 ---
 
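Review note on sizing: the config comment in the README suggests setting `workers` to about nproc/3, and the README quotes 2.5 to 3.5 days for ~20,000 photos. A small sketch of both calculations, assuming integer division for the worker count and the 4 to 6 images/minute range quoted in DESIGN.md. `suggested_workers` and `initial_pass_days` are hypothetical helper names, not part of this commit.

```ruby
require 'etc'

# Roughly one third of the available cores, never fewer than one.
def suggested_workers(cores = Etc.nprocessors)
  [cores / 3, 1].max
end

# Days needed for an initial pass at a steady rate (images per minute).
def initial_pass_days(photos, images_per_minute)
  photos / images_per_minute.to_f / (60 * 24)
end

puts suggested_workers(64)                   # 64 / 3 with integer division
puts initial_pass_days(20_000, 6).round(2)   # fast end of the quoted range
puts initial_pass_days(20_000, 4).round(2)   # slow end of the quoted range
```

On a 64-core machine the heuristic gives 21 workers, close to the default of 20, and the two rates work out to roughly 2.3 and 3.5 days, in line with the quoted estimate.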
diff --git a/app.rb b/app.rb
index 6b11d5d..a923b4d 100644
--- a/app.rb
+++ b/app.rb
@@ -23,7 +23,8 @@ VIDEO_EXTS = %w[mp4 mov avi mkv webm m4v ogv].freeze
 AUDIO_EXTS = %w[mp3 flac ogg wav m4a aac].freeze
 MEDIA_EXTS = (IMAGE_EXTS + VIDEO_EXTS + AUDIO_EXTS).freeze
 
-APP_CONFIG = (File.exist?(CONFIG_PATH) ? YAML.load_file(CONFIG_PATH, symbolize_names: true) : {}).freeze
+APP_CONFIG     = (File.exist?(CONFIG_PATH) ? YAML.load_file(CONFIG_PATH, symbolize_names: true) : {}).freeze
+FACES_ENABLED  = (APP_CONFIG.dig(:faces, :enabled) == true).freeze
 
 # ── Sinatra config ─────────────────────────────────────────────────────────────
 
@@ -83,7 +84,17 @@ helpers do
     { 'files' => {}, 'visible' => true }
   end
 
+  def load_faces(dir)
+    path = File.join(dir, 'faces.json')
+    return {} unless File.exist?(path)
+    JSON.parse(File.read(path))
+  rescue JSON::ParserError
+    {}
+  end
+
   def album_files(dir, data)
+    face_data = FACES_ENABLED ? load_faces(dir) : {}
+
     files = Dir.children(dir)
                .sort
                .select { |n| MEDIA_EXTS.include?(File.extname(n).downcase.delete_prefix('.')) }
@@ -107,6 +118,7 @@ helpers do
         shutter:       meta['shutter'],
         iso:           meta['iso'],
         transcoded_to: meta['transcoded_to'],
+        faces:         face_data[name],
       }
     end
 
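Review note: because `album_files` passes `face_data[name]` straight through, a consumer of `:faces` sees `nil` both for images the daemon has not reached and for images whose last attempt failed. A minimal sketch of telling apart the four states listed in the `faces.json` table, given the parsed hash. `face_state` is a hypothetical helper, not part of this commit.

```ruby
require 'json'

# Map a filename to one of the four states documented for faces.json.
def face_state(face_data, name)
  return :pending unless face_data.key?(name)  # key absent: not yet processed
  faces = face_data[name]
  return :error if faces.nil?                  # null: failed, will be retried
  faces.empty? ? :no_faces : :detected
end

# Sample sidecar content; the encoding is left empty for brevity.
face_data = JSON.parse(
  '{"a.jpg": [{"box": [1, 2, 3, 4], "encoding": []}], "b.jpg": [], "c.jpg": null}'
)
%w[a.jpg b.jpg c.jpg d.jpg].each { |n| puts "#{n}: #{face_state(face_data, n)}" }
```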
diff --git a/config/face_daemon.service b/config/face_daemon.service
new file mode 100644
index 0000000..4babccc
--- /dev/null
+++ b/config/face_daemon.service
@@ -0,0 +1,27 @@
+[Unit]
+Description=Albumen face detection daemon
+After=network.target albumen.service
+Wants=albumen.service
+
+[Service]
+Type=simple
+User=albumen
+Group=albumen
+WorkingDirectory=/opt/albumen
+
+Environment=MEDIA_ROOT=/var/albumen
+Environment=CONFIG_PATH=/opt/albumen/config.yml
+Environment=LOG_PATH=/opt/albumen/log/albumen.log
+Environment=VENV_PYTHON=/opt/albumen/venv/bin/python3
+Environment=FACES_SCRIPT=/opt/albumen/scripts/faces.py
+
+ExecStart=/usr/bin/ruby /opt/albumen/scripts/face_daemon.rb
+Restart=on-failure
+RestartSec=30
+
+StandardOutput=journal
+StandardError=journal
+SyslogIdentifier=albumen-faces
+
+[Install]
+WantedBy=multi-user.target
diff --git a/scripts/face_daemon.rb b/scripts/face_daemon.rb
new file mode 100644
index 0000000..5e817cd
--- /dev/null
+++ b/scripts/face_daemon.rb
@@ -0,0 +1,148 @@
+#!/usr/bin/env ruby
+# frozen_string_literal: true
+#
+# Face detection daemon for Albumen.
+#
+# Polls MEDIA_ROOT for images not yet in a per-directory faces.json sidecar
+# and runs faces.py (dlib CNN model) on them.  Never touches album.json —
+# zero write contention with update.rb.
+#
+# faces.json schema (per directory):
+#   filename → null              error during detection; will retry next pass
+#   filename → []                processed, no faces found
+#   filename → [{box,encoding}]  face data
+#   (key absent)                 not yet processed
+#
+# Configuration — from ENV or /opt/albumen/config.yml (under `faces:` key):
+#   workers:        20   # ThreadPoolExecutor workers passed to faces.py
+#   poll_interval: 300   # seconds between full-tree sweeps
+#
+# Signal handling: SIGTERM / SIGINT triggers graceful shutdown between dirs.
+
+require 'json'
+require 'yaml'
+require 'fileutils'
+require 'open3'
+
+MEDIA_ROOT   = (ENV['MEDIA_ROOT']   || '/var/albumen').freeze
+CONFIG_PATH  = (ENV['CONFIG_PATH']  || '/opt/albumen/config.yml').freeze
+LOG_PATH     = (ENV['LOG_PATH']     || '/opt/albumen/log/albumen.log').freeze
+VENV_PYTHON  = (ENV['VENV_PYTHON']  || '/opt/albumen/venv/bin/python3').freeze
+FACES_SCRIPT = (ENV['FACES_SCRIPT'] || '/opt/albumen/scripts/faces.py').freeze
+
+IMAGE_EXTS = %w[jpg jpeg png gif webp heic heif tiff bmp].freeze
+
+_cfg           = File.exist?(CONFIG_PATH) ? (YAML.load_file(CONFIG_PATH, symbolize_names: true) rescue {}) : {}
+FACES_WORKERS  = (_cfg.dig(:faces, :workers)       || 20).to_i.freeze
+POLL_INTERVAL  = (_cfg.dig(:faces, :poll_interval) || 300).to_i.freeze
+
+$shutdown = false
+Signal.trap('TERM') { $shutdown = true }
+Signal.trap('INT')  { $shutdown = true }
+
+# ── Logging ───────────────────────────────────────────────────────────────────
+
+def log(msg)
+  $stdout.puts msg
+  $stdout.flush
+  ts = Time.now.strftime('%Y-%m-%d %H:%M:%S')
+  File.open(LOG_PATH, 'a') { |f| f.puts "[#{ts}] [faces] #{msg}" }
+rescue StandardError
+  # never crash on log failure
+end
+
+# ── faces.json helpers ────────────────────────────────────────────────────────
+
+def load_faces_json(path)
+  return {} unless File.exist?(path)
+  JSON.parse(File.read(path))
+rescue JSON::ParserError
+  {}
+end
+
+def save_faces_atomic(path, data)
+  tmp = "#{path}.tmp.#{Process.pid}"
+  File.write(tmp, JSON.generate(data))
+  File.rename(tmp, path)
+rescue StandardError => e
+  File.unlink(tmp) rescue nil
+  log "  Error saving #{path}: #{e.message}"
+end
+
+# Returns image filenames that still need processing.
+# null-valued entries (prior errors) are retried; [] entries are done.
+def pending_images(dir)
+  faces = load_faces_json(File.join(dir, 'faces.json'))
+  Dir.children(dir)
+     .select { |n| IMAGE_EXTS.include?(File.extname(n).downcase.delete_prefix('.')) }
+     .reject { |n| faces.key?(n) && !faces[n].nil? }
+     .sort
+end
+
+# ── Core processing ───────────────────────────────────────────────────────────
+
+def process_dir(dir)
+  pending = pending_images(dir)
+  return if pending.empty?
+
+  rel   = dir.delete_prefix(MEDIA_ROOT).delete_prefix('/')
+  label = rel.empty? ? '(root)' : rel
+  log "#{label}: #{pending.size} image(s) pending"
+
+  paths = pending.map { |n| File.join(dir, n) }
+  cmd   = [VENV_PYTHON, FACES_SCRIPT, '--workers', FACES_WORKERS.to_s, *paths]
+
+  stdout, stderr, status = Open3.capture3(*cmd)
+
+  unless status.success? || stdout.strip.start_with?('{')
+    log "  faces.py error (exit #{status.exitstatus}): #{stderr.strip}"
+    return
+  end
+
+  begin
+    results = JSON.parse(stdout)
+  rescue JSON::ParserError => e
+    log "  faces.py output is not valid JSON: #{e.message}"
+    return
+  end
+
+  faces_path = File.join(dir, 'faces.json')
+  faces = load_faces_json(faces_path)  # re-read before writing (pick up concurrent changes)
+
+  pending.each do |name|
+    full          = File.join(dir, name)
+    faces[name]   = results[full]
+    detail = faces[name].nil? ? 'error (will retry)' :
+             faces[name].empty? ? 'no faces' :
+             "#{faces[name].length} face(s)"
+    log "  #{name}: #{detail}"
+  end
+
+  save_faces_atomic(faces_path, faces)
+end
+
+def run_pass
+  dirs = [MEDIA_ROOT] + Dir.glob("#{MEDIA_ROOT}/**/*/").sort
+  dirs.each do |dir|
+    return if $shutdown
+    process_dir(dir)
+  end
+end
+
+# ── Main loop ─────────────────────────────────────────────────────────────────
+
+log "Starting (workers=#{FACES_WORKERS}, poll_interval=#{POLL_INTERVAL}s, media=#{MEDIA_ROOT})"
+
+loop do
+  break if $shutdown
+  run_pass
+  break if $shutdown
+
+  # Sleep in 1-second increments so SIGTERM/SIGINT takes effect promptly
+  POLL_INTERVAL.times do
+    break if $shutdown
+    sleep 1
+  end
+end
+
+log 'Shutting down.'
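Review note: `process_dir` indexes `results[full]` directly, so it relies on `faces.py` always printing a JSON object. The batch code removed from `update.rb` in this commit guarded that case with `raise 'expected Hash'`. A sketch of an equivalent guard, assuming a hypothetical helper `parse_faces_output` that returns `nil` for anything unusable so the caller can log and skip the directory:

```ruby
require 'json'

# Parse faces.py stdout. Returns the Hash of path => faces, or nil when
# the output is not valid JSON or is valid JSON but not an object.
def parse_faces_output(stdout)
  parsed = JSON.parse(stdout)
  parsed.is_a?(Hash) ? parsed : nil
rescue JSON::ParserError
  nil
end

p parse_faces_output('{"/var/albumen/a.jpg": []}')  # usable result
p parse_faces_output('[]')                          # valid JSON, wrong shape
p parse_faces_output('Traceback ...')               # not JSON at all
```

Without such a check, a JSON array on stdout would raise a `TypeError` at `results[full]`, which the `JSON::ParserError` rescue in `process_dir` does not catch.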
diff --git a/scripts/update.rb b/scripts/update.rb
index 5671330..f909510 100644
--- a/scripts/update.rb
+++ b/scripts/update.rb
@@ -16,6 +16,9 @@
 #   - Safe to re-run at any time; all operations are idempotent.
 #   - Unchanged directories are skipped via a .albumen_scanned sentinel file;
 #     pass --force to bypass.
+#
+# Face detection is NOT handled here. Run face_daemon.rb (or let the systemd
+# service manage it) to detect faces and write per-directory faces.json files.
 
 require 'json'
 require 'yaml'
@@ -23,27 +26,30 @@ require 'fileutils'
 require 'mini_magick'
 require 'mini_exiftool'
 
-MEDIA_ROOT  = (ENV['MEDIA_ROOT']  || '/var/albumen').freeze
-CACHE_ROOT  = (ENV['CACHE_ROOT']  || '/opt/albumen/cache/thumbs').freeze
-CONFIG_PATH = (ENV['CONFIG_PATH'] || '/opt/albumen/config.yml').freeze
-THUMB_SIZE  = 300
+MEDIA_ROOT = (ENV['MEDIA_ROOT'] || '/var/albumen').freeze
+CACHE_ROOT = (ENV['CACHE_ROOT'] || '/opt/albumen/cache/thumbs').freeze
+LOG_PATH   = (ENV['LOG_PATH']   || '/opt/albumen/log/albumen.log').freeze
+THUMB_SIZE = 300
 
 IMAGE_EXTS     = %w[jpg jpeg png gif webp heic heif tiff bmp].freeze
 VIDEO_EXTS     = %w[mp4 mov avi mkv webm m4v ogv].freeze
 AUDIO_EXTS     = %w[mp3 flac ogg wav m4a aac].freeze
 MEDIA_EXTS     = (IMAGE_EXTS + VIDEO_EXTS + AUDIO_EXTS).freeze
-TRANSCODE_EXTS = %w[avi mkv mov].freeze  # not universally browser-playable; convert to MP4
+TRANSCODE_EXTS = %w[avi mkv mov].freeze
 SENTINEL_FILE  = '.albumen_scanned'.freeze
 
-_cfg          = File.exist?(CONFIG_PATH) ? YAML.load_file(CONFIG_PATH, symbolize_names: true) : {}
-FACES_ENABLED = (_cfg.dig(:faces, :enabled) == true).freeze
-FACES_WORKERS = (_cfg.dig(:faces, :workers) || 4).freeze
-VENV_PYTHON   = File.expand_path('../venv/bin/python3', __dir__).freeze
-FACES_SCRIPT  = File.expand_path('faces.py', __dir__).freeze
-
 # Explicit directory argument implies force — you asked for it, it should run.
 FORCE_UPDATE = !!(ARGV.delete('--force') || ARGV[0])
 
+def log(msg)
+  $stdout.puts msg
+  $stdout.flush
+  ts = Time.now.strftime('%Y-%m-%d %H:%M:%S')
+  File.open(LOG_PATH, 'a') { |f| f.puts "[#{ts}] [update] #{msg}" }
+rescue StandardError
+  # never crash on log failure
+end
+
 # ── Directory processing ───────────────────────────────────────────────────────
 
 def process_dir(dir, idx, total)
@@ -51,20 +57,15 @@ def process_dir(dir, idx, total)
   label  = rel.empty? ? '(root)' : rel
   prefix = "[#{idx}/#{total}]"
 
-  pending_faces = false
   unless FORCE_UPDATE
     sentinel = File.join(dir, SENTINEL_FILE)
     if File.exist?(sentinel) && File.mtime(sentinel) >= File.mtime(dir)
-      if faces_pending?(dir)
-        pending_faces = true   # fall through, but only to run face detection
-      else
-        puts "#{prefix} Skipping #{label} (unchanged)"
-        return
-      end
+      log "#{prefix} Skipping #{label} (unchanged)"
+      return
     end
   end
 
-  puts "#{prefix} Scanning #{label}#{' (face detection pending)' if pending_faces}"
+  log "#{prefix} Scanning #{label}"
 
   json_path = File.join(dir, 'album.json')
   data = load_json(json_path)
@@ -84,9 +85,9 @@ def process_dir(dir, idx, total)
     thumb = File.join(CACHE_ROOT, rel.empty? ? "#{n}.th.jpg" : "#{rel}/#{n}.th.jpg")
     if File.exist?(thumb)
       File.unlink(thumb)
-      puts "  Removed: #{n} (+ thumb)"
+      log "  Removed: #{n} (+ thumb)"
     else
-      puts "  Removed: #{n}"
+      log "  Removed: #{n}"
     end
   end
 
@@ -96,20 +97,19 @@ def process_dir(dir, idx, total)
     base   = File.basename(name, '.*')
     target = "#{base}.mp4"
     if current.include?(target)
-      # MP4 already exists — just ensure the marker is recorded
       data['files'][name] ||= {}
       data['files'][name]['transcoded_to'] = target
       next
     end
     full = File.join(dir, name)
     dest = File.join(dir, target)
-    puts "  Transcoding: #{name} → #{target}"
+    log "  Transcoding: #{name} → #{target}"
     transcode_to_mp4(full, dest)
     if File.exist?(dest)
       data['files'][name] ||= {}
       data['files'][name]['transcoded_to'] = target
       current << target
-      puts "    → done"
+      log "    → done"
     else
       warn "  Transcode failed: #{name}"
     end
@@ -132,8 +132,6 @@ def process_dir(dir, idx, total)
     generate_thumb_if_needed(full, rel, name, ext)
   end
 
-  batch_detect_faces(dir, current, data) if FACES_ENABLED
-
   atomic_write_json(json_path, data)
   FileUtils.touch(File.join(dir, SENTINEL_FILE))
 end
@@ -152,7 +150,7 @@ def enrich_image(full, name, meta)
         raw = exif.date_time_original || exif.create_date || exif.date_time
         if raw
           meta['taken_at'] = raw.respond_to?(:strftime) ? raw.strftime('%Y-%m-%dT%H:%M:%S') : raw.to_s
-          puts "  #{name}: taken_at = #{meta['taken_at']}"
+          log "  #{name}: taken_at = #{meta['taken_at']}"
         end
       end
 
@@ -169,14 +167,12 @@ def enrich_image(full, name, meta)
       warn "  #{name}: EXIF error — #{e.message}"
     end
 
-    # If exiftool found nothing at all, record that so we don't retry on every re-scan.
     if meta['taken_at'].nil? && meta['camera'].nil? &&
        meta['aperture'].nil? && meta['shutter'].nil? && meta['iso'].nil?
       meta['exif_absent'] = true
     end
   end
 
-  # Dimensions (skip if already recorded)
   if meta['width'].nil?
     begin
       img = MiniMagick::Image.open(full)
@@ -186,37 +182,6 @@ def enrich_image(full, name, meta)
       warn "  #{name}: dimension error — #{e.message}"
     end
   end
-
-end
-
-def batch_detect_faces(dir, names, data)
-  return unless File.exist?(VENV_PYTHON) && File.exist?(FACES_SCRIPT)
-
-  unprocessed = names.select do |name|
-    IMAGE_EXTS.include?(File.extname(name).downcase.delete_prefix('.')) &&
-      (data['files'][name] || {})['faces'].nil?
-  end
-  return if unprocessed.empty?
-
-  puts "  Detecting faces in #{unprocessed.length} image(s) (#{FACES_WORKERS} workers)…"
-  paths = unprocessed.map { |n| File.join(dir, n) }
-  cmd   = [VENV_PYTHON, FACES_SCRIPT, '--workers', FACES_WORKERS.to_s] + paths
-
-  begin
-    out     = IO.popen(cmd, err: '/dev/null', &:read).strip
-    results = JSON.parse(out.empty? ? '{}' : out)
-    raise 'expected Hash' unless results.is_a?(Hash)
-
-    results.each do |path, faces|
-      name = File.basename(path)
-      next unless data['files'].key?(name)
-      next if faces.nil?   # error on this file — leave faces: null to retry
-      data['files'][name]['faces'] = faces
-      puts "  #{name}: #{faces.length} face(s)" unless faces.empty?
-    end
-  rescue StandardError => e
-    warn "  Face detection batch error — #{e.message}"
-  end
 end
 
 def enrich_video(full, name, meta)
@@ -232,12 +197,12 @@ end
 # ── Thumbnail generation ───────────────────────────────────────────────────────
 
 def generate_thumb_if_needed(full, rel, name, ext)
-  return if AUDIO_EXTS.include?(ext)   # audio uses a static icon
+  return if AUDIO_EXTS.include?(ext)
 
   cache = File.join(CACHE_ROOT, rel.empty? ? "#{name}.th.jpg" : "#{rel}/#{name}.th.jpg")
   return if File.exist?(cache)
 
-  puts "  Generating thumb: #{name}"
+  log "  Generating thumb: #{name}"
   FileUtils.mkdir_p(File.dirname(cache))
 
   if VIDEO_EXTS.include?(ext)
@@ -295,16 +260,6 @@ rescue JSON::ParserError => e
   {}
 end
 
-def faces_pending?(dir)
-  return false unless FACES_ENABLED
-  json_path = File.join(dir, 'album.json')
-  return false unless File.exist?(json_path)
-  (load_json(json_path)['files'] || {}).any? do |name, meta|
-    IMAGE_EXTS.include?(File.extname(name).downcase.delete_prefix('.')) &&
-      meta['faces'].nil?
-  end
-end
-
 # Fields the admin controls — never overwrite with stale values from our earlier read.
 ADMIN_ALBUM_KEYS = %w[title description cover cover_dynamic sort_reverse visible].freeze
 ADMIN_FILE_KEYS  = %w[title caption visible].freeze
@@ -348,7 +303,7 @@ if Process.uid == 0
   begin
     require 'etc'
     pw = Etc.getpwnam(service_user)
-    puts "Fixing ownership of #{start} → #{service_user}"
+    log "Fixing ownership of #{start} → #{service_user}"
     FileUtils.chown_R(pw.uid, pw.gid, start)
   rescue ArgumentError
     warn "Warning: user '#{service_user}' not found; skipping chown"
@@ -361,4 +316,4 @@ dirs  = dirs.uniq
 total = dirs.size
 dirs.each_with_index { |d, i| process_dir(d, i + 1, total) }
 
-puts 'Done.'
+log 'Done.'
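Review note: the `log` helper added to `update.rb` and the one in `face_daemon.rb` differ only in the `[update]` / `[faces]` tag. A sketch of one way to share the logic, assuming a hypothetical `build_logger` method (no such helper exists in this commit) that returns a lambda bound to a tag and a log path:

```ruby
require 'tmpdir'

# Build a logger that echoes to stdout and appends a timestamped,
# tagged line to the file at `path`. Log failures are swallowed.
def build_logger(tag, path)
  lambda do |msg|
    $stdout.puts msg
    $stdout.flush
    ts = Time.now.strftime('%Y-%m-%d %H:%M:%S')
    begin
      File.open(path, 'a') { |f| f.puts "[#{ts}] [#{tag}] #{msg}" }
    rescue StandardError
      # never crash on log failure
    end
  end
end

# Demonstration against a throwaway directory rather than /opt/albumen/log.
Dir.mktmpdir do |tmp|
  path = File.join(tmp, 'albumen.log')
  build_logger('update', path).call('Scanning (root)')
  build_logger('faces', path).call('(root): 3 image(s) pending')
  puts File.read(path)
end
```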