Supporting Reference 3: Troubleshooting

Use this page when a run fails or output quality is unexpectedly weak. It is organized for fast diagnosis and safe recovery.

R3.1 First-response triage order

  1. Read dist/logs/latest_run.log for EXE runs (or logs/latest_run.log for Python runs) and locate the first blocking error.
  2. If publish is involved, read logs/db_uploader.log for the first failing row.
  3. Check row state in review_queue by Review_Status/QC_Status.
  4. Fix the root cause before retrying (model/dependency/path/credentials).
  5. Rerun safely through the normal flow; do not patch ad-hoc state blindly.
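
Step 1 of the triage order can be sketched as a small scan that returns the first suspicious line of a log. This is a minimal sketch, not part of the application: the function name first_blocking_error and the marker strings in ERROR_MARKERS are assumptions, since the actual log format is not documented on this page. Adjust the markers to whatever the real logs emit.

```python
from pathlib import Path
from typing import Optional, Tuple

# Assumed markers; the real log format is not documented on this page.
ERROR_MARKERS = ("Traceback (most recent call last)", "ERROR", "CRITICAL")


def first_blocking_error(log_path: str) -> Optional[Tuple[int, str]]:
    """Return (line_number, text) of the first line containing a marker.

    Returns None when the file does not exist or no marker is found.
    """
    path = Path(log_path)
    if not path.exists():
        return None
    # errors="replace" keeps the scan going even if the log has odd bytes.
    with path.open("r", encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if any(marker in line for marker in ERROR_MARKERS):
                return number, line.rstrip("\n")
    return None
```

For example, first_blocking_error("dist/logs/latest_run.log") covers EXE runs and first_blocking_error("logs/latest_run.log") covers Python runs. Only the first hit is returned on purpose: later errors are often consequences of the first one.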

R3.2 Caption/prefill failures (Ollama)

Symptoms: caption stage stalls, timeout errors, or missing model exceptions.

Checks:

Fix path:

  1. Pull the missing model with ollama pull <model>.
  2. Restart/retry once after service health is restored.
  3. If the failure repeats, reduce parallel load and re-run a controlled batch.
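
Before pulling, it can help to confirm whether the model is actually missing. The sketch below parses the text printed by the ollama list command and checks for a model name. The helper names are hypothetical, and the parsing assumes the usual tabular output with a header row and the model name in the first column; a name without a tag is treated as the latest tag, which is how Ollama resolves untagged names.

```python
import subprocess
from typing import List


def parse_model_names(listing: str) -> List[str]:
    """Extract the first column of `ollama list` output, skipping the header."""
    names = []
    for line in listing.splitlines()[1:]:
        parts = line.split()
        if parts:
            names.append(parts[0])
    return names


def model_is_installed(model: str, listing: str) -> bool:
    """Check whether `model` appears in the listing text."""
    # An untagged name such as "llava" is resolved as "llava:latest".
    wanted = model if ":" in model else model + ":latest"
    return wanted in parse_model_names(listing)


def installed_listing() -> str:
    """Run `ollama list` and return its stdout (requires the ollama CLI)."""
    result = subprocess.run(
        ["ollama", "list"],
        capture_output=True,
        text=True,
        check=True,
        timeout=30,
    )
    return result.stdout
```

A call such as model_is_installed("<model>", installed_listing()) returning False suggests that step 1 applies. If installed_listing() itself raises or times out, the service is probably unhealthy and pulling a model will not help yet.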

R3.3 Quality scoring failures

Symptoms: scoring stage crash, missing metric fields, or repeated stage abort.

Checks:

Fix path:

  1. Rebuild or repair the venv if imports fail.
  2. If the error mentions a python312.dll conflict, relaunch the EXE with AMIR_PYTHON set to the 3.13 interpreter.
  3. Restore any missing model weight file.
  4. Retry the run from the start once the environment is healthy.
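
The AMIR_PYTHON relaunch in step 2 can be sketched as follows. Only the variable name AMIR_PYTHON comes from this page; the helper names and any paths passed to them are hypothetical placeholders, and the sketch assumes the EXE reads the variable from its process environment at startup.

```python
import os
import subprocess
from typing import Dict, Mapping, Optional


def env_with_interpreter(
    python_path: str, base: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Copy the environment and point AMIR_PYTHON at the given interpreter."""
    env = dict(os.environ if base is None else base)
    env["AMIR_PYTHON"] = python_path
    return env


def relaunch(exe_path: str, python_path: str) -> subprocess.Popen:
    """Start the EXE with AMIR_PYTHON set for that process only."""
    # Both paths are supplied by the caller; nothing is hardcoded here.
    return subprocess.Popen([exe_path], env=env_with_interpreter(python_path))
```

Setting the variable only for the child process keeps the change scoped to a single launch, so a wrong interpreter path does not linger in the user or system environment.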

R3.4 Resize failures (QC_Status=ResizeFailed)

Symptoms: rows flagged as ResizeFailed and caption prefill skipped.

Checks:

Fix path: correct the filesystem constraint, then rerun the affected set through the normal pipeline entry.
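
To see which files make up the affected set, the flagged rows can be listed read-only. This is a sketch under assumptions: it supposes review_queue is a SQLite table that has a File_Name column alongside QC_Status, and the function name is hypothetical. The database location is not stated on this page, so the connection is passed in by the caller.

```python
import sqlite3
from typing import List


def resize_failed_files(conn: sqlite3.Connection) -> List[str]:
    """Return File_Name values of rows flagged ResizeFailed, sorted by name."""
    # Assumed schema: review_queue(File_Name, QC_Status, ...).
    rows = conn.execute(
        "SELECT File_Name FROM review_queue "
        "WHERE QC_Status = ? ORDER BY File_Name",
        ("ResizeFailed",),
    ).fetchall()
    return [row[0] for row in rows]
```

The query only reads; it does not reset any status, which keeps it consistent with the advice to rerun through the normal pipeline entry rather than editing state by hand.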

R3.5 Publish failures (FTP/MySQL)

Symptoms: uploader errors, partial remote updates, or missing published records.

Checks:

Fix path:

  1. Resolve the auth/connectivity issue.
  2. Retry publish for approved rows.
  3. Verify upsert completion by File_Name in the MySQL table photos_info_revamp.
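
The verification in step 3 can be sketched as a parameterized lookup plus a set difference. The table photos_info_revamp and the column File_Name come from this page; the helper names are hypothetical. The %s placeholders assume a DB-API driver that accepts that style, such as PyMySQL or mysql-connector-python, and the connection itself is left to the caller.

```python
from typing import Iterable, List, Sequence, Tuple


def build_lookup_query(file_names: Sequence[str]) -> Tuple[str, List[str]]:
    """Build a parameterized SELECT for the given file names."""
    if not file_names:
        raise ValueError("file_names must not be empty")
    # One placeholder per name; values are never interpolated into the SQL.
    placeholders = ", ".join(["%s"] * len(file_names))
    sql = (
        "SELECT File_Name FROM photos_info_revamp "
        f"WHERE File_Name IN ({placeholders})"
    )
    return sql, list(file_names)


def missing_after_publish(
    expected: Iterable[str], found_rows: Iterable[Sequence[str]]
) -> List[str]:
    """Return expected file names that are absent from the query result."""
    found = {row[0] for row in found_rows}
    return sorted(set(expected) - found)
```

With an open cursor, the flow would be: sql, params = build_lookup_query(names), then cursor.execute(sql, params), then missing_after_publish(names, cursor.fetchall()). An empty result means every approved row reached the table; any names returned are candidates for a publish retry.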

R3.6 SQLite lock or inconsistency

R3.7 Filename collision issues

R3.8 App crash or forced close (continue safely)

Symptoms: app window closes unexpectedly, or Windows Event Viewer shows native crash events (for example 0xc0000005 / BEX64).

Safe continue path:

  1. Restart the app.
  2. Click Recover crash session next to Clear all.
  3. Verify recovered queue size, then continue with Start Batch.

Recovery file: data/multiset_session.json. Optional backups: data/multiset_session.backup_*.json.

If the crash happened while clicking Add set, inspect data/crash_runtime.log first. For high-volume queue building, you can disable automatic subject generation with AUTO_AI_SUBJECT_ON_SELECT=0 and run AI suggest manually per set.
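
Before clicking Recover crash session, it can be useful to confirm that the recovery file or one of its backups is at least readable. The sketch below only checks that each candidate parses as JSON; the session schema is not documented on this page, so no fields are inspected. The function name is hypothetical, and sorting backups by file name in reverse is an assumption that the backup_* suffix sorts chronologically.

```python
import json
from pathlib import Path
from typing import List


def readable_session_files(data_dir: str) -> List[Path]:
    """Return session files that parse as JSON: primary first, then backups."""
    root = Path(data_dir)
    candidates = [root / "multiset_session.json"]
    # Assumption: a later backup suffix sorts later by name.
    candidates += sorted(root.glob("multiset_session.backup_*.json"), reverse=True)
    readable = []
    for path in candidates:
        try:
            with path.open("r", encoding="utf-8") as handle:
                json.load(handle)
        except (OSError, ValueError):
            # Missing, unreadable, or corrupt file: skip it.
            continue
        readable.append(path)
    return readable
```

Calling readable_session_files("data") and getting an empty list suggests the crash also damaged the session state, in which case rebuilding the queue manually may be safer than recovering.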

R3.9 When to escalate to runbook flow

Escalation path: Step 3: Runbook.

R3.10 Keyword context pollution (geo mismatch)

Symptoms: unrelated location keywords appear (for example usa, colorado) in non-USA images.

Checks:

Fix path:

  1. Remove unrelated keywords in the review editor for the current run.
  2. Keep keyword generation location-aware and avoid hardcoded geography pools.
  3. For already-published rows, run a targeted MySQL cleanup by run scope before republishing.

For the affected 321-row publish batch of 2026-02-18, use data/mysql_cleanup_exact_run321_20260218.sql from the automation project root.

R3.11 Semantic drift (terrain/time-of-day mismatch)

Symptoms: captions/alt/keywords conflict with obvious set context, for example NL/polder images getting mountain terms or sunrise filenames getting sunset text.

Checks:

Fix path:

  1. For current queue rows, edit the conflicting metadata directly in the review editor and continue.
  2. For recurrent drift, confirm that the current build includes the context guardrails in caption_review_local.py, then rebuild the EXE.
  3. Re-run only the affected queued rows where needed.

R3.12 Continuation path

© 2026 Amir Darzi