Supporting Reference 3: Troubleshooting

Use this page when a run fails or output quality is unexpectedly weak. It is optimized for fast diagnosis and safe recovery.

R3.1 First-response triage order

  1. Read dist/logs/latest_run.log for EXE runs (or logs/latest_run.log for Python runs) and locate the first blocking error.
  2. If publish is involved, read logs/db_uploader.log and locate the first failing row.
  3. Check the row state in review_queue by Review_Status/QC_Status.
  4. Fix the root cause before retrying (model/dependency/path/credentials).
  5. Rerun safely through the normal flow; do not patch ad-hoc state blindly.
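
Steps 1 and 2 can be partly automated. The sketch below is a minimal example, not part of the tool: it assumes that blocking errors are marked by the words ERROR, CRITICAL, or Traceback, which this page does not confirm, and the function names are hypothetical.

```python
import re
from pathlib import Path

# Assumed markers: the real log format is not documented on this page.
ERROR_PATTERN = re.compile(r"\b(ERROR|CRITICAL|Traceback)\b")


def first_blocking_error(lines, context=3):
    """Return (line_number, excerpt) for the first matching line, or None."""
    lines = list(lines)
    for index, line in enumerate(lines):
        if ERROR_PATTERN.search(line):
            # Keep the matching line plus a few following lines for context.
            excerpt = [item.rstrip("\n") for item in lines[index:index + 1 + context]]
            return index + 1, excerpt
    return None


def scan_log(path):
    """Scan one log file; return None if the file is absent or has no match."""
    log_path = Path(path)
    if not log_path.exists():
        return None
    with log_path.open(encoding="utf-8", errors="replace") as handle:
        return first_blocking_error(handle)
```

For example, scan_log("dist/logs/latest_run.log") followed by scan_log("logs/db_uploader.log") follows the triage order above. Adjust ERROR_PATTERN once the actual log wording is known.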

R3.2 Caption/prefill failures (Ollama)

Symptoms: caption stage stalls, timeout errors, or missing model exceptions.

Checks:

Fix path:

  1. Pull the missing model with ollama pull <model>.
  2. If the expected GPU mode is not used, set or verify: OLLAMA_FORCE_GPU=1, OLLAMA_RESTART_FOR_GPU=1, OLLAMA_LLM_LIBRARY=cuda, OLLAMA_NUM_GPU, OLLAMA_MAIN_GPU.
  3. Restart/retry once after service health is restored.
  4. If the same row repeatedly fails with quarantine markers (PrefillNativeCrash), inspect the row-specific failure reason first instead of re-running full batches blindly.
  5. If the failure repeats, reduce parallel load and re-run a controlled batch.
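
Before retrying, it can help to confirm that the Ollama service responds and that the model is installed. The sketch below queries Ollama's /api/tags endpoint on its default local address. The model name in the usage note is only an illustration, and treating a bare name as the :latest tag is an assumption about how the model was pulled.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address


def installed_models(tags_payload):
    """Extract model names from an /api/tags response body."""
    return {entry.get("name", "") for entry in tags_payload.get("models", [])}


def has_model(tags_payload, wanted):
    """Check for a model; a bare name is treated as '<name>:latest' (assumption)."""
    if ":" not in wanted:
        wanted = wanted + ":latest"
    return wanted in installed_models(tags_payload)


def fetch_tags(base_url=OLLAMA_URL, timeout=5):
    """Return the parsed /api/tags payload, or None if the service is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
            return json.load(response)
    except (OSError, ValueError):
        # OSError covers connection failures and timeouts; ValueError covers bad JSON.
        return None
```

If fetch_tags() returns None, restore service health first. If has_model(payload, "llava") is False for the model the pipeline expects (name hypothetical here), step 1 applies.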

R3.3 Quality scoring failures

Symptoms: scoring stage crash, missing metric fields, or repeated stage abort.

Checks:

Fix path:

  1. Rebuild or repair the venv if imports fail.
  2. If the error mentions a python312.dll conflict, relaunch the EXE with AMIR_PYTHON set to the 3.13 interpreter.
  3. Restore the missing model weight file.
  4. Retry the run from the start after the environment is healthy.

-- Quick completeness check: counts NULL values per metric column
SELECT
  SUM(nima_score IS NULL) AS nima_null,
  SUM(blur_score IS NULL) AS blur_null,
  SUM(brightness_score IS NULL) AS brightness_null,
  SUM(contrast_score IS NULL) AS contrast_null,
  SUM(brisque_score IS NULL) AS brisque_null,
  SUM(clip_aesthetic_score IS NULL) AS clip_null,
  SUM(QR IS NULL) AS qr_null
FROM photos_info_revamp;
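
The same check can be scripted. The sketch below builds the query from a column list and runs it on any DB-API connection. The column names come from the query above; the function name is hypothetical, and the way the connection is opened (MySQL driver, credentials) is left to the caller.

```python
METRIC_COLUMNS = [
    "nima_score", "blur_score", "brightness_score", "contrast_score",
    "brisque_score", "clip_aesthetic_score", "QR",
]


def null_counts(connection, table="photos_info_revamp", columns=METRIC_COLUMNS):
    """Return {column: number_of_NULL_rows} for the given metric columns."""
    # Identifiers cannot be bound as query parameters, so pass trusted names only.
    select_list = ", ".join(f"SUM({column} IS NULL)" for column in columns)
    cursor = connection.cursor()
    cursor.execute(f"SELECT {select_list} FROM {table}")
    row = cursor.fetchone()
    cursor.close()
    # SUM over an empty table yields NULL, which is reported as 0 here.
    return {column: int(value or 0) for column, value in zip(columns, row)}
```

Any non-zero count points at rows where the scoring stage did not write that metric.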

R3.4 Resize failures (QC_Status=ResizeFailed)

Symptoms: rows flagged as ResizeFailed and caption prefill skipped.

Checks:

Fix path: correct the filesystem constraints, then rerun the affected set through the normal pipeline entry.
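
This page does not list which filesystem constraints apply. The sketch below assumes three common ones: the output directory is missing or not writable, free disk space is low, or a path exceeds the classic Windows limit. The thresholds and the function name are assumptions, not values taken from the tool.

```python
import os
import shutil
from pathlib import Path


def filesystem_report(directory, min_free_bytes=500 * 1024 * 1024, max_path_chars=259):
    """Return a list of human-readable problems found under a directory."""
    target = Path(directory)
    if not target.is_dir():
        return [f"missing directory: {target}"]
    problems = []
    if not os.access(target, os.W_OK):
        problems.append(f"not writable: {target}")
    free = shutil.disk_usage(target).free
    if free < min_free_bytes:
        problems.append(f"low disk space: {free} bytes free")
    for entry in target.rglob("*"):
        # Assumed limit: 259 usable characters for a classic Windows path.
        if len(str(entry.absolute())) > max_path_chars:
            problems.append(f"path too long: {entry}")
    return problems
```

An empty list means none of the assumed constraints is violated, so the cause lies elsewhere.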

R3.5 Publish failures (FTP/MySQL)

Symptoms: uploader errors, partial remote updates, or missing published records.

Checks:

Fix path:

  1. Resolve the auth/connectivity issue.
  2. Retry publish for approved rows.
  3. Verify upsert completion by File_Name in the MySQL table photos_info_revamp.
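
Step 3 can be scripted as a set comparison. The sketch below takes the file names that should have been published and reports those not found in the table. It works with any DB-API connection. The function name is hypothetical, and the placeholder default of %s assumes a typical MySQL driver.

```python
def missing_file_names(connection, file_names, placeholder="%s",
                       table="photos_info_revamp"):
    """Return the expected File_Name values that are absent from the table."""
    names = list(dict.fromkeys(file_names))  # de-duplicate, keep order
    if not names:
        return []
    marks = ", ".join([placeholder] * len(names))
    cursor = connection.cursor()
    cursor.execute(
        f"SELECT File_Name FROM {table} WHERE File_Name IN ({marks})",
        tuple(names),
    )
    found = {row[0] for row in cursor.fetchall()}
    cursor.close()
    return [name for name in names if name not in found]
```

An empty result means every approved row reached the table. Any name returned is a candidate for a targeted publish retry.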

R3.6 SQLite lock or inconsistency

R3.7 Filename collision issues

R3.8 App crash or forced close (continue safely)

Symptoms: app window closes unexpectedly, or Windows Event Viewer shows native crash events (for example 0xc0000005 / BEX64).

Safe continue path:

  1. Restart the app.
  2. Click Recover crash session next to Clear all.
  3. Verify recovered queue size, then continue with Start Batch.

Recovery file: data/multiset_session.json. Optional backups: data/multiset_session.backup_*.json. Recovery checks both incoming and staged file locations before restore.
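
If recovery does not restore the queue, it may help to check which of these files is readable. The sketch below is read-only and assumes only that the files contain JSON. Their internal structure is not documented on this page, so nothing is interpreted beyond parsing, and the function names are hypothetical.

```python
import json
from pathlib import Path


def recovery_candidates(data_dir="data"):
    """List the session file first, then backups from newest to oldest."""
    base = Path(data_dir)
    primary = base / "multiset_session.json"
    backups = sorted(
        base.glob("multiset_session.backup_*.json"),
        key=lambda path: path.stat().st_mtime,
        reverse=True,
    )
    return ([primary] if primary.exists() else []) + backups


def first_readable(paths):
    """Return the first path that parses as JSON, or None."""
    for path in paths:
        try:
            with open(path, encoding="utf-8") as handle:
                json.load(handle)
            return path
        except (OSError, ValueError):
            continue  # unreadable or corrupt: try the next candidate
    return None
```

Calling first_readable(recovery_candidates()) shows whether the main session file or a backup is the newest intact copy.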

If the crash happened while clicking Add set, inspect data/crash_runtime.log first. For high-volume queue building, you can disable automatic subject generation with AUTO_AI_SUBJECT_ON_SELECT=0 and run AI suggest manually per set.

R3.9 When to escalate to runbook flow

Escalation path: Step 3: Runbook.

R3.10 Keyword context pollution (geo mismatch)

Symptoms: unrelated location keywords appear (for example usa, colorado) in non-USA images.

Checks:

Fix path:

  1. Remove unrelated keywords in the review editor for the current run.
  2. Keep keyword generation location-aware and avoid hardcoded geography pools.
  3. For already-published rows, run a targeted MySQL cleanup by run scope before republish.

For affected publish batches, use the matching run-scoped cleanup SQL script from the automation project root (for example data/mysql_cleanup_exact_run*.sql).
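
To find affected rows during review, location keywords can be compared against the places expected for the set. The sketch below is illustrative only. Both place lists are supplied by the caller for detection, not used to generate keywords, and the function name is hypothetical.

```python
def flag_foreign_geo_keywords(keywords, expected_places, known_places):
    """Return keywords that are known place names but not expected for this set."""
    expected = {place.casefold() for place in expected_places}
    known = {place.casefold() for place in known_places}
    return [
        keyword for keyword in keywords
        if keyword.casefold() in known and keyword.casefold() not in expected
    ]


# Example: a Netherlands set that picked up USA terms.
flagged = flag_foreign_geo_keywords(
    ["canal", "usa", "Colorado", "amsterdam"],
    expected_places=["netherlands", "amsterdam"],
    known_places=["usa", "colorado", "amsterdam", "netherlands"],
)
print(flagged)  # ['usa', 'Colorado']
```

Keywords that are not place names, such as canal, are never flagged, so the check only targets geography.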

R3.10A Generate button issues in review editor

Symptoms: regenerate fails, or regenerated text repeats pending-row captions.

Checks:

Fix path:

  1. Repair any missing dependency or runtime issue first, then retry Generate.
  2. If regenerate still fails, inspect logs/latest_run.log and retry after confirming the primary model/runtime path.
  3. Keep manual review as the final gate; do not publish regenerated text without validation.
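
As an aid to manual review, a regenerated caption can be compared against the captions of other pending rows. The sketch below treats two captions as repeats when they match after lowercasing and removing punctuation and extra whitespace. That definition, and the function names, are assumptions made for illustration.

```python
import re


def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace."""
    without_punctuation = re.sub(r"[^\w\s]", "", text.casefold())
    return re.sub(r"\s+", " ", without_punctuation).strip()


def repeats_pending_caption(candidate, pending_captions):
    """True if the candidate matches any pending caption after normalization."""
    target = normalize(candidate)
    return any(normalize(caption) == target for caption in pending_captions)
```

A True result suggests that the regenerated text was copied from another row and should be regenerated again or rewritten by hand.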

R3.11 Semantic drift (terrain/time-of-day mismatch)

Symptoms: captions/alt/keywords conflict with obvious set context, for example NL/polder images getting mountain terms, invented rocky/ridge terrain in Amsterdam runs, or sunrise filenames getting sunset text.

Checks:

Fix path:

  1. For current queue rows, edit conflicting metadata directly in the review editor and continue.
  2. For recurrent drift, confirm that the current build includes the context guardrails in caption_review_local.py and rebuild the EXE.
  3. Confirm that fallback routing for the affected Folder category is producing category-specific wording (not generic scene placeholders).
  4. Re-run only affected queued rows where needed.
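
The sunrise/sunset case from the symptoms can be detected mechanically. The sketch below compares time-of-day words in the file name with the generated text. The word pairs are an assumed, minimal list, and this is not the guardrail logic in caption_review_local.py.

```python
import re

# Assumed opposites: extend as needed for other time-of-day cues.
OPPOSITES = {
    "sunrise": {"sunset", "dusk"},
    "sunset": {"sunrise", "dawn"},
}


def time_of_day_conflicts(file_name, text):
    """Return words in the text that contradict a time-of-day cue in the file name."""
    name_tokens = set(re.findall(r"[a-z]+", file_name.casefold()))
    text_tokens = set(re.findall(r"[a-z]+", text.casefold()))
    conflicts = []
    for cue, opposing in OPPOSITES.items():
        if cue in name_tokens:
            conflicts.extend(sorted(opposing & text_tokens))
    return conflicts
```

Rows with a non-empty result are the ones to edit in the review editor or to re-run.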

R3.12 Text quality artifacts (hyphen/grammar)

Symptoms: malformed fragments such as City- view, view of along, article-verb starts like An crosses ..., or generic filler lines like scene appears in open terrain / scene view of scene.

Checks:

Fix path:

  1. Update to the latest build with sentence cleanup and malformed-start rejection.
  2. Reject or rewrite generic filler phrases; final output should describe visible evidence only.
  3. Re-run only affected queued rows and validate output quality before publish.
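
The artifact types listed in the symptoms can be screened with a few regular expressions before publish. The patterns below are heuristics written for this page's examples. They are not the cleanup rules of the build itself, and the verb and phrase lists are assumptions that will need tuning.

```python
import re

CHECKS = {
    # "City- view": a hyphen attached to a word and followed by whitespace.
    "dangling_hyphen": re.compile(r"\b\w+-\s+\w"),
    # "view of along": "of" directly followed by another preposition.
    "stacked_prepositions": re.compile(r"\bof\s+(along|across|over|at|in|on)\b", re.IGNORECASE),
    # "An crosses ...": an article directly followed by a verb (assumed verb list).
    "article_verb_start": re.compile(
        r"^\s*(a|an|the)\s+(crosses|stands|sits|appears|is|are)\b", re.IGNORECASE
    ),
    # Generic filler phrases quoted in the symptoms.
    "generic_filler": re.compile(
        r"scene appears in open terrain|scene view of scene", re.IGNORECASE
    ),
}


def text_artifacts(text):
    """Return the names of all checks that the text fails."""
    return [name for name, pattern in CHECKS.items() if pattern.search(text)]
```

An empty list does not prove that the text is good; it only means that none of these known artifacts was found, so manual validation still applies.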

R3.13 Continuation path

© 2026 Amir Darzi