Supporting Reference 4: Logging and Failure Handling

This page defines where failures are recorded and how to interpret logs during operations and incident response.

R4.1 Logging design rule

Failures must be explicit, traceable, and recoverable. The system favors visible stage errors and DB state over silent partial success.
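
One way to picture this rule is a stage wrapper that logs a failure and re-raises it instead of swallowing it. This is a minimal sketch, not the system's actual code: the logger name "pipeline", the StageError type, and the run_stage helper are all hypothetical names introduced for illustration.

```python
import logging

# Hypothetical logger name; the real system may configure logging differently.
logger = logging.getLogger("pipeline")


class StageError(RuntimeError):
    """Hypothetical error type that records which stage failed."""

    def __init__(self, stage, cause):
        super().__init__(f"stage {stage!r} failed: {cause}")
        self.stage = stage


def run_stage(name, func, *args, **kwargs):
    """Run one stage; on failure, log the traceback and re-raise explicitly."""
    try:
        return func(*args, **kwargs)
    except Exception as exc:
        # Explicit and traceable: the full traceback goes to the log.
        logger.exception("stage %s failed", name)
        # No silent partial success: the caller always sees the failure.
        raise StageError(name, exc) from exc
```

Because the original exception is chained with `from exc`, the root cause stays available to whoever handles the StageError.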

R4.2 Primary log surfaces

R4.3 Typical failure domains

R4.4 State-to-log correlation

Correlate log analysis with the queue state in review_queue.
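
As a rough illustration of checking queue state alongside the logs, the sketch below counts review_queue rows per status. It rests on assumptions: this page does not describe the schema or the database engine, so the `status` column is hypothetical and SQLite (via Python's standard sqlite3 module) is only a stand-in for whatever database the system uses.

```python
import sqlite3


def count_by_status(conn):
    """Return {status: row_count} for review_queue.

    Assumption: review_queue has a `status` column. Adjust the column
    name to match the real schema.
    """
    rows = conn.execute(
        "SELECT status, COUNT(*) FROM review_queue "
        "GROUP BY status ORDER BY status"
    ).fetchall()
    return dict(rows)
```

Comparing these counts before and after a failed run gives a quick estimate of how many rows a logged failure touched.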

R4.5 Triage workflow

  1. Identify the first fatal or error entry in logs/latest_run.log.
  2. If the publish path is involved, inspect the first failing row in logs/db_uploader.log.
  3. Query the queue rows to estimate the scope of impact.
  4. Apply a targeted fix and re-run through the standard workflow.
  5. If the partial state is uncertain, follow the rollback-aware runbook path.
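
The first step of the workflow above can be sketched as a small scan for the earliest error-level line. This assumes severity appears in each log line as a whole word such as ERROR or FATAL; the actual log format is not specified on this page, so the pattern and the helper names are hypothetical.

```python
import re
from pathlib import Path

# Assumption: severity is written as a whole word, e.g. "ERROR" or "FATAL".
SEVERITY = re.compile(r"\b(FATAL|ERROR)\b", re.IGNORECASE)


def first_failure(lines):
    """Return (line_number, text) of the first fatal/error line, or None."""
    for number, line in enumerate(lines, start=1):
        if SEVERITY.search(line):
            return number, line.rstrip("\n")
    return None


def first_failure_in_file(path):
    """Scan a log file on disk; returns None if the file does not exist."""
    log_path = Path(path)
    if not log_path.exists():
        return None
    with log_path.open(encoding="utf-8", errors="replace") as handle:
        return first_failure(handle)
```

For example, `first_failure_in_file("logs/latest_run.log")` would return the line number and text of the earliest match, which is usually closer to the root cause than later cascading errors.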

R4.6 Retention and review habits

R4.7 Continuation path

© 2026 Amir Darzi