Step 5: Developer Guide
This chapter closes the core path with implementation guardrails for safe evolution.
The goal is to ship improvements without regressing reliability, review safety, or publish integrity.
5.1 Developer objective
Extend the automation incrementally while preserving deterministic filenames, explicit queue state transitions,
review-first publish behavior, and recoverable failure handling.
5.2 Non-negotiable guardrails
- Keep review_queue as operational source of truth before publish.
- Preserve review-first policy: no direct publish from unreviewed AI output.
- Do not hardcode geography keyword pools; location terms must come from row context and taxonomy.
- Preserve deterministic filename reservation in data/used_filenames.json.
- Do not bypass uploader semantics: publish uses approved rows only.
- Treat File_Name as business key for publish upsert and mirror sync.
- Prefer explicit DB status and logs over implicit folder-based assumptions.
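The review-first and approved-only guardrails above can be illustrated with a minimal sketch. Only review_queue and File_Name come from this guide; the status column and its values (approved, pending, rejected) are assumptions made for the example, and the real schema may differ.

```python
import sqlite3


def select_publishable(conn: sqlite3.Connection) -> list[str]:
    """Return File_Name values for rows a reviewer has explicitly approved."""
    rows = conn.execute(
        # 'status' and 'approved' are hypothetical names for this sketch.
        "SELECT File_Name FROM review_queue WHERE status = ? ORDER BY File_Name",
        ("approved",),
    ).fetchall()
    return [row[0] for row in rows]


# In-memory demo database standing in for the real queue.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE review_queue (File_Name TEXT PRIMARY KEY, status TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO review_queue VALUES (?, ?)",
    [("a.jpg", "approved"), ("b.jpg", "pending"), ("c.jpg", "rejected")],
)

publishable = select_publishable(conn)
print(publishable)
```

Selecting on an explicit status value, rather than on which folder a file sits in, keeps the publish decision visible in the database.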
5.3 Module responsibility map
main_set.py - stage orchestration, set lifecycle, session rollback boundaries, and frozen-runtime helper staging under data/_runtime_scripts.
batch_image_quality_score.py - metric computation and quality-class assignment inputs.
caption_review_local.py - caption/keyword generation, dedupe strategy, weak-output handling.
review_editor.py - human decision layer, row-level edits, approval and rejection semantics.
db_uploader.py - FTP upload, MySQL upsert by File_Name, local mirror synchronization.
init_db.py - SQLite bootstrap from data/init/*.sql.
amir2000_config.py - runtime paths, publish endpoints, credentials, model settings.
5.4 Safe change patterns
A. Add or modify a metadata field
- Update schema in both data/init/*.sql and local docs copies (docs/review_queue.sql, docs/photos_info_revamp.sql) as needed.
- Update producers (stage scripts) and consumers (review editor, uploader, diagnostics).
- Verify DB bootstrap via python .\init_db.py and run a smoke batch.
- Update Step 4 database documentation if semantics changed.
B. Adjust quality scoring behavior
- Change metric/threshold logic in batch_image_quality_score.py.
- Validate impact on QC_Status distribution and review effort.
- Run controlled regression sets before applying to large batches.
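One way to validate the impact on the QC_Status distribution is to score the same regression set before and after the change and compare per-status counts. The sketch below assumes the statuses are available as plain strings; the labels pass, review, and fail are placeholders, not the project's actual QC_Status values.

```python
from collections import Counter


def status_shift(before: list[str], after: list[str]) -> dict[str, int]:
    """Per-status count change (after minus before) for one regression set."""
    b, a = Counter(before), Counter(after)
    # Counter returns 0 for missing keys, so new or vanished statuses are handled.
    return {status: a[status] - b[status] for status in sorted(set(b) | set(a))}


# Hypothetical results for the same four images under old and new thresholds.
baseline = ["pass", "pass", "review", "fail"]
candidate = ["pass", "review", "review", "fail"]

shift = status_shift(baseline, candidate)
print(shift)
```

A shift toward statuses that need manual attention is a rough proxy for added review effort and is worth checking before the change reaches large batches.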
C. Change caption/prefill behavior
- Modify caption_review_local.py with anti-duplication behavior preserved.
- Keep keyword context filters strict (subject/location/folder aligned) to prevent unrelated geo terms.
- Preserve deterministic context guardrails (for example NL lowland anti-mountain filtering and filename
sunrise/sunset time-of-day consistency).
- Confirm fallback behavior for Ollama latency or missing model states.
- Verify improved specificity without overfitting repeated templates.
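The keyword context filter and anti-duplication points above could be sketched as follows. This is not the logic in caption_review_local.py. The function name, the taxonomy set, and the row-context set are all hypothetical, and a real implementation would derive them from the row and the project taxonomy.

```python
def filter_geo_keywords(keywords, geo_taxonomy, row_context):
    """Drop geographic terms that the row's own context does not support.

    keywords:     candidate keywords from the caption stage
    geo_taxonomy: terms known to be geographic
    row_context:  location terms derived from the row (subject/location/folder)
    """
    allowed = {term.lower() for term in row_context}
    geo = {term.lower() for term in geo_taxonomy}
    kept, seen = [], set()
    for keyword in keywords:
        key = keyword.strip().lower()
        if not key or key in seen:
            continue  # skip empty strings and case-insensitive duplicates
        if key in geo and key not in allowed:
            continue  # geographic term not backed by the row context
        seen.add(key)
        kept.append(keyword.strip())
    return kept


result = filter_geo_keywords(
    ["Windmill", "Alps", "canal", "windmill", "Netherlands"],
    geo_taxonomy={"alps", "netherlands"},
    row_context={"Netherlands"},
)
print(result)
```

Non-geographic keywords pass through untouched. Only terms the taxonomy marks as geographic are checked against the row.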
D. Change publish behavior
- Keep approved-only selection logic intact.
- Preserve MySQL upsert-by-File_Name and mirror ID synchronization.
- Re-verify that successful uploads clear queue rows as expected.
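The upsert-by-File_Name behavior can be demonstrated with SQLite as a stand-in for the production MySQL table. The photos table and Caption column are hypothetical. The ON CONFLICT clause used here needs SQLite 3.24 or newer, and the MySQL counterpart would be INSERT ... ON DUPLICATE KEY UPDATE.

```python
import sqlite3

# File_Name is the business key; the numeric id is left untouched on conflict.
UPSERT = """
INSERT INTO photos (File_Name, Caption) VALUES (?, ?)
ON CONFLICT(File_Name) DO UPDATE SET Caption = excluded.Caption
"""


def publish_row(conn: sqlite3.Connection, file_name: str, caption: str) -> None:
    """Insert the row, or update it in place when File_Name already exists."""
    conn.execute(UPSERT, (file_name, caption))
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE photos ("
    "id INTEGER PRIMARY KEY, File_Name TEXT UNIQUE NOT NULL, Caption TEXT)"
)

publish_row(conn, "a.jpg", "first")
publish_row(conn, "a.jpg", "first")   # retry of the same publish
publish_row(conn, "a.jpg", "edited")  # later correction

rows = conn.execute("SELECT id, File_Name, Caption FROM photos").fetchall()
print(rows)
```

Because the existing row is updated rather than replaced, its id stays stable, which is what allows a local mirror to keep matching IDs across retries.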
E. Change frozen EXE subprocess behavior
- Keep helper-script staging via data/_runtime_scripts for stage scripts launched from EXE.
- Preserve subprocess environment scrub logic to avoid interpreter/DLL mismatch.
- Validate stage logs still print scoring runtime interpreter and version.
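As a rough illustration of environment scrubbing for subprocesses, the sketch below removes interpreter-related variables before launching a stage script. The list of variables is an assumption (PYTHONHOME and PYTHONPATH are common causes of interpreter mismatch); the project's actual scrub list may be longer or different, and run_stage is a hypothetical helper.

```python
import os
import subprocess

# Assumed list: variables that can point a child interpreter at the parent's runtime.
SCRUB_VARS = ("PYTHONHOME", "PYTHONPATH")


def scrubbed_env(base=None):
    """Return a copy of the environment without interpreter-specific variables."""
    env = dict(os.environ if base is None else base)
    for name in SCRUB_VARS:
        env.pop(name, None)
    return env


def run_stage(python_exe, script_path):
    """Run a stage script under an explicit interpreter with a scrubbed environment."""
    return subprocess.run(
        [python_exe, script_path],
        env=scrubbed_env(),
        capture_output=True,
        text=True,
        check=False,  # the caller inspects returncode and logs stderr
    )


# Demo on a fake environment so nothing is launched.
demo = scrubbed_env({"PYTHONHOME": "C:/frozen", "PATH": "C:/Windows", "APP_MODE": "lite"})
print(demo)
```

Passing the interpreter path explicitly matters in a frozen build, where sys.executable typically refers to the EXE rather than to a Python interpreter.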
5.5 Data and schema compatibility rules
- Keep runtime schema assumptions aligned with init SQL and documentation copies.
- Prefer additive, backward-compatible changes; avoid destructive migrations.
- Validate behavior with existing DB files before considering reset requirements.
- When reset is unavoidable, document migration/reset impact clearly in runbook notes.
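An additive, backward-compatible change can be applied to an existing SQLite file without a reset. The helper below is a sketch under that assumption; the Notes column is hypothetical, and table and column names are interpolated into SQL, so they must be trusted constants rather than user input.

```python
import sqlite3


def add_column_if_missing(conn, table, column, decl):
    """Add a column only when absent; return True if the schema changed."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE review_queue (File_Name TEXT PRIMARY KEY)")
conn.execute("INSERT INTO review_queue VALUES ('a.jpg')")

first = add_column_if_missing(conn, "review_queue", "Notes", "TEXT DEFAULT ''")
second = add_column_if_missing(conn, "review_queue", "Notes", "TEXT DEFAULT ''")
print(first, second)
```

Running the helper twice is harmless, and existing rows pick up the declared default, so older DB files keep working.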
5.6 Failure, rollback, and idempotency expectations
- Blocking stage failures must not leave irreversible partial state.
- Rollback must continue restoring staged files and releasing reserved filenames when applicable.
- Publish retries should stay idempotent via upsert semantics keyed by File_Name.
- Diagnostic visibility in logs/latest_run.log and logs/db_uploader.log must remain clear.
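The reservation and rollback expectations above might look like the following in miniature. The on-disk layout of used_filenames.json is not described in this chapter, so the sketch assumes a plain JSON list of names; the helper functions are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_reserved(path):
    """Read the reserved names; a missing file means nothing is reserved."""
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def save_reserved(path, names):
    # Sorted output keeps the file deterministic and diff-friendly.
    path.write_text(json.dumps(sorted(names)), encoding="utf-8")


def reserve(path, name):
    """Reserve a filename; raise on collision instead of silently reusing it."""
    names = load_reserved(path)
    if name in names:
        raise ValueError(f"filename already reserved: {name}")
    names.add(name)
    save_reserved(path, names)


def release(path, name):
    """Release a reservation during rollback; safe to call more than once."""
    names = load_reserved(path)
    names.discard(name)
    save_reserved(path, names)


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "used_filenames.json"
    reserve(store, "sunset_001.jpg")
    try:
        raise RuntimeError("simulated stage failure")
    except RuntimeError:
        release(store, "sunset_001.jpg")
        release(store, "sunset_001.jpg")  # repeated rollback is harmless
    remaining = load_reserved(store)

print(remaining)
```

Making release idempotent means a rollback that runs twice, or that is retried after a crash, leaves the store in the same state.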
5.7 Verification checklist before merge
- Run a small end-to-end batch from source (not the EXE), covering import through review and publish.
- Verify approve/reject transitions and filename reservation behavior.
- Validate EXE run on Python 3.13 and confirm no python312.dll conflict during scoring.
- Inject at least one known failure path (for example model missing) and verify recovery.
- Confirm MySQL upsert and local mirror synchronization remain consistent.
- Validate no critical regressions in logs across pipeline and uploader.
5.8 Release and packaging path
Set-Location "\path\to\amir2000_image_automation"
pwsh -NoProfile -ExecutionPolicy Bypass -File .\helpers\preflight_multiset.ps1
pwsh -NoProfile -ExecutionPolicy Bypass -File .\helpers\build_multiset.ps1 -Clean -BuildProfile Lite
- Use sanitized export tooling before sharing public repository snapshots.
- Keep packaging artifacts separate from runtime DB/log state when distributing builds.
- Document model/version assumptions for reproducible operator behavior.
5.9 Anti-patterns to avoid
- Direct manual DB edits in production without matching code/schema updates.
- Bypassing review editor and writing publish rows directly from AI output.
- Resetting or editing used_filenames.json without collision checks.
- Shipping behavior changes without exercising failure and retry scenarios.
5.10 Core path completion
Core documentation is complete at this step (Step 0 to Step 5). Continue with supporting references based on task context.