Step 5: Developer Guide

This chapter closes the core path with implementation guardrails for safe evolution. The goal is to ship improvements without regressing reliability, review safety, or publish integrity.


5.1 Developer objective

Extend the automation incrementally while preserving deterministic filenames, explicit queue state transitions, review-first publish behavior, and recoverable failure handling.

5.2 Non-negotiable guardrails

Keep review_queue as the operational source of truth before publish.
Preserve review-first policy: no direct publish from unreviewed AI output.
Do not hardcode geography keyword pools; location terms must come from row context and taxonomy.
Preserve deterministic filename reservation in data/used_filenames.json.
Do not bypass uploader semantics: publish uses approved rows only.
Treat File_Name as the business key for publish upsert and mirror sync.
Prefer explicit DB status and logs over implicit folder-based assumptions.
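
The filename reservation guardrail can be illustrated with a minimal sketch. It assumes data/used_filenames.json holds a flat JSON list of names and that names follow a base_NNN.jpg pattern; both are assumptions made for illustration, and reserve_filename is a hypothetical helper, not a function from the project.

```python
import json
from pathlib import Path


def reserve_filename(registry_path: Path, base: str, ext: str = ".jpg") -> str:
    """Reserve the first free '<base>_NNN<ext>' name and persist it.

    Assumption: the registry is a flat JSON list of already used names.
    """
    if registry_path.exists():
        used = set(json.loads(registry_path.read_text(encoding="utf-8")))
    else:
        used = set()

    # Walk the counter upward so the same registry state always yields
    # the same next name (deterministic, no randomness or timestamps).
    n = 1
    while f"{base}_{n:03d}{ext}" in used:
        n += 1
    name = f"{base}_{n:03d}{ext}"

    used.add(name)
    # Sorted output keeps the registry stable and easy to diff.
    registry_path.write_text(json.dumps(sorted(used), indent=2), encoding="utf-8")
    return name
```

Because the next name depends only on the registry contents, two runs over the same state reserve the same name, which is what makes collisions detectable.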

5.3 Module responsibility map

main_set.py - stage orchestration, set lifecycle, session rollback boundaries, and frozen-runtime helper staging under data/_runtime_scripts.
batch_image_quality_score.py - metric computation and quality-class assignment inputs.
caption_review_local.py - caption/keyword generation, dedupe strategy, weak-output handling.
review_editor.py - human decision layer, row-level edits, approval and rejection semantics.
db_uploader.py - FTP upload, MySQL upsert by File_Name, local mirror synchronization.
init_db.py - SQLite bootstrap from data/init/*.sql.
amir2000_config.py - runtime paths, publish endpoints, credentials, model settings.

5.4 Safe change patterns

A. Add or modify a metadata field

Update schema in both data/init/*.sql and local docs copies (docs/review_queue.sql, docs/photos_info_revamp.sql) as needed.
Update producers (stage scripts) and consumers (review editor, uploader, diagnostics).
Verify DB bootstrap via python .\init_db.py and run a smoke batch.
Update Step 4 database documentation if semantics changed.
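
For an existing SQLite file, an additive field change might look like the sketch below. add_column_if_missing is a hypothetical helper, and the Lens_Model column is an invented example; the real change still belongs in data/init/*.sql and the docs copies as described above.

```python
import sqlite3


def add_column_if_missing(conn: sqlite3.Connection, table: str, column: str, decl: str) -> bool:
    """Add a column only when it is absent; return True if it was added.

    table, column and decl are interpolated into SQL, so they must be
    trusted constants from code, never user input.
    """
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
    return True


# Usage against a throwaway in-memory database (columns are illustrative).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE review_queue (File_Name TEXT PRIMARY KEY, Caption TEXT)")
first = add_column_if_missing(conn, "review_queue", "Lens_Model", "TEXT")
second = add_column_if_missing(conn, "review_queue", "Lens_Model", "TEXT")
```

Running the helper twice is harmless, which keeps the change safe to apply on databases that have already been upgraded.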

B. Adjust quality scoring behavior

Change metric/threshold logic in batch_image_quality_score.py.
Validate impact on QC_Status distribution and review effort.
Run controlled regression sets before applying to large batches.
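
One way to check the impact on the QC_Status distribution is to score the same regression set before and after the change and compare counts. The sketch below assumes the statuses are available as plain lists of strings; the labels "pass" and "review" are placeholders, not the project's actual values.

```python
from collections import Counter


def status_shift(before: list[str], after: list[str]) -> dict[str, int]:
    """Return the per-status count change (after minus before), omitting zeros."""
    b, a = Counter(before), Counter(after)
    return {
        status: a[status] - b[status]
        for status in sorted(set(b) | set(a))
        if a[status] != b[status]
    }


# Same three images scored under the old and the new thresholds.
shift = status_shift(["pass", "pass", "review"], ["pass", "review", "review"])
```

A shift toward statuses that need manual attention is a direct estimate of the extra review effort the new thresholds would create.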

C. Change caption/prefill behavior

Modify caption_review_local.py with anti-duplication behavior preserved.
Keep keyword context filters strict (subject/location/folder aligned) to prevent unrelated geo terms.
Preserve deterministic context guardrails (for example NL lowland anti-mountain filtering and filename sunrise/sunset time-of-day consistency).
Confirm fallback behavior for Ollama latency or missing model states.
Verify that captions become more specific without collapsing into repeated templates.
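
The fallback expectation for slow or unavailable models can be sketched as a thin wrapper. Everything here is hypothetical: caption_with_fallback, template_caption, and the row keys subject and location are illustrative names, and the real logic in caption_review_local.py may be structured differently.

```python
def caption_with_fallback(generate, row, fallback):
    """Return (caption, source), where source is 'model' or 'fallback'.

    generate and fallback are callables that take the row dict.
    """
    try:
        text = generate(row)
    except OSError:
        # TimeoutError and ConnectionError are both OSError subclasses.
        text = ""
    text = (text or "").strip()
    if text:
        return text, "model"
    # Empty or whitespace-only output is treated as weak output.
    return fallback(row), "fallback"


def template_caption(row):
    """Deterministic caption built only from row context."""
    return f"{row['subject']} in {row['location']}"
```

Returning the source alongside the caption lets the review stage see which rows were filled by the fallback and may need closer attention.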

D. Change publish behavior

Keep approved-only selection logic intact.
Preserve MySQL upsert-by-File_Name and mirror ID synchronization.
Re-verify that successful uploads clear queue rows as expected.
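
The approved-only and upsert rules can be illustrated together. This sketch uses two in-memory SQLite databases as stand-ins; the real uploader targets MySQL, where the comparable statement is INSERT ... ON DUPLICATE KEY UPDATE. The photos table, the Status column, and the value 'approved' are assumptions made for the example.

```python
import sqlite3

# SQLite spelling of an upsert keyed by File_Name (stand-in for MySQL).
UPSERT_SQL = """
INSERT INTO photos (File_Name, Caption) VALUES (?, ?)
ON CONFLICT(File_Name) DO UPDATE SET Caption = excluded.Caption
"""


def publish_approved(queue: sqlite3.Connection, target: sqlite3.Connection) -> int:
    """Upsert approved queue rows into the target and clear them from the queue."""
    rows = queue.execute(
        "SELECT File_Name, Caption FROM review_queue WHERE Status = 'approved'"
    ).fetchall()
    for file_name, caption in rows:
        target.execute(UPSERT_SQL, (file_name, caption))
        # Clear the queue row only after its upsert has succeeded.
        queue.execute("DELETE FROM review_queue WHERE File_Name = ?", (file_name,))
    return len(rows)
```

Publishing the same File_Name twice updates the existing row instead of creating a duplicate, so a retry after a partial failure is safe.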

E. Change frozen EXE subprocess behavior

Keep helper-script staging via data/_runtime_scripts for stage scripts launched from EXE.
Preserve subprocess environment scrub logic to avoid interpreter/DLL mismatch.
Validate stage logs still print scoring runtime interpreter and version.
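
As a rough illustration of environment scrubbing, the sketch below builds a child environment without interpreter overrides and without the bundle directory on PATH. The variable list and the scrubbed_env helper are assumptions, not the logic in main_set.py; which variables actually need removal depends on how the EXE is packaged.

```python
import os

# Variables that can point a child interpreter at the wrong runtime (assumed list).
SCRUB_VARS = ("PYTHONHOME", "PYTHONPATH")


def scrubbed_env(base_env, bundle_dir):
    """Copy base_env without interpreter overrides and without bundle_dir on PATH."""
    env = {k: v for k, v in base_env.items() if k.upper() not in SCRUB_VARS}
    bundle = os.path.normcase(os.path.abspath(bundle_dir))
    kept = [
        entry
        for entry in env.get("PATH", "").split(os.pathsep)
        if entry and os.path.normcase(os.path.abspath(entry)) != bundle
    ]
    env["PATH"] = os.pathsep.join(kept)
    return env
```

The result would be passed as the env argument of subprocess.run when launching a stage script. If the EXE is built with PyInstaller, the bundle directory would typically be sys._MEIPASS, but the packager is not stated here.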

5.5 Data and schema compatibility rules

Keep runtime schema assumptions aligned with init SQL and documentation copies.
Prefer additive, backward-compatible changes; avoid destructive migrations.
Validate behavior with existing DB files before considering reset requirements.
When reset is unavoidable, document migration/reset impact clearly in runbook notes.
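
Before deciding that a reset is required, it can help to list what an existing DB file lacks relative to the init SQL. The following sketch builds a fresh in-memory database from the init script and compares column sets; missing_columns is a hypothetical helper and the schema in the usage lines is invented for the example.

```python
import sqlite3


def table_columns(conn: sqlite3.Connection, table: str) -> set[str]:
    """Column names of a table, read via PRAGMA table_info."""
    return {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}


def missing_columns(init_sql: str, existing: sqlite3.Connection, table: str) -> set[str]:
    """Columns the init SQL defines for `table` that the existing DB lacks."""
    fresh = sqlite3.connect(":memory:")
    try:
        fresh.executescript(init_sql)
        return table_columns(fresh, table) - table_columns(existing, table)
    finally:
        fresh.close()


# Usage with an invented schema: the old DB predates the QC_Status column.
init_sql = "CREATE TABLE review_queue (File_Name TEXT PRIMARY KEY, Caption TEXT, QC_Status TEXT);"
old_db = sqlite3.connect(":memory:")
old_db.execute("CREATE TABLE review_queue (File_Name TEXT PRIMARY KEY, Caption TEXT)")
gap = missing_columns(init_sql, old_db, "review_queue")
```

If the gap contains only new columns, an additive ALTER TABLE is usually enough and no reset is needed.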

5.6 Failure, rollback, and idempotency expectations

Blocking stage failures must not leave irreversible partial state.
Rollback must continue to restore staged files and release reserved filenames where applicable.
Publish retries should stay idempotent via upsert semantics keyed by File_Name.
Diagnostic visibility in logs/latest_run.log and logs/db_uploader.log must remain clear.
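
The rollback expectation can be sketched as a small transaction object that records what a stage did and undoes it on failure. StageTransaction and stage are hypothetical names, and the in-memory set stands in for the filename registry; the real rollback boundaries live in main_set.py.

```python
import shutil
from contextlib import contextmanager
from pathlib import Path


class StageTransaction:
    """Records file moves and filename reservations so they can be undone."""

    def __init__(self, reserved: set):
        self.reserved = reserved
        self._moves = []
        self._names = []

    def move(self, src: Path, dst: Path) -> None:
        shutil.move(str(src), str(dst))
        self._moves.append((src, dst))

    def reserve(self, name: str) -> None:
        if name in self.reserved:
            raise ValueError(f"filename already reserved: {name}")
        self.reserved.add(name)
        self._names.append(name)

    def rollback(self) -> None:
        # Undo moves in reverse order, then release this run's reservations.
        for src, dst in reversed(self._moves):
            if dst.exists():
                shutil.move(str(dst), str(src))
        for name in self._names:
            self.reserved.discard(name)
        self._moves.clear()
        self._names.clear()


@contextmanager
def stage(reserved: set):
    """Roll back recorded actions if the stage body raises, then re-raise."""
    tx = StageTransaction(reserved)
    try:
        yield tx
    except Exception:
        tx.rollback()
        raise
```

Only names reserved by the failing run are released, so reservations made by earlier successful runs stay intact.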

5.7 Verification checklist before merge

Run a small end-to-end batch from source (not the EXE), from import through review to publish.
Verify approve/reject transitions and filename reservation behavior.
Validate an EXE run on Python 3.13 and confirm there is no python312.dll conflict during scoring.
Inject at least one known failure path (for example, a missing model) and verify recovery.
Confirm MySQL upsert and local mirror synchronization remain consistent.
Check logs/latest_run.log and logs/db_uploader.log for critical regressions in the pipeline and the uploader.

5.8 Release and packaging path

From the project root, run the preflight script and then the build script:

Set-Location "\path\to\amir2000_image_automation"
pwsh -NoProfile -ExecutionPolicy Bypass -File .\helpers\preflight_multiset.ps1
pwsh -NoProfile -ExecutionPolicy Bypass -File .\helpers\build_multiset.ps1 -Clean -BuildProfile Lite

Use sanitized export tooling before sharing public repository snapshots.
Keep packaging artifacts separate from runtime DB/log state when distributing builds.
Document model/version assumptions for reproducible operator behavior.

5.9 Anti-patterns to avoid

Direct manual DB edits in production without matching code/schema updates.
Bypassing review editor and writing publish rows directly from AI output.
Resetting or editing used_filenames.json without collision checks.
Shipping behavior changes without exercising failure and retry scenarios.

5.10 Core path completion

Core documentation is complete at this step (Step 0 to Step 5). Continue with supporting references based on task context.
