Step 1: Purpose and Scope
This chapter is the formal continuation of the case study and the first page after Step 0 in the core documentation path.
The case study explains the story; this page defines the operating intent, boundaries, and acceptance criteria.
1.1 Operational baseline from the case study
The documented baseline is approximately 32,000 images handled before automation and close to 10,000 images processed
through automation since mid-2025. The practical goal is not only speed; it is repeatable quality with safer failure recovery.
- Manual naming was often generic and inconsistent from one subject to the next.
- Manual corrections were expensive because multiple files and fields had to be edited by hand.
- Batch processing existed, but quality control and rollback behavior were weak.
- The automation target is reliable 100-image uploads with cleaner naming, captions, and keywords.
1.2 Purpose statement
Amir2000 Image Automation exists to convert raw image sets into reviewed, publish-ready assets with deterministic naming,
structured metadata, and explicit lifecycle states. It is a local-first production workflow with human approval as a hard gate.
1.3 What is in scope
- Import one or more operator-selected sets into a staged processing run.
- Extract and normalize technical metadata in a repeatable pipeline.
- Create and maintain queue records in data/review.db (table review_queue).
- Generate local AI prefill for caption, alt text, and keywords using Ollama.
- Apply deterministic filename generation with duplicate prevention controls.
- Require human review decisions before any publish action.
- Publish approved rows through FTP upload, MySQL upsert, and local mirror sync.
- Provide rollback and cleanup behavior when a run fails before completion.
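The "deterministic filename generation with duplicate prevention" item above can be illustrated with a minimal sketch. It is not the project's actual naming code: the function names slugify and reserve_filename, the subject/location inputs, and the numeric-suffix rule are all assumptions made for illustration. The point is only that the same inputs and the same set of already-used names always yield the same result.

```python
import re


def slugify(text: str) -> str:
    """Lowercase the text and collapse every run of non-alphanumerics into one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def reserve_filename(subject: str, location: str, used: set[str], ext: str = ".jpg") -> str:
    """Build a name from subject and location (hypothetical inputs) and reserve it.

    If the plain name is already in `used`, append -2, -3, ... until a free
    name is found. The chosen name is added to `used` before returning.
    """
    parts = [p for p in (slugify(subject), slugify(location)) if p]
    base = "-".join(parts) or "image"
    candidate = f"{base}{ext}"
    counter = 2
    while candidate in used:
        candidate = f"{base}-{counter}{ext}"
        counter += 1
    used.add(candidate)
    return candidate


used_names: set[str] = set()
print(reserve_filename("Grey Heron", "Lake Shore", used_names))  # grey-heron-lake-shore.jpg
print(reserve_filename("Grey Heron", "Lake Shore", used_names))  # grey-heron-lake-shore-2.jpg
```

Keeping the used-name set as an explicit argument makes the function easy to test and leaves persistence to a separate layer.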
1.4 What is intentionally out of scope
- Multi-user collaborative DAM features and role-based workflow orchestration.
- Fully autonomous publish with no human quality gate.
- Cloud-only inference dependency for core processing.
- Replacing the existing website information model outside required publish integration points.
1.5 System boundaries and dependencies
Internal responsibility:
- Pipeline orchestration, intermediate artifacts, decision-state tracking, and publish handoff logic.
- Filename reservation control via data/used_filenames.json.
- Operational and diagnostic logs such as logs/latest_run.log and logs/db_uploader.log.
External dependencies:
- Ollama model availability for caption/keyword prefill quality.
- FTP connectivity and credential validity for remote file publish.
- MySQL connectivity and table contract compatibility for metadata upsert.
Boundary rule: pre-publish stages run locally; remote systems are touched only in the approved publish path.
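Filename reservation is one of the purely local responsibilities listed in this section. The sketch below shows one way a reservation file could be read and updated. The on-disk layout of data/used_filenames.json is not described here, so a flat JSON list of names is assumed, and the helper names (load_reserved, save_reserved, reserve, release) are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def load_reserved(path: Path) -> set[str]:
    """Return the reserved names, or an empty set if the file does not exist yet."""
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def save_reserved(path: Path, names: set[str]) -> None:
    """Write the names via a temp file in the same directory, then os.replace.

    The final rename is atomic, so a crash mid-write leaves the old file intact.
    """
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(sorted(names), handle, indent=2)
    os.replace(tmp_name, path)


def reserve(path: Path, name: str) -> bool:
    """Reserve a name; return False (and change nothing) if it is already taken."""
    names = load_reserved(path)
    if name in names:
        return False
    names.add(name)
    save_reserved(path, names)
    return True


def release(path: Path, name: str) -> None:
    """Give a name back, for example after a failed run, so a retry can reuse it."""
    names = load_reserved(path)
    names.discard(name)
    save_reserved(path, names)
```

A call such as reserve(Path("data/used_filenames.json"), "example.jpg") touches only the local file system, which is consistent with the boundary rule that pre-publish stages never reach remote systems.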
1.6 Primary outputs and ownership model
- Queue truth: staged and reviewed rows in data/review.db, table review_queue.
- Model staging artifacts: temporary resized inputs under data/ollama_tmp/run_*.
- Published remote assets: approved image files written to target FTP paths.
- Published metadata: upserted records in MySQL table photos_info_revamp.
- Local publish mirror: synchronized IDs and records in data/photos_info_revamp.db.
Ownership principle: database row state is the source of truth for lifecycle decisions, not ad-hoc file inspection.
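To make the ownership principle concrete, here is a minimal sketch of a queue table whose status column alone decides what may be published. The real review_queue schema is not documented on this page. The columns (id, File_Name, status), the status values, and the transition map below are assumptions chosen for illustration.

```python
import sqlite3

# Hypothetical lifecycle: which status may follow which.
ALLOWED = {
    "staged": {"approved", "rejected"},
    "approved": {"published"},
    "rejected": set(),
    "published": set(),
}


def open_queue(path: str = ":memory:") -> sqlite3.Connection:
    """Open the queue database and create an illustrative review_queue table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS review_queue ("
        " id INTEGER PRIMARY KEY,"
        " File_Name TEXT NOT NULL UNIQUE,"
        " status TEXT NOT NULL DEFAULT 'staged')"
    )
    return conn


def set_status(conn: sqlite3.Connection, row_id: int, new_status: str) -> None:
    """Move a row to a new status, refusing transitions outside ALLOWED."""
    row = conn.execute(
        "SELECT status FROM review_queue WHERE id = ?", (row_id,)
    ).fetchone()
    if row is None:
        raise KeyError(row_id)
    if new_status not in ALLOWED[row[0]]:
        raise ValueError(f"illegal transition {row[0]} -> {new_status}")
    conn.execute(
        "UPDATE review_queue SET status = ? WHERE id = ?", (new_status, row_id)
    )
    conn.commit()


def rows_to_publish(conn: sqlite3.Connection) -> list[str]:
    """Return only approved file names; the publish step would read nothing else."""
    cursor = conn.execute(
        "SELECT File_Name FROM review_queue WHERE status = 'approved' ORDER BY id"
    )
    return [name for (name,) in cursor]
```

In this sketch the UNIQUE constraint on File_Name rejects a duplicate name at insert time, and a rejected row can never reach the published state because no transition leads there.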
1.7 Definition of success (acceptance criteria)
- Each run is traceable by log output and queue state transitions from import to review to publish.
- No duplicate File_Name values are introduced across successful runs.
- Approved-only publish is enforced; rejected rows are not uploaded or upserted.
- When blocking errors occur, rollback restores files and releases reserved names for safe retry.
- Output names and captions are more specific and cleaner than the generic patterns of the manual baseline.
- Operationally, high-volume batches are manageable with less manual correction overhead.
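The rollback criterion above (restore files, release reserved names) can be sketched as a context manager. This is an illustrative pattern, not the project's implementation: staged_run, its parameters, and the in-memory reserved set are all hypothetical stand-ins.

```python
import shutil
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def staged_run(files, staging_dir, reserved, new_names):
    """Move files into staging under new names; undo everything on any error.

    `reserved` is a plain set standing in for the real reservation store.
    """
    staging_dir = Path(staging_dir)
    staging_dir.mkdir(parents=True, exist_ok=True)
    moved = []  # (original_path, staged_path) pairs, in the order they were moved
    added = []  # names reserved by this run only
    try:
        for src, name in zip(files, new_names):
            if name in reserved:
                raise ValueError(f"name already reserved: {name}")
            reserved.add(name)
            added.append(name)
            dst = staging_dir / name
            shutil.move(str(src), str(dst))
            moved.append((Path(src), dst))
        yield [dst for _, dst in moved]
    except Exception:
        # Rollback: put files back in reverse order, then release this run's names.
        for src, dst in reversed(moved):
            if dst.exists():
                shutil.move(str(dst), str(src))
        for name in added:
            reserved.discard(name)
        raise
```

Any exception raised inside the with block, or while staging the files, triggers the same cleanup path and is then re-raised, so the caller still sees the original error and can retry with the same names.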
1.8 Risk posture at this scope
- Human review is a deliberate throughput tradeoff to reduce publish risk.
- Local inference quality varies by hardware and model selection, so output is assistive, not authoritative.
- Publish reliability depends on external FTP/MySQL availability and contract stability.