Step 1: Purpose and Scope

This chapter is the formal continuation of the case study and the first page after Step 0 in the core documentation path. The case study explains the story; this page defines the operating intent, boundaries, and acceptance criteria.


1.1 Operational baseline from the case study

The documented baseline is approximately 32,000 images handled before automation and close to 10,000 images processed through automation since mid-2025. The practical goal is not only speed; it is repeatable quality with safer failure recovery.

Manual naming was often generic and inconsistent from one subject context to the next.
Manual corrections were expensive because multiple files and fields had to be edited by hand.
Batch processing existed, but quality control and rollback behavior were weak.
The automation target is reliable 100-image uploads with cleaner naming, captions, and keywords.

1.2 Purpose statement

Amir2000 Image Automation exists to convert raw image sets into reviewed, publish-ready assets with deterministic naming, structured metadata, and explicit lifecycle states. It is a local-first production workflow with human approval as a hard gate.
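As a rough illustration of "explicit lifecycle states" with approval as a hard gate, the sketch below models row states and their allowed transitions. The state names are hypothetical and are not taken from the project; the point is only that a published state is reachable from an approved state and from nowhere else.

```python
from enum import Enum


class RowState(Enum):
    # Hypothetical state names; the real values stored in review_queue may differ.
    STAGED = "staged"
    PENDING_REVIEW = "pending_review"
    APPROVED = "approved"
    REJECTED = "rejected"
    PUBLISHED = "published"


# Allowed transitions. PUBLISHED is reachable only from APPROVED,
# which is what makes human approval a hard gate in this sketch.
ALLOWED = {
    RowState.STAGED: {RowState.PENDING_REVIEW},
    RowState.PENDING_REVIEW: {RowState.APPROVED, RowState.REJECTED},
    RowState.APPROVED: {RowState.PUBLISHED},
    RowState.REJECTED: set(),
    RowState.PUBLISHED: set(),
}


def transition(current: RowState, target: RowState) -> RowState:
    """Return target if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Under these assumptions, an attempt to move a row from pending review straight to published raises an error instead of silently publishing.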

1.3 What is in scope

Import one or more operator-selected sets into a staged processing run.
Extract and normalize technical metadata in a repeatable pipeline.
Create and maintain queue records in data/review.db (review_queue).
Generate local AI prefill for caption, alt text, and keywords using Ollama.
Apply deterministic filename generation with duplicate prevention controls.
Require human review decisions before any publish action.
Publish approved rows through FTP upload, MySQL upsert, and local mirror sync.
Provide rollback and cleanup behavior when a run fails before completion.
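The local AI prefill item above could look roughly like the following sketch. It assumes an Ollama server on its default local port and uses the `/api/generate` endpoint with a base64-encoded image. The model name `llava`, the prompt wording, and both function names are illustrative assumptions, not the project's actual configuration.

```python
import base64
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_prefill_request(image_bytes: bytes, model: str = "llava") -> dict:
    """Build a non-streaming generate request that carries one image."""
    prompt = (
        "Describe this photo. Reply as JSON with the keys "
        "caption, alt_text, and keywords (a list of strings)."
    )
    return {
        "model": model,  # assumption: any locally pulled vision-capable model
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "format": "json",
        "stream": False,
    }


def request_prefill(image_bytes: bytes, model: str = "llava") -> dict:
    """Send the request to a running Ollama server and parse the model's JSON reply."""
    body = json.dumps(build_prefill_request(image_bytes, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        reply = json.loads(resp.read())
    # The generated text is in the "response" field; it is a suggestion for the reviewer only.
    return json.loads(reply["response"])
```

Whatever the model returns is only a prefill: it lands in the queue for human review and is never published directly.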

1.4 What is intentionally out of scope

Multi-user collaborative DAM features and role-based workflow orchestration.
Fully autonomous publish with no human quality gate.
Cloud-only inference dependency for core processing.
Replacing the existing website information model outside required publish integration points.

1.5 System boundaries and dependencies

Internal responsibility:

Pipeline orchestration, intermediate artifacts, decision-state tracking, and publish handoff logic.
Filename reservation control via data/used_filenames.json.
Operational and diagnostic logs such as logs/latest_run.log and logs/db_uploader.log.
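A minimal sketch of filename reservation follows. It assumes the reservation file is a flat JSON list of names and that names follow a `base_NNN.jpg` pattern. Both are assumptions for illustration; the real layout of data/used_filenames.json and the real naming scheme may differ, and the function names are hypothetical.

```python
import json
from pathlib import Path


def load_reserved(path: Path) -> set[str]:
    """Read the reserved names; a missing file means nothing is reserved yet."""
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def save_reserved(path: Path, names: set[str]) -> None:
    # Sorted output keeps the file stable between runs and easy to diff.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(sorted(names), indent=2), encoding="utf-8")


def reserve(path: Path, base: str, ext: str = ".jpg") -> str:
    """Reserve the first free name in the sequence base_001, base_002, ..."""
    names = load_reserved(path)
    n = 1
    while f"{base}_{n:03d}{ext}" in names:
        n += 1
    name = f"{base}_{n:03d}{ext}"
    names.add(name)
    save_reserved(path, names)
    return name


def release(path: Path, name: str) -> None:
    """Give a name back, for example when a failed run is rolled back."""
    names = load_reserved(path)
    names.discard(name)
    save_reserved(path, names)
```

Because the same inputs always yield the first free slot, the result is deterministic for a given reservation file, and releasing a name makes it available again for a retry.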

External dependencies:

Ollama model availability for caption/keyword prefill quality.
FTP connectivity and credential validity for remote file publish.
MySQL connectivity and table contract compatibility for metadata upsert.

Boundary rule: pre-publish stages run locally; remote systems are touched only in the approved publish path.

1.6 Primary outputs and ownership model

Queue truth: staged and reviewed rows in data/review.db table review_queue.
Model staging artifacts: temporary resized inputs under data/ollama_tmp/run_*.
Published remote assets: approved image files written to target FTP paths.
Published metadata: upserted records in MySQL table photos_info_revamp.
Local publish mirror: synchronized IDs and records in data/photos_info_revamp.db.

Ownership principle: database row state is the source of truth for lifecycle decisions, not ad-hoc file inspection.
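To make the ownership principle concrete, the sketch below derives the set of publishable files from row state alone, without looking at the file system. Only the table name review_queue comes from this page; the column names, the status values, and the helper names are hypothetical.

```python
import sqlite3


def open_queue(db_path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(db_path)
    # Hypothetical columns; only the table name review_queue comes from this page.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS review_queue (
            id INTEGER PRIMARY KEY,
            file_name TEXT NOT NULL UNIQUE,
            status TEXT NOT NULL DEFAULT 'pending_review'
        )
        """
    )
    return conn


def publishable_rows(conn: sqlite3.Connection) -> list[str]:
    """Return file names whose row state is 'approved', in insertion order."""
    rows = conn.execute(
        "SELECT file_name FROM review_queue WHERE status = ? ORDER BY id",
        ("approved",),
    )
    return [r[0] for r in rows]
```

In this sketch a file that exists on disk but has no approved row is simply invisible to the publish step, which is the behavior the principle asks for.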

1.7 Definition of success (acceptance criteria)

Each run is traceable by log output and queue state transitions from import to review to publish.
No duplicate File_Name collisions are introduced across successful runs.
Approved-only publish is enforced; rejected rows are not uploaded or upserted.
When blocking errors occur, rollback restores files and releases reserved names for safe retry.
Output names and captions are more specific and cleaner than the generic patterns of the manual baseline.
High-volume batches, such as the 100-image uploads targeted in 1.1, are manageable with less manual correction overhead than before automation.
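The rollback criterion above can be sketched as an undo journal: each step records how to reverse itself, and a blocking error replays those actions in reverse order. This is an illustrative pattern under stated assumptions, not the project's implementation; the class name `RunJournal`, its methods, and the in-memory set of reserved names are all hypothetical.

```python
import shutil
from pathlib import Path


class RunJournal:
    """Collects undo actions during a run and replays them in reverse on failure."""

    def __init__(self) -> None:
        self._undo = []

    def move_file(self, src: Path, dst: Path) -> None:
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dst))
        # Undo: move the file back to where it came from.
        self._undo.append(lambda: shutil.move(str(dst), str(src)))

    def reserve_name(self, reserved: set, name: str) -> None:
        if name in reserved:
            raise ValueError(f"duplicate file name: {name}")
        reserved.add(name)
        # Undo: release the name so a retry can claim it again.
        self._undo.append(lambda: reserved.discard(name))

    def __enter__(self) -> "RunJournal":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        if exc_type is not None:
            for undo in reversed(self._undo):
                undo()
        return False  # re-raise so the failure is still visible to the caller
```

Used as `with RunJournal() as run: ...`, any exception raised inside the block restores moved files and releases reserved names before the error propagates, so the same set can be retried safely.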

1.8 Risk posture at this scope

Human review is a deliberate throughput tradeoff to reduce publish risk.
Local inference quality varies by hardware and model selection, so output is assistive, not authoritative.
Publish reliability depends on external FTP/MySQL availability and contract stability.

1.9 Continuation path

Step 1 defines why the system exists and where it is bounded. Step 2 documents how it executes at runtime.
