Step 2: Workflow
Define how automation executes in production, from set validation through review and publish.
Step 1 defined purpose and boundaries; Step 2 defines runtime behavior.
2.1 End-to-end execution model
Run a staged desktop pipeline orchestrated in main_set.py. The design lets operators process
high-volume sets while maintaining strict publish control through a mandatory review gate.
- Prepare files and metadata in stages 1 to 7 so rows enter review-ready state.
- Open the review editor in stage 8 so human decisions can be applied.
- Publish only approved rows through the FTP + MySQL sync path.
- Trigger rollback on blocking failures so runs can be retried safely.
Workflow figure: high-level stage flow from import through publish.
2.2 Stage sequence in main_set.py
- Validate sets - checks selected set inputs before any write operations.
- Prepare DB and copy to incoming - establishes working context and staged local inputs.
- Extract EXIF and initial metadata - seeds technical metadata used downstream.
- Insert or refresh review rows - creates or updates queue rows in review_queue.
- AI quality scoring - computes quality signals for later review decisions and persists all score fields (including brisque_score and clip_aesthetic_score) with deterministic fallback values when needed.
- Resize images for Ollama - writes temporary model-ready files and tracks ollama_path.
- Caption and keywords prefill - uses a primary model with optional row-level fallback retries (when a fallback is configured) and classifier-assisted nature hints (when enabled). Guardrails include wildlife species-aware consistency checks that do not rely on a fixed species list, strict duplicate controls, evidence-first keyword rules, filename/time-of-day consistency checks, NL lowland anti-mountain filtering, malformed phrase rejection (for example generic "open terrain" or "scene view of scene" artifacts), and sentence cleanup for readable output.
- Open review editor - transitions into manual review and approval workflow.
Startup runtime check in the stage log: model processor mode, context, and VRAM are printed before execution continues.
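The stage sequence above can be pictured as an ordered registry that a runner walks from stage 1 to stage 8. The following is only a minimal sketch under stated assumptions: the stage names come from this section, but make_stage, run_pipeline, BlockingError, and the ctx dictionary are hypothetical names used for illustration and are not taken from main_set.py.

```python
from typing import Callable, Dict, List, Optional, Tuple

Stage = Tuple[str, Callable[[Dict], None]]


class BlockingError(RuntimeError):
    """Hypothetical error type for failures that must stop the run."""


def make_stage(name: str) -> Stage:
    # Placeholder stage body: it only records that the stage ran.
    def run(ctx: Dict) -> None:
        ctx.setdefault("completed", []).append(name)
    return (name, run)


# Stage names follow section 2.2; the real stage bodies are not shown here.
STAGES: List[Stage] = [make_stage(n) for n in (
    "Validate sets",
    "Prepare DB and copy to incoming",
    "Extract EXIF and initial metadata",
    "Insert or refresh review rows",
    "AI quality scoring",
    "Resize images for Ollama",
    "Caption and keywords prefill",
    "Open review editor",
)]


def run_pipeline(ctx: Dict,
                 stages: List[Stage] = STAGES,
                 rollback: Optional[Callable[[Dict], None]] = None) -> Dict:
    """Run stages in order; on a blocking failure, roll back and re-raise."""
    for number, (name, run) in enumerate(stages, start=1):
        print(f"[INFO] Stage {number}/{len(stages)}: {name}")
        try:
            run(ctx)
        except BlockingError:
            if rollback is not None:
                rollback(ctx)
            raise
    return ctx
```

Keeping the stages in one ordered list makes the review editor the last step by construction, so nothing downstream of it can run before a human has seen the rows.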
2.3 Runtime data flow
- Normalize source files into staging paths for deterministic processing.
- Store operational source-of-truth records in data/review.db, table review_queue.
- Stage helper scripts into data/_runtime_scripts during EXE runs to avoid frozen-runtime DLL conflicts.
- Read ollama_path (falling back to Path) per row during caption generation.
- Run caption prefill with row-level retry semantics: the optional fallback model is used only when it is configured and the primary model fails on that row.
- Run a post-prefill QC scan for duplicates/suspicious text and write report output to data/prefill_qc_last.json before the review editor opens.
- Use Stage 6 watchdog guards: idle and hard timeouts terminate stuck child workers, retries are automatic, and rows that repeatedly cause native crashes are quarantined with Review_Status=Error and QC_Status=PrefillNativeCrash so the batch can continue.
- Enforce evidence-only keyword behavior for sensitive geography and terrain terms (no forced usa/colorado or filler terrain terms without context).
- Apply context guardrails to keep metadata aligned with set cues: NL/polder runs suppress mountain-alpine drift, and filename cues such as sunrise/sunset constrain time-of-day wording.
- Use folder-aware fallback templates (for example people/cityscape/macro/night/aviation) so weak model rows still produce reviewable text aligned with set category context.
- Apply sentence normalization and style rejection to reduce malformed phrasing (for example broken hyphen compounds or malformed starts in generated caption/alt text).
- Write final text fields, quality decisions, and publish intent from editor updates.
- Sync approved rows to remote assets and DB while updating local mirror state.
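The row-level retry and quarantine behavior described in this list could look roughly like the sketch below. It is an illustration under assumptions, not the pipeline's code: NativeCrash, caption_with_fallback, prefill_batch, the Caption field, and max_crashes are hypothetical, and the timeout side of the watchdog is left out. Only the quarantine values Review_Status=Error and QC_Status=PrefillNativeCrash come from the text above.

```python
from typing import Callable, Dict, List, Optional

CaptionFn = Callable[[Dict], str]


class NativeCrash(RuntimeError):
    """Hypothetical signal that the child worker died on this row."""


def caption_with_fallback(row: Dict, primary: CaptionFn,
                          fallback: Optional[CaptionFn] = None) -> str:
    """Try the primary model; use the fallback only if configured and needed."""
    try:
        return primary(row)
    except NativeCrash:
        raise  # assumption: crashes are retried by the batch loop, not the fallback
    except Exception:
        if fallback is None:
            raise  # no fallback configured: the failure propagates to the caller
        return fallback(row)


def prefill_batch(rows: List[Dict], primary: CaptionFn,
                  fallback: Optional[CaptionFn] = None,
                  max_crashes: int = 2) -> None:
    """Prefill each row in place; quarantine rows that keep crashing."""
    for row in rows:
        crashes = 0
        while True:
            try:
                row["Caption"] = caption_with_fallback(row, primary, fallback)
                break
            except NativeCrash:
                crashes += 1
                if crashes >= max_crashes:
                    # Quarantine values taken from the workflow description.
                    row["Review_Status"] = "Error"
                    row["QC_Status"] = "PrefillNativeCrash"
                    break
```

Because the retry counter is scoped to a single row, one bad file is quarantined while the remaining rows in the batch still reach the review editor.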
2.4 Review and publish path after stage 8
- Operator validates and edits caption, alt text, keywords, and filename fields (with shared dictionary spellcheck available in the text fields).
- Operator can trigger row-level regeneration via Generate; the regeneration flow checks for duplicates among pending rows before persisting updated metadata.
- Each row receives a decision status (approved or rejected).
- Publish action targets approved rows only.
- Uploader sends full image and thumbnail outputs to configured FTP destinations.
- MySQL upsert is applied using File_Name as the stable key.
- Local mirror IDs and publish state are synchronized after remote write success.
- Publish completion uses one final confirmation dialog; clicking OK closes the review window.
- Temporary model artifacts are cleaned when publish completes without blocking errors.
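To illustrate the publish path above, here is a small sketch of "approved rows only" combined with an upsert keyed on File_Name. It uses Python's built-in sqlite3 as a stand-in so the example runs without a server; in MySQL the corresponding statement would be INSERT ... ON DUPLICATE KEY UPDATE. The images table, the Caption column, and publish_approved are hypothetical names, and the FTP upload step is omitted.

```python
import sqlite3
from typing import Dict, List


def publish_approved(local_rows: List[Dict], remote: sqlite3.Connection) -> int:
    """Upsert approved rows into the remote table; return how many were sent."""
    approved = [r for r in local_rows if r["Review_Status"] == "Approved"]
    for r in approved:
        # File_Name is the stable key: a repeat publish updates, never duplicates.
        remote.execute(
            """
            INSERT INTO images (File_Name, Caption) VALUES (?, ?)
            ON CONFLICT(File_Name) DO UPDATE SET Caption = excluded.Caption
            """,
            (r["File_Name"], r["Caption"]),
        )
    remote.commit()
    return len(approved)


# In-memory stand-in for the remote database (hypothetical schema).
remote = sqlite3.connect(":memory:")
remote.execute(
    "CREATE TABLE images (id INTEGER PRIMARY KEY, File_Name TEXT UNIQUE, Caption TEXT)"
)

rows = [
    {"File_Name": "a.jpg", "Caption": "Dune path at sunrise", "Review_Status": "Approved"},
    {"File_Name": "b.jpg", "Caption": "Blurred frame", "Review_Status": "Rejected"},
]
publish_approved(rows, remote)

# A second publish after an edit updates a.jpg in place.
rows[0]["Caption"] = "Dune path at sunrise, edited"
publish_approved(rows, remote)
```

Keying the upsert on File_Name is what makes a retried publish safe: rerunning after a partial failure rewrites the same remote rows rather than creating new ones.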
2.5 Queue statuses and transition intent
Primary fields:
- QC_Status - quality class from automated scoring and review updates.
- Review_Status - lifecycle decision such as Pending, Approved, Rejected, Published, or Error.
Transition rule: no row enters publish state without explicit review approval.
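One way to make the transition rule enforceable is a small lookup of allowed status changes. The sketch below is an assumption-heavy illustration: the status names come from this section, but the ALLOWED table (apart from Published being reachable only from Approved) and the transition helper are hypothetical and may not match the real queue logic.

```python
from typing import Dict, Set

# Hypothetical transition table; only "Published requires Approved"
# is stated by the workflow itself.
ALLOWED: Dict[str, Set[str]] = {
    "Pending": {"Approved", "Rejected", "Error"},
    "Approved": {"Published", "Error"},
    "Rejected": set(),
    "Published": set(),
    "Error": {"Pending"},
}


def transition(row: Dict, new_status: str) -> Dict:
    """Move a row to new_status, or raise if the change is not allowed."""
    current = row["Review_Status"]
    if new_status not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {new_status}")
    row["Review_Status"] = new_status
    return row
```

With this table, an attempt to move a row straight from Pending to Published raises ValueError, while Pending to Approved to Published succeeds.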
2.6 Failure handling and rollback behavior
Handle failures as session events, not isolated row edits. When a blocking error occurs,
rollback logic restores operational consistency so a clean rerun is possible.
- Remove inserted session rows from the active review queue when required.
- Restore moved or staged files to original folders.
- Release reserved filenames from data/used_filenames.json.
- Preserve failure details in logs for diagnosis before rerun.
Rollback figure: blocking caption-model error followed by automated session cleanup.
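The cleanup steps listed above could be sketched as a single session-level function. This is a hedged illustration only: rollback_session, the session_id column, the (staged, original) path pairs, and the assumption that used_filenames.json holds a flat JSON list of names are all hypothetical and not confirmed by this document.

```python
import json
import shutil
import sqlite3
from pathlib import Path
from typing import Iterable, List, Tuple


def rollback_session(db: sqlite3.Connection,
                     session_id: str,
                     moved_files: Iterable[Tuple[Path, Path]],
                     reserved_names: List[str],
                     used_names_path: Path) -> None:
    """Undo one session: queue rows, moved files, and reserved filenames."""
    # 1. Remove rows inserted by this session (session_id column is assumed).
    db.execute("DELETE FROM review_queue WHERE session_id = ?", (session_id,))
    db.commit()

    # 2. Restore staged files to their original folders.
    for staged, original in moved_files:
        if staged.exists():
            original.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(staged), str(original))

    # 3. Release filenames reserved by this session (flat JSON list assumed).
    if used_names_path.exists():
        released = set(reserved_names)
        used = json.loads(used_names_path.read_text(encoding="utf-8"))
        remaining = [name for name in used if name not in released]
        used_names_path.write_text(json.dumps(remaining, indent=2), encoding="utf-8")
```

Treating the three steps as one unit matches the session-event framing: after it returns, the queue, the folders, and the filename registry all look as they did before the run started.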
2.7 Run artifacts and logs
- Pipeline log: logs/latest_run.log (or beside EXE in frozen builds).
- Startup runtime line in pipeline log: [INFO] Ollama startup check: model=... processor=GPU/CPU context=... vram=....
- Uploader log: logs/db_uploader.log.
- Runtime callback crash log: data/crash_runtime.log.
- Prefill QC report: data/prefill_qc_last.json.
- Run summary: data/run_log.txt.
- Frozen helper staging path: data/_runtime_scripts/.
- Temporary model files: data/ollama_tmp/run_*.
Prefill QC summary output used before entering review editor.
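As a rough picture of how a post-prefill QC scan might produce the report file, the sketch below groups rows with identical normalized captions and writes a JSON summary. The report schema (rows_scanned, duplicate_groups), the Caption field, and the qc_scan function are hypothetical; the actual contents of data/prefill_qc_last.json are not specified in this document.

```python
import json
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def qc_scan(rows: List[Dict], report_path: Path) -> Dict:
    """Find rows that share a caption and write a JSON report."""
    by_caption = defaultdict(list)
    for row in rows:
        # Normalize case and whitespace so near-identical captions match.
        key = " ".join(row["Caption"].lower().split())
        by_caption[key].append(row["File_Name"])

    duplicates = [names for names in by_caption.values() if len(names) > 1]
    report = {"rows_scanned": len(rows), "duplicate_groups": duplicates}

    report_path.parent.mkdir(parents=True, exist_ok=True)
    report_path.write_text(json.dumps(report, indent=2), encoding="utf-8")
    return report
```

Writing the report before the editor opens gives the operator a list of suspect rows to check first, instead of discovering duplicates one row at a time.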
2.8 Continuation path
Step 2 describes execution behavior. Step 3 defines daily operation, checklists, and recovery procedure for production use.