Supporting Reference 6: File Reference

This index intentionally lists only files that belong to the GitHub repository. Local-only backup/sanitizer artifacts are excluded from this reference.

R6.1 Core runtime files (GitHub)

main_set.py - orchestrator UI, stage sequencing, and session flow.
batch_image_quality_score.py - quality feature computation and score writes.
caption_review_local.py - local caption/alt/keyword generation logic.
review_editor.py - manual decision UI and row edits.
db_uploader.py - FTP upload + MySQL upsert + mirror sync.
init_db.py - local DB initialization from SQL contracts.
metadata_evidence_pipeline.py - shared entry point for the metadata evidence/proof pipeline used by V1.0 quality checks.
run_metadata_quality_production.ps1 - PowerShell wrapper for the production metadata-quality proof stage.
simple_inference.py - local inference helper used by runtime workflows.
amir2000_config.py - environment/config contract.

R6.2 Contract and data files (GitHub)

data/init/review_queue.sql - queue schema bootstrap.
data/init/photos_info_revamp.sql - mirror schema bootstrap.
data/init/metadata_quality.sql - metadata proof, repair audit, and ML feedback bootstrap.
data/used_filenames.json - filename reservation ledger.
data/folder_map.json - folder taxonomy mapping.
data/location_list.json - location taxonomy list.
data/autofix_dict.json - token normalization dictionary.
data/spellcheck_exceptions.json - spellcheck allowlist for valid domain terms.
data/new_taxonomy_log.json - appended log of taxonomy additions made from UI flows.
data/ui_state.json - persisted UI/editor runtime state.
data/prefill_qc_last.json - latest prefill duplicate/suspicious QC report.
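
The paths in R6.2 can be checked against a local checkout. The following is a minimal sketch, not part of the repository: the function name missing_contract_files is hypothetical, and it assumes the repository root is passed in (or is the current directory). It only reports which listed paths are absent; a missing file is not necessarily an error, since some state files may be written at runtime.

```python
from pathlib import Path

# Paths copied verbatim from the R6.2 list.
CONTRACT_FILES = [
    "data/init/review_queue.sql",
    "data/init/photos_info_revamp.sql",
    "data/init/metadata_quality.sql",
    "data/used_filenames.json",
    "data/folder_map.json",
    "data/location_list.json",
    "data/autofix_dict.json",
    "data/spellcheck_exceptions.json",
    "data/new_taxonomy_log.json",
    "data/ui_state.json",
    "data/prefill_qc_last.json",
]


def missing_contract_files(repo_root="."):
    """Return the listed paths that do not exist as files under repo_root.

    Hypothetical helper for illustration; the order of CONTRACT_FILES is kept.
    """
    root = Path(repo_root)
    return [rel for rel in CONTRACT_FILES if not (root / rel).is_file()]
```

Calling missing_contract_files() from the repository root returns an empty list when every listed file is present.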

R6.3 V1.0 identifier and metadata-quality scripts (GitHub)

scripts/apply_identifier_router_result_to_db.py - applies identifier router output back to SQLite rows.
scripts/download_identifier_models.py - prepares optional local identifier model assets.
scripts/evidence_subject_pipeline.py - evidence-first subject/metadata helper pipeline.
scripts/identifier_biology_bioclip.py - local biology identifier route.
scripts/identifier_biology_inaturalist.py - optional biology reference route for taxon candidate support.
scripts/identifier_biology_runner.py - production biology runner used by the identifier router.
scripts/identifier_consensus.py - combines identifier candidates into a safer consensus result.
scripts/identifier_db_setup.py - initializes identifier support tables/assets.
scripts/identifier_general_vision.py - general visual identifier route for non-specialized subjects.
scripts/identifier_router.py - routes a row or set to the appropriate identifier strategy.
scripts/identifier_vehicle_aircraft.py - specialized route for vehicle and aircraft subjects, including visible airline/model/registration evidence.
scripts/identifier_visual_evidence.py - extracts visual evidence used by subject/metadata proof.
scripts/metadata_auto_repair_loop.py - bounded metadata repair loop for caption, alt text, and keywords.
scripts/metadata_quality_production.py - production metadata proof stage that writes metadata_quality.
scripts/series_versioning.py - detects series/variant context for related images.
scripts/subject_identifier_engine.py - shared subject identifier engine used by router flows.
scripts/subject_identifier_production.py - production subject suggestion/identifier script with aircraft group consolidation guardrails.

R6.4 Helper scripts (GitHub)

helpers/setup_venv313_full.ps1 - environment bootstrap and dependency installation.
helpers/preflight_multiset.ps1 - build-readiness checks and report generation.
helpers/build_multiset.ps1 - executable packaging flow.
helpers/runtime_hook_samevenv_classifier.py - frozen-runtime hook that exposes same-venv classifier dependencies/DLL paths to the EXE.

Note: local-only backup/sanitizer scripts that are not in the public GitHub branch are intentionally not listed.

R6.5 Repository docs (GitHub)

README.md - setup and runtime overview.
docs/init/review_queue.sql - docs copy of queue schema.
docs/init/photos_info_revamp.sql - docs copy of mirror schema.
docs/init/metadata_quality.sql - docs copy of metadata proof schema.
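
Because docs/init holds copies of the schema files in data/init, the two sets can drift apart. The sketch below shows one way to detect that. It assumes the copies are meant to be byte-identical, which this reference does not state, and the function name drifted_schema_copies is hypothetical rather than something the repository provides.

```python
import filecmp
from pathlib import Path

# File names that appear under both data/init and docs/init in this reference.
SCHEMA_FILES = [
    "review_queue.sql",
    "photos_info_revamp.sql",
    "metadata_quality.sql",
]


def drifted_schema_copies(repo_root="."):
    """Return schema file names whose docs copy is missing or differs.

    Assumption: docs/init/<name> should match data/init/<name> byte for byte.
    """
    root = Path(repo_root)
    drifted = []
    for name in SCHEMA_FILES:
        source = root / "data" / "init" / name
        copy = root / "docs" / "init" / name
        if not (source.is_file() and copy.is_file()):
            # Either side missing counts as drift.
            drifted.append(name)
        elif not filecmp.cmp(source, copy, shallow=False):
            # shallow=False forces a content comparison, not just a stat check.
            drifted.append(name)
    return drifted
```

An empty result means all three docs copies match their data/init sources under that assumption.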
