Summary
Refactor the repo into a small, installable SDK with a single CLI, an asset manager for model files, and a smoke-test suite. Keep all current scripts working, but route them through the new common APIs to reduce duplication and make onboarding dead simple.
Why
- Every classifier has its own entrypoint/IO. A common interface makes demos, docs, and CI far easier.
- Model weights are scattered (LFS/Drive/manual). A manifest + downloader prevents broken setups.
- Quick smoke tests in CI catch regressions (OpenCV/TF updates, path issues, missing assets).
Deliverables
- SDK layer (installable)
  - New package: `ai_ml_classifiers/`
  - Base protocol:

        from pathlib import Path
        from typing import Protocol

        class Classifier(Protocol):
            name: str
            tasks: tuple[str, ...]  # e.g. ("vehicle",)

            def load(self, device: str = "cpu") -> None: ...
            def predict(self, source: str | Path | int, **kwargs) -> "Prediction": ...

  - Registry: `from ai_ml_classifiers import get, list_tasks`
  - Uniform `Prediction` dataclass: boxes/labels/scores or text for OCR/ASR, with timestamps for video.
- CLI (`aimc`)
  - `aimc run <task> --source webcam|image|video --device cpu|cuda --top-k 5`
  - `aimc assets sync` (download all needed weights)
  - `aimc assets doctor` (checksums, paths)
  - `aimc list` (available tasks)
- Asset Manager
  - `assets/manifest.yaml` with an entry for every weight/file: `{id, task, filename, bytes, sha256, urls: [lfs, hf, gdrive]}`
  - Downloader with progress + checksum + cache (`~/.aimc/assets`)
  - Graceful fallbacks (try next URL if one fails)
- Smoke tests (pytest)
  - Tiny inputs per task in `assets/samples/`
  - `pytest -m smoke` runs one frame/sample per classifier (skip if asset missing)
  - CI: Ubuntu job runs `aimc assets sync`, then `pytest -m smoke -q`
- Docs
  - Top-level README: “Quickstart (CLI)”, “SDK usage”, “Assets”
  - Per-task mini READMEs become short pages under `docs/` or sections in the main README
  - Table mapping old scripts → new commands
- Back-compat
  - Keep legacy scripts (`vehicle_detection.py`, etc.) but refactor internals to call the SDK
  - Flask app imports the SDK instead of duplicating logic
- Nice-to-have (optional if time)
  - `--onnx` path for 1–2 models to speed up CPU demos
  - `--half` (FP16) when CUDA is detected
  - Simple benchmark: `aimc bench <task> --source image --repeat 50`
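
To make the SDK-layer deliverable more concrete, here is a minimal sketch of how the protocol, the `Prediction` dataclass, and the registry could fit together. The `Prediction` field names, the `register` decorator, the `Source` alias, and `EchoClassifier` are assumptions made for illustration; the issue only fixes `Classifier`, `get`, and `list_tasks`.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, Protocol, Union

Source = Union[str, Path, int]  # a file path, or an int webcam index


@dataclass
class Prediction:
    """Uniform result container; field names here are assumed, not final."""
    boxes: list[tuple[float, float, float, float]] = field(default_factory=list)
    labels: list[str] = field(default_factory=list)
    scores: list[float] = field(default_factory=list)
    text: str | None = None           # OCR/ASR output, if the task produces text
    timestamp_s: float | None = None  # position in the video, if any


class Classifier(Protocol):
    name: str
    tasks: tuple[str, ...]

    def load(self, device: str = "cpu") -> None: ...
    def predict(self, source: Source, **kwargs) -> Prediction: ...


_REGISTRY: dict[str, Callable[[], Classifier]] = {}


def register(factory):
    """Hypothetical decorator: index a classifier class under each of its tasks."""
    for task in factory.tasks:
        _REGISTRY[task] = factory
    return factory


def list_tasks() -> list[str]:
    return sorted(_REGISTRY)


def get(task: str) -> Classifier:
    """Return a fresh, not-yet-loaded classifier for `task`."""
    if task not in _REGISTRY:
        raise KeyError(f"unknown task {task!r}; known: {list_tasks()}")
    return _REGISTRY[task]()


@register
class EchoClassifier:
    """Stand-in used only to exercise the registry; not a real model."""
    name = "echo"
    tasks = ("echo",)

    def load(self, device: str = "cpu") -> None:
        self.device = device

    def predict(self, source: Source, **kwargs) -> Prediction:
        return Prediction(labels=[str(source)], scores=[1.0])
```

With something like this in `api.py`, a legacy script would reduce to `clf = get("vehicle"); clf.load(); clf.predict(path)`, which is the main point of routing everything through one interface.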
Suggested file layout

    ai-ml-classifiers/
      ai_ml_classifiers/
        __init__.py
        api.py            # registry, base types, Prediction
        assets.py         # downloader, checksums, cache
        utils/io.py       # image/video/webcam loaders
        tasks/
          vehicles.py
          faces.py
          mood.py
          flowers.py
          objects.py
          ocr.py
          animals.py
          speech.py
          sentiment.py
      cli/aimc.py         # click/typer CLI entrypoint
      assets/manifest.yaml
      tests/smoke/
        test_vehicles.py
        ...
      pyproject.toml      # package + console_scripts = ["aimc=cli.aimc:main"]

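
As one possible shape for `assets.py` in the layout above, the sketch below verifies a file against its manifest entry and falls back to the next mirror when a download fails. It is a rough outline under assumptions: the `Asset` dataclass, `sha256_of`, `is_valid`, and `fetch` are hypothetical names, only the standard library is used, and progress reporting and YAML parsing are left out.

```python
from __future__ import annotations

import hashlib
import shutil
import urllib.request
from dataclasses import dataclass
from pathlib import Path

CACHE_DIR = Path.home() / ".aimc" / "assets"


@dataclass(frozen=True)
class Asset:
    """One manifest entry; fields mirror the manifest keys listed in the issue."""
    id: str
    task: str
    filename: str
    bytes: int
    sha256: str
    urls: tuple[str, ...]


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large weights never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_valid(asset: Asset, path: Path) -> bool:
    """True when the file exists and matches the manifest size and checksum."""
    return (path.is_file()
            and path.stat().st_size == asset.bytes
            and sha256_of(path) == asset.sha256)


def fetch(asset: Asset, cache_dir: Path = CACHE_DIR) -> Path:
    """Return a verified local copy, trying each mirror in order."""
    target = cache_dir / asset.filename
    if is_valid(asset, target):
        return target  # cache hit, nothing to download
    cache_dir.mkdir(parents=True, exist_ok=True)
    partial = target.with_name(target.name + ".part")
    errors: list[str] = []
    for url in asset.urls:
        try:
            with urllib.request.urlopen(url, timeout=30) as resp, partial.open("wb") as out:
                shutil.copyfileobj(resp, out)
            if is_valid(asset, partial):
                partial.replace(target)  # publish only after verification
                return target
            errors.append(f"{url}: size or checksum mismatch")
        except OSError as exc:  # URLError and socket errors subclass OSError
            errors.append(f"{url}: {exc}")
    partial.unlink(missing_ok=True)
    raise RuntimeError(f"could not fetch {asset.id}: " + "; ".join(errors))
```

Under this design `aimc assets sync` would amount to calling `fetch` for every manifest entry, and `aimc assets doctor` to calling `is_valid` on each cached file and reporting the failures.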
Acceptance criteria
- `pip install -e .` exposes `aimc` on PATH.
- `aimc list` shows at least the 9 current tasks.
- `aimc assets sync` downloads required weights; `doctor` reports OK with checksums.
- `aimc run vehicle --source assets/samples/traffic.mp4` produces labeled frames.
- `pytest -m smoke` passes locally and in GitHub Actions.
- Legacy scripts continue to work (but now import from the SDK).
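
The CLI-related criteria above imply a command surface roughly like the following. This is only a sketch: it uses the standard library's `argparse` so that it runs without extra dependencies, whereas the issue proposes click or typer. The defaults (`webcam`, `cpu`, `5`) and the placeholder `print` bodies are assumptions.

```python
from __future__ import annotations

import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="aimc")
    sub = parser.add_subparsers(dest="command", required=True)

    run = sub.add_parser("run", help="run one task on a source")
    run.add_argument("task")
    run.add_argument("--source", default="webcam",
                     help="'webcam' or a path to an image/video file")
    run.add_argument("--device", choices=("cpu", "cuda"), default="cpu")
    run.add_argument("--top-k", type=int, default=5)

    sub.add_parser("list", help="show available tasks")

    assets = sub.add_parser("assets", help="manage model files")
    assets.add_argument("action", choices=("sync", "doctor"))
    return parser


def main(argv: list[str] | None = None) -> int:
    """Parse arguments and dispatch; the bodies are placeholders for SDK calls."""
    args = build_parser().parse_args(argv)
    if args.command == "run":
        print(f"would run {args.task} on {args.source} "
              f"({args.device}, top-k={args.top_k})")
    elif args.command == "list":
        print("would list registered tasks")
    else:
        print(f"would {args.action} assets")
    return 0
```

Keeping `main` callable with an explicit `argv` list makes the CLI itself easy to cover in the smoke suite without spawning a subprocess.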
Task checklist
- `ai_ml_classifiers` package, base `Classifier` + `Prediction`
- `assets/manifest.yaml` + downloader with checksum verification
- CLI (`typer` or `click`) with `run` / `list` / `assets`
- CI: `setup-python`, `pip install -e .`, `aimc assets sync`, `pytest -m smoke`
Notes / risks
- Large weights: keep LFS references, but manifest should include multiple mirrors.
- Platform deps: gate PyAudio/Tesseract tests with markers & skips; document install hints per OS.
- CUDA optional: auto-detect, but default to CPU; no hard runtime dependency on CUDA.