Unified SDK + CLI + Asset Manager + Smoke Tests

### Summary

Refactor the repo into a small, installable **SDK** with a **single CLI**, an **asset manager** for model files, and a **smoke-test suite**. Keep all current scripts working, but route them through the new common APIs to reduce duplication and make onboarding dead simple.

---

#### Why

* Every classifier has its own entrypoint/IO. A common interface makes demos, docs, and CI far easier.
* Model weights are scattered (LFS/Drive/manual). A manifest + downloader prevents broken setups.
* Quick smoke tests in CI catch regressions (OpenCV/TF updates, path issues, missing assets).

---

#### Deliverables

1. **SDK layer (installable)**

   * New package: `ai_ml_classifiers/`
   * Base protocol:

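     ```python
     from __future__ import annotations  # lets `str | Path | int` parse on Python < 3.10
     from pathlib import Path
     from typing import Protocol

     class Classifier(Protocol):
         name: str
         tasks: tuple[str, ...]  # e.g. ("vehicle",)
         def load(self, device: str = "cpu") -> None: ...
         def predict(self, source: str | Path | int, **kwargs) -> "Prediction": ...
     ```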
   * Registry: `from ai_ml_classifiers import get, list_tasks`
   * Uniform `Prediction` dataclass: boxes/labels/scores or text for OCR/ASR, with timestamps for video.

2. **CLI (`aimc`)**

   * `aimc run <task> --source webcam|image|video --device cpu|cuda --top-k 5`
   * `aimc assets sync` (download all needed weights)
   * `aimc assets doctor` (checksums, paths)
   * `aimc list` (available tasks)

3. **Asset Manager**

   * `assets/manifest.yaml` for every weight/file: `{id, task, filename, bytes, sha256, urls:[lfs,hf,gdrive]}`
   * Downloader with progress + checksum + cache (`~/.aimc/assets`)
   * Graceful fallbacks (try next URL if one fails)

4. **Smoke tests (pytest)**

   * Tiny inputs per task in `assets/samples/`
   * `pytest -m smoke` runs one frame/sample per classifier (skip if asset missing)
   * CI: Ubuntu job runs `aimc assets sync`, then `pytest -m smoke -q`

5. **Docs**

   * Top-level README: “Quickstart (CLI)”, “SDK usage”, “Assets”
   * Per-task mini READMEs become short pages under `docs/` or sections in main README
   * Table mapping old scripts → new commands

6. **Back-compat**

   * Keep legacy scripts (`vehicle_detection.py`, etc.) but refactor internals to call the SDK
   * Flask app imports SDK instead of duplicating logic

7. **Nice-to-have (optional, if time allows)**

   * `--onnx` path for 1–2 models to speed up CPU demos
   * `--half` (FP16) when CUDA is detected
   * Simple benchmark: `aimc bench <task> --source image --repeat 50`
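
The registry and `Prediction` type from deliverable 1 could look roughly like the sketch below. Only `get`, `list_tasks`, `Prediction`, and the `Classifier` attributes come from the proposal; the `register` decorator, the concrete `Prediction` field names, and `DummyVehicleClassifier` are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Prediction:
    """Uniform result container; the field names are illustrative assumptions."""
    labels: list[str] = field(default_factory=list)
    scores: list[float] = field(default_factory=list)
    boxes: list[tuple[float, float, float, float]] = field(default_factory=list)  # x1, y1, x2, y2
    text: str | None = None            # OCR/ASR output
    timestamp_s: float | None = None   # position in a video, in seconds


# Maps a task name to a zero-argument factory that builds its classifier.
_REGISTRY: dict[str, Callable[[], object]] = {}


def register(task: str):
    """Hypothetical class decorator that records a classifier under a task name."""
    def decorator(factory):
        if task in _REGISTRY:
            raise ValueError(f"task already registered: {task}")
        _REGISTRY[task] = factory
        return factory
    return decorator


def get(task: str):
    """Instantiate the classifier registered for `task`."""
    if task not in _REGISTRY:
        raise KeyError(f"unknown task {task!r}; available: {list_tasks()}")
    return _REGISTRY[task]()


def list_tasks() -> list[str]:
    """Return all registered task names in alphabetical order."""
    return sorted(_REGISTRY)


@register("vehicle")
class DummyVehicleClassifier:
    """Stand-in adapter; a real one would wrap the existing detection code."""
    name = "dummy-vehicle"
    tasks = ("vehicle",)

    def load(self, device: str = "cpu") -> None:
        self.device = device

    def predict(self, source, **kwargs) -> Prediction:
        # Fixed output so the sketch runs without any model weights.
        return Prediction(labels=["car"], scores=[0.9], boxes=[(0.0, 0.0, 10.0, 10.0)])
```

A caller would then use `clf = get("vehicle"); clf.load(); clf.predict("frame.jpg")`. Keeping the registry as a plain dict of factories means importing the package never loads model weights until `get` and `load` are called.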

---

#### Suggested file layout

```
ai-ml-classifiers/
  ai_ml_classifiers/
    __init__.py
    api.py               # registry, base types, Prediction
    assets.py            # downloader, checksums, cache
    utils/io.py          # image/video/webcam loaders
    tasks/
      vehicles.py
      faces.py
      mood.py
      flowers.py
      objects.py
      ocr.py
      animals.py
      speech.py
      sentiment.py
  cli/aimc.py            # click/typer CLI entrypoint
  assets/manifest.yaml
  tests/smoke/
    test_vehicles.py
    ...
  pyproject.toml         # package metadata + [project.scripts]: aimc = "cli.aimc:main"
```
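
As a rough sketch of the command surface that `cli/aimc.py` would expose, the block below wires up `run`, `list`, and `assets sync|doctor`. It uses the standard library's `argparse` as a stand-in so it runs without extra dependencies; the proposal itself calls for `click` or `typer`. The `--source` default of `webcam` and the placeholder print statements are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the `aimc` parser with the subcommands named in the proposal."""
    parser = argparse.ArgumentParser(prog="aimc")
    sub = parser.add_subparsers(dest="command", required=True)

    run = sub.add_parser("run", help="run one task on a source")
    run.add_argument("task")
    run.add_argument("--source", default="webcam")  # assumed default
    run.add_argument("--device", choices=["cpu", "cuda"], default="cpu")
    run.add_argument("--top-k", type=int, default=5)

    sub.add_parser("list", help="show available tasks")

    assets = sub.add_parser("assets", help="manage model files")
    assets_sub = assets.add_subparsers(dest="assets_command", required=True)
    assets_sub.add_parser("sync", help="download all needed weights")
    assets_sub.add_parser("doctor", help="check checksums and paths")
    return parser


def main(argv=None) -> int:
    """Parse arguments and dispatch; the bodies are placeholders for SDK calls."""
    args = build_parser().parse_args(argv)
    if args.command == "run":
        print(f"would run {args.task} on {args.source} ({args.device}, top-k={args.top_k})")
    elif args.command == "list":
        print("would list registered tasks")
    else:
        print(f"would run assets {args.assets_command}")
    return 0
```

Returning an exit code from `main` keeps the function usable both as a console-script entry point and from tests, which can call `main(["list"])` directly.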

---

#### Acceptance criteria

* `pip install -e .` exposes `aimc` in PATH.
* `aimc list` shows at least the 9 current tasks.
* `aimc assets sync` downloads required weights; `aimc assets doctor` verifies checksums and paths and reports OK.
* `aimc run vehicle --source assets/samples/traffic.mp4` produces labeled frames.
* `pytest -m smoke` passes locally and in GitHub Actions.
* Legacy scripts continue to work (but now import from the SDK).
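
The `assets sync` and `assets doctor` criteria rest on the cache, checksum, and mirror-fallback behaviour described under the Asset Manager deliverable. A minimal sketch of that logic follows, assuming a manifest entry has already been parsed into a dict with the proposed keys (`id`, `filename`, `bytes`, `sha256`, `urls`). The function names and the injectable `download` parameter are hypothetical, and YAML parsing and progress reporting are left out.

```python
from __future__ import annotations

import hashlib
import urllib.request
from pathlib import Path

CACHE_DIR = Path.home() / ".aimc" / "assets"  # cache location from the proposal


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large weights never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_valid(path: Path, entry: dict) -> bool:
    """Check existence, size, and checksum against the manifest entry."""
    return (
        path.is_file()
        and path.stat().st_size == entry["bytes"]
        and sha256_of(path) == entry["sha256"]
    )


def fetch(entry: dict, cache_dir: Path = CACHE_DIR,
          download=urllib.request.urlretrieve) -> Path:
    """Return a verified local copy, trying each mirror in order."""
    target = cache_dir / entry["filename"]
    if is_valid(target, entry):
        return target  # cache hit, nothing to download
    cache_dir.mkdir(parents=True, exist_ok=True)
    errors = []
    for url in entry["urls"]:
        try:
            download(url, target)
            if is_valid(target, entry):
                return target
            errors.append(f"{url}: checksum or size mismatch")
        except OSError as exc:  # urllib.error.URLError subclasses OSError
            errors.append(f"{url}: {exc}")
    target.unlink(missing_ok=True)  # never leave a corrupt file in the cache
    raise RuntimeError(f"could not fetch {entry['id']}: " + "; ".join(errors))
```

Under these assumptions, `doctor` could simply call `is_valid` for every manifest entry, while `sync` calls `fetch`. Passing the downloader as a parameter lets tests exercise the fallback path without network access.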

---

#### Task checklist

* [ ] Create `ai_ml_classifiers` package, base `Classifier` + `Prediction`
* [ ] Implement registry + per-task adapters (wrap existing code)
* [ ] Implement `assets/manifest.yaml` + downloader with checksum verification
* [ ] Add CLI (`typer` or `click`) with `run`, `list`, and `assets` commands
* [ ] Add sample inputs and tiny gold outputs for smoke tests
* [ ] Wire GitHub Actions: `setup-python`, `pip install -e .`, `aimc assets sync`, `pytest -m smoke`
* [ ] Update README (Quickstart, SDK, Assets, CI)
* [ ] Refactor Flask app to call SDK (keep routes/UX unchanged)
* [ ] Mark legacy scripts as “thin wrappers” (one-line import + call)

---

#### Notes / risks

* **Large weights:** keep LFS references, but manifest should include **multiple mirrors**.
* **Platform deps:** gate PyAudio/Tesseract tests with markers & skips; document install hints per OS.
* **CUDA optional:** auto-detect, but default to CPU; no hard runtime dependency on CUDA.
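
The "CUDA optional" note could be handled by a small device resolver along these lines. This is a sketch under assumptions: it probes through PyTorch purely for illustration (a TensorFlow-based probe could be swapped in), the `auto` value is not part of the proposed `--device cpu|cuda` flag, and silently falling back to CPU when `cuda` is requested but unavailable is one possible policy rather than a decided one.

```python
import importlib.util


def cuda_available() -> bool:
    """Return True only if PyTorch is installed and reports a usable CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch  # imported lazily so CPU-only installs never need it
    return bool(torch.cuda.is_available())


def resolve_device(requested: str = "cpu", probe=cuda_available) -> str:
    """Map a requested device ("cpu", "cuda", or "auto") to one that can be used."""
    if requested == "cpu":
        return "cpu"
    if requested in ("cuda", "auto"):
        # Fall back to CPU instead of failing when no GPU is present.
        return "cuda" if probe() else "cpu"
    raise ValueError(f"unsupported device: {requested!r}")
```

Because the probe is a parameter, smoke tests can cover both branches on a CPU-only CI runner, for example `resolve_device("auto", probe=lambda: False)`.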

---

