feat(persistent): in-place solver updates (Solver framework + HiGHS/Gurobi/Xpress/Mosek)#718
feat(persistent): in-place solver updates (Solver framework + HiGHS/Gurobi/Xpress/Mosek)#718FabianHofmann wants to merge 22 commits into
Conversation
Tracks per-Constraint coefficient mutation via a single boolean slot, flipped in coeffs/vars/lhs setters. Pure-constant rhs writes now short-circuit and leave coeffs/vars buffers untouched (by identity), so rhs-only updates don't trigger expensive coefficient recompare on the persistent-solver fast path.
Pure-Python snapshot primitives for the persistent-solver Phase 1. Deep-copies value-side fields (var_lb/ub, con_rhs/sign, obj_linear), holds vlabels/clabels by reference, stores canonical CSR (indptr, indices) per constraint container. No Solver import.
Pure-function diff for the persistent-solver Phase 1. Detects structural, coord, sparsity, quadratic-objective, value-only var/con, and objective-linear/sense changes. Supports same_model fast path via _coef_dirty and cross-model full re-scan. Includes a focused test suite covering capture, mutation paths, deep-copy invariant, and the same_model toggle.
- supports_persistent_update class flag (default False) - snapshot/_rebuilds/_in_place_updates/_last_rebuild_reason fields - snapshot capture at end of direct _build, _clear_coef_dirty helper - apply_update stub raising UnsupportedUpdate - solve(model, assign) dispatcher with diff-or-rebuild path - update(model, apply=True) primitive returning ModelDiff - threading.Lock around diff+apply+resnapshot - __getstate__/__setstate__ drop native handle and snapshot
…date support Skip diff computation entirely when supports_persistent_update is False on apply, per plan: 'dispatcher checks flag before calling — if False, skips diffing entirely and goes to rebuild.'
Replace xarray-based snapshot and CSR pattern compare with per-row canonicalised numpy buffers; new ContainerVarUpdate / ContainerRowUpdate payloads. Gurobi/HiGHS apply_update rewritten around batched setAttr / changeColsBounds / changeColsCost / changeColsIntegrality; coefficient writes touch only changed cells. Cross-model diff now ~matches same-model cost for bound/rhs/coef-value sweeps.
compute_diff/Solver.solve/Solver.update grow an ignore_dims kwarg. None (default) keeps the current no-coord-check behaviour; any iterable opts into per-container coord-equality on every dim not in the set, supporting rolling-horizon workflows where e.g. the snapshot dim is expected to drift.
…_rebuild - Solver.from_name now accepts model=None; the first solve(m, ...) builds. - compute_diff folded into ModelDiff.from_snapshot classmethod; new ModelDiff.from_models diffs two linopy models directly. - Solver.solve grows disallow_rebuild=True, which raises RebuildRequiredError instead of falling back to a rebuild.
…m_models - Add `track_updates` flag (default False) to Solver; skip ModelSnapshot capture when disabled. Raise UpdatesDisabledError on solve(model)/update() if a built solver was constructed without tracking. - Rewrite ModelDiff.from_models to build directly from two models without capturing snapshots; share helpers with from_snapshot. - Update persistent tests to opt into track_updates=True; add coverage for the disabled path.
Cross-instance resolves now diff via from_models against the previously built model, with no snapshot. Same-instance mutation still raises UpdatesDisabledError. Snapshot recapture is skipped in this mode. Add cross-instance solve/update tests for the no-snapshot path.
Collapse _diff_objective QUAD_OBJ branches; cache n_coef_updates; short-circuit _canonicalize_rows when rows already sorted; tighten buffer extraction. Introduce VarKind enum used across snapshot/diff and HiGHS/Gurobi apply_update; reuse linopy.constants sign tokens. Move _clear_coef_dirty into ModelSnapshot.capture.
Source con buffers from Constraint.to_matrix_with_rhs, replacing the dense (n_rows, max_n_term) arrays with CSR (indptr, indices, data). Sign dtype adopts 'U1' across the persistent layer and apply_update in HiGHS/Gurobi consumes CSR-slice payloads instead of -1 masks. Deletes _canonicalize_rows and the _INT64_MAX sentinel.
Replace per-container ContainerVarUpdate/ContainerRowUpdate dicts with flat arrays (var_bounds_*, var_type_*, con_coef_* COO, con_rhs_*, con_sign_*) plus VarSlice/ConSlice per-container offsets for diagnostics. Add con_rhs_as_bounds() for ranged-row solvers. Backend apply_update bodies collapse to flat-array calls; remove duplicated label->position resolution.
Implement in-place model updates for Xpress (chgbounds/chgrhs/chgmcoef/ chgrowtype/chgobj/chgobjsense/chgcoltype) and Mosek (chgvarbound/ chgconbound/putaijlist/putclist/putvartypelist/putobjsense). Mosek rejects constraint sign change to trigger rebuild. Consolidate gurobi/highs apply_update tests into a single parametrized file that also covers xpress and mosek.
for more information, see https://pre-commit.ci
|
@FBumann here we go, if you want to take a first look. docs to come |
Ill have a look ltoday |
* hold solver lock through _run_direct so two threads calling solve(model) on the same Solver no longer race on the native handle (HiGHS returned 0.0 from the second concurrent solve). * narrow Optional ndarrays in persistent.diff.push_var / push_con and in HiGHS/Gurobi/Xpress/Mosek apply_update objective paths. * widen Constraint.rhs setter to ExpressionLike | VariableLike | ConstantLike to match the as_expression call in the body. * widen Constraints.__getitem__(str) return type to Constraint (the dominant case) so tests can set .rhs/.coeffs/.sign without ignores. * add docs for in-place solver updates.
|
@FabianHofmann Sorry, I wont be able to review this today. |
take your time, there is no hurry. I'll do some integration tests anyway |
|
Are there any conflicts with #717? Are we sure we want to merge and publish this before we go v1? Just making sure... |
|
Benched PR #718 ModelDiff vs hand-rolled
Diff compute + Manually calling Suggestion: expose an opt-in on the persistent API so users with presolve-heavy LPs can drop the warm basis without bypassing the encapsulation. Either:
Zero structural rebuilds observed across all 7 re-solves in both ModelDiff routes ( Caveats:
|
|
Production-scale follow-up: PSA-MEE Solve-phase comparison (B vs SQ vs B+clear) at production scale couldn't be captured this round — the reduced-pipeline harness wasn't LP-feasible past chunk 1, so HiGHS timings came from an infeasibility path and would mislead. Will retry once the harness uses the full PSA-MEE chunk loop with storage + N-1 enabled. The presolve-skip mechanism that drove the synthetic regression (HiGHS logging Bottom line for the PR scope: the |
|
@MaykThewessen @FabianHofmann We might be able to add such later. But I'd definitely defer it from this first PR. We should put a note into the docs about users maybe profiting from calling such methods on the solver directly. One downside: We propose suage of |
|
I read through the code and overall api. I would propose the following api guide lines: 1. Mutation API:
|
| call | reads model | mutates handle | runs solver |
|---|---|---|---|
solver.diff(model) |
✓ | ||
solver.apply(model) |
✓ | ✓ | |
solver.solve(model=...) |
✓ | ✓ | ✓ |
update(model, apply=True|False)does two structurally different things behind one bool; defaultapply=Truesilently mutates.- Name collides with
gurobipy.Model.update()(lazy-flush attribute changes) — different semantic. - Split makes the basis-clear flow legible (
applyseparately fromsolve).
Good news: No conflicts with the new v1 semantics convention
|
@FBumann Agreed: drop Native handle via
A uniform My benchmark comment's "B+clear" route was exploratory, not a request for the wrapper. Happy to leave basis-management outside this PR and revisit only if a clear cross-solver use case emerges. Suggested docs note for
LGTM to defer the unified API: additive later, non-breaking. |
|
@FabianHofmann first round of my Review: ArchitectureA1. Through any supported mutation path — the existing setters at Two coherent positions:
Recommendation: B is the safer default. Cross-model paths already use False; in the same-model path the perf cost is small. Solver-aware callers who promise to use A2. Read-only A3. A4. Per-backend A5. Lock is too coarse. A6. Snapshot is the wrong unit of caching. DRYD1. D2. D3. Xpress D4. "Set binary integrality and then re-clamp bounds to [0,1]" is hand-rolled in HiGHS (1433-1442), Xpress (2485-2492), MOSEK (3221-3225). Lift to a Solver utility — the rule is solver-independent (it's linopy semantics, not a backend quirk). D5. |
Test coverage gaps —
|
| Reason | Covered? |
|---|---|
NONE |
✅ |
STRUCTURAL_CONTAINERS |
in {LABELS, CONTAINERS} sets |
STRUCTURAL_LABELS |
diff.py:557,606) unreachable from public API |
COORD_REINDEX |
✅ |
SPARSITY |
✅ |
QUAD_OBJ (diff.py:702) |
❌ no tests |
BACKEND_REJECTED |
✅ |
Gaps worth fixing:
QUAD_OBJhas zero tests — both directions (linear→quad, quad→linear) are reachable. Without coverage, QP users risk silent miscompare.- Loose
in {…}assertions intest_sparsity_change_triggers_rebuild,test_cross_model_sparsity_change_rebuilds,test_cross_model_structural_mismatch_rebuilds— tighten toisso future refactors of check ordering get caught. disallow_rebuild=Trueonly tested for one reason (STRUCTURAL_CONTAINERS). The contract is "every rebuild trigger raises" — add tests for SPARSITY, COORD_REINDEX, QUAD_OBJ, BACKEND_REJECTED.from_modelsnegative paths uncovered — only the happy path is tested. Mirror thefrom_snapshotcases (structural/sparsity/quad/coord) since the buffer-extraction path is independent.- Dead code or missing test at
diff.py:557and:606— the per-containerSTRUCTURAL_LABELSreturns can't fire after the global label check at:372-377short-circuits. Either delete (replace withassert) or write the test that justifies them.
|
@FBumann Strong review. Adding bench-grounded color on A1 / A5 / A6 since my numbers are the ones being cited. A1:
|
|
Correction to my A1 framing: I overshot. Microbench of the actual compare path (
So at the realistic production size (6 nnz/row), the extra walk is ~35 ms, not the 50-200 ms my last comment implied as the floor. That's ~22 % of a 160 ms apply phase, or ~0.3 % of a 12 s solve. Felix's original "small relative to solve time" framing was right; my "30-125 % of apply" was the worst-case interpretation, not the typical one. Doesn't change the recommendation (still Option B: default Bench script available on request, or I can drop it in |
Closes the A1 residual from the #718 review. The flag-trust path (`skip_coef_compare = same_model and not coef_dirty`) is correct through Constraint.update() (set in one place, shims forward), but `c.coeffs.values[...] = ...` still bypasses _coef_dirty. With same_model=True as the default, that bypass silently produces wrong diffs. Flip the default to False. Cross-model paths (the only production caller, Solver._update_locked, passes explicitly) are unaffected. Same-model warm-update paths now value-diff the CSR data — small perf hit (50-200ms at Mayk-scale per Mayk's bench), correct by default. Solver-aware callers who own the mutation contract can opt back into the optimization with `same_model=True`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Have not reviewed the code yet, just read the comments during some spare time, but my position is always to try to have perf by default:
So, i'd argue for the footgun A, but put the security mechanism on by flagging the numpy arrays as read only: a = np.array([1, 2, 3])
a.flags.writeable = False
a[0] = 99 # ValueError: assignment destination is read-only |
|
@coroa I wrote some comments about the user api here: #727 In short: We could skip the user parameter entirely by storing a Together with your suggestion about the writable flag this would be very safe, convenient and uses the performant branch whenever possible. |
|
@coroa I asked Claude code** about potential gaps of the writeable flag:
Feels 80% safe, but might produce rough edges in our internals and many places where we have to reassign this flag. |
|
@FBumann great review! I am still wondering what the benefit of an explicit |
|
The three proposals (coroa's writable-flag, @FBumann's weakref auto-decide, and the Unified shape: ModelSnapshot.capture(model):
snap.model_ref = weakref.ref(model)
snap.copy buffers (the existing snapshot.py:67-78 path)
freeze live model arrays: var.lower.values.flags.writeable = False, etc.
ModelDiff.from_snapshot(snap, model=None):
# weakref derives same_model automatically; no user-facing param
same_model = (model is None) or (snap.model_ref() is model)
Constraint.update(rhs=..., sign=..., coef=...):
with snap.thawed(): # context manager, re-entrant
... apply changes ...
flip _coef_dirty
# __exit__ re-freezes
Solver._update_locked:
with snap.thawed():
apply_update(diff)
re-capture or advance incrementallyEach proposal lands on a different layer:
That answers @FabianHofmann's "what does On the gap-list@FBumann the six gaps you listed are real, but the weights I'd put on them:
Two gaps not in the list:
Recommended landing order
Step 3 behind a flag lets you ship 1 + 2 immediately and gather real-world data on which internal paths break under freeze before forcing everyone through them. Mayk-scale benches stay green throughout (the freeze is free at runtime; it's the audit that costs). I can do the internal-path audit + thaw-context wrap as a follow-up PR if useful — mechanical work that benefits from running real PSA-MEE models against each candidate site. |
@FabianHofmann my main reasons are:
|
|
@FabianHofmann Updated my comment a bit |
|
@FBumann sounds good. I'll have a look at the pr this afternoon (tomorrow I will have more time to work on this in general) |
|
I don't understand the intricacies of model snapshots yet, so can't comment on the weakref proposal. mark's comment is close to what i had in mind with a context manager for making internal modifications. i'd hope the user does not have to specify whether she is passing the same model. i like the .update proposal, especially for cases where multiple things are updated. I'd keep the getters and allow the user to make in place updates but only when using the context manager which could set a dirty mark. |
Changes proposed in this Pull Request
Adds an opt-in persistent-update framework so a built solver can be re-solved against a mutated
Modelwithout a full rebuild.Core
linopy.persistent:ModelSnapshot,ModelDiff,StructuralKey,VarKind,RebuildReason.ModelDiffstores changes in flat-native arrays (bounds / var-types / RHS / signs / COO coefs / linear objective / sense) plus per-containerVarSlice/ConSliceviews.ModelDiff.from_snapshot(snap, model)andModelDiff.from_models(m1, m2)— snapshot-based and snapshot-free diffs._coef_dirtyflag on constraints with RHS-setter short-circuit so RHS-only edits skip the coefficient re-walk.Solver orchestration
Solvergainstrack_updates, lazy-build (firstsolve(model, …)builds),apply_update,update,disallow_rebuild, and structured rebuild reasons. Backends without persistent-update support short-circuit to rebuild.apply_update:changeColsBounds/changeColsIntegrality/changeRowBounds/changeCoeff/changeColsCost/changeObjectiveSense. Sign change → rebuild.setAttr(LB/UB/VType/RHS/Sense/Obj),chgCoeff,ModelSense. In-place sign change.chgbounds/chgcoltype/chgrhs/chgrowtype/chgmcoef/chgobj/chgobjsense. In-place sign change.chgvarbound/chgconbound/putvartypelist/putaijlist/putclist/putobjsense. Sign change → rebuild.Tests
test_persistent_snapshot_diff.pycovering allModelDiffsemantics.test_persistent_apply_update.pyrunning 9 cases × 4 backends (skipped per backend when license/installation is unavailable).test_persistent_solver_extras.py.Checklist
doc.doc/release_notes.rstof the upcoming release is included.