Skip to content

Add BONSAI-style greedy pruning for NChooseK constraints#747

Open
jduerholt wants to merge 25 commits into
mainfrom
feature/bonsai
Open

Add BONSAI-style greedy pruning for NChooseK constraints#747
jduerholt wants to merge 25 commits into
mainfrom
feature/bonsai

Conversation

@jduerholt
Copy link
Copy Markdown
Contributor

Motivation

BONSAI for handling of NChooseK constraint (https://arxiv.org/abs/2602.07144).

@dlinzner-bcs: one could do the same in the DOE module.

Have you read the Contributing Guidelines on pull requests?

Yes.

Have you updated CHANGELOG.md?

Not yet.

Test Plan

Unit tests.

jduerholt and others added 2 commits March 26, 2026 17:13
Implement greedy post-processing pruning based on the BONSAI algorithm
(https://arxiv.org/abs/2602.07144) to handle NChooseK constraints more
effectively. Instead of encoding NChooseK as nonlinear constraints during
acquisition optimization, candidates are optimized without them and then
pruned in post-processing by greedily zeroing the feature with smallest
acquisition function impact until the constraint is satisfied.

- Add Domain.is_nchoosek_pruning_applicable() to check if pruning is safe
  (no NChooseK feature appears in any other constraint)
- Add exclude_nchoosek param to get_nonlinear_constraints()
- Skip NChooseK in optimize_acqf when pruning is applicable
- Override _postprocess_candidates in BotorchStrategy with Ax-style
  greedy pruning that conditions on already-pruned candidates for q>1

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Relax is_nchoosek_pruning_applicable() to allow NChooseK features that
also participate in linear equality/inequality constraints (only block
for nonlinear/product/interpoint overlap). When linear overlap exists,
pruning variants use QP projection (LinearProjection) to find a feasible
warm-start with x_k=0, then local optimize_acqf to find the best
achievable AF value — giving accurate greedy decisions. The simple
zeroing path is preserved when no linear overlap exists.

- Relax is_nchoosek_pruning_applicable: linear overlap now allowed
- Add Domain.has_nchoosek_linear_overlap() to select pruning path
- Add QP + local optimize_acqf variant construction in botorch.py
- Add _build_pruning_domain to create modified domain for QP projection
- Add end-to-end test with NChooseK + mixture constraint (sum=1)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jduerholt jduerholt marked this pull request as draft March 26, 2026 21:44
@jduerholt
Copy link
Copy Markdown
Contributor Author

@Jimbo994, if you are interested ;)

jduerholt added 13 commits May 5, 2026 09:43
Phase 1 of the BONSAI pruning refactor: a self-contained, tensor-native
implementation of greedy NChooseK pruning that can later be wired into
BotorchOptimizer. Strategy-side code in BotorchStrategy is untouched and
will be migrated in Phase 2.

The module extends today's pruning beyond what BotorchStrategy currently
does:

- active variant for semi-continuous features (allow_zero=True with
  lb > 0): each fractional value is resolved either to zero or snapped
  into [lb, ub] via QP projection + optional local optimize_acqf
- multiple NChooseK constraints with per-constraint active counts and
  conjunction-style termination
- per-constraint min_count guard with none_also_valid exception, raising
  PruningInfeasibleError when the action set empties before all
  max_count constraints are satisfied
- caller-supplied fixed_features flow through to both QP projection and
  local re-opt, and are excluded from the action set so the loop never
  proposes to move them
- two hyperparameters: per_step_local_reopt (refine each variant via
  optimize_acqf), final_local_reopt (single end-of-loop clean-up solve)
- pinned-zero-indices machinery prevents the linear-redistribution from
  resurrecting previously-zeroed features across iterations

The cvxpy LinearProjection round-trip is replaced by BoTorch's native
project_to_feasible_space_via_slsqp, eliminating the numpy boundary in
the pruning loop and dropping the bespoke pruning-domain construction.

The full conceptual write-up lives in docs/pruning.md, with a worked
4-feature example and a comparison to the published BONSAI algorithm.

Test coverage in tests/bofire/strategies/test_nchoosek_pruning.py (45
unit tests): pure helpers (classification, fulfilment, counts, guard,
eligibility), variant construction (zero/active, no-overlap, mixture eq,
inequality-only, infeasibility), end-to-end pruning (single/multiple
NChooseK, semi-continuous, min_count guard, hyperparameter smoke tests,
fixed-features regression, q=0, tie-break, QP-failure fallback).
Structural cleanup of the pruning module — no behavior change, all
118 + 6 new tests pass:

- Add Action, ActionKind, PruningContext, PruningState dataclasses;
  replace anonymous tuples and stringly-typed action kinds.
- Unify _build_zero_variant / _build_active_variant /
  _build_activate_variant behind a single _build_variant(kind, ...)
  dispatcher; keep the three legacy wrappers for backward compat.
- Extract _collect_actions(state, ctx, fixed_keys) -> List[Action]
  to fold the three duplicated action-collection for-loops.
- Extract _prune_single_candidate(x_i, X_prefix, ctx, fixed_keys)
  out of prune_nchoosek; the public function is now a 30-line
  outer loop and the inner per-candidate logic is a self-contained
  primitive that beam-search and B&B can build on directly.
- Replace the manual argmin loop with min(..., key=...).
- Drop the # noqa: C901 escape on prune_nchoosek.
- Add 5 dataclass smoke tests + 1 q>1 prefix-conditioning regression
  test (the existing test_q2 only checked that both candidates ran,
  not that the second conditioned on the first's pruned form).
Pairs missing tests with the categorical/discrete pinning fix already
landed in 8906fa0. The new TestPinnedColumns class covers the
per-row resolution semantics (q=1 and q=2 with row-varying values),
both reopt-path coverage, and the categorical one-hot integration
test. Also migrates three legacy tests from fixed_features= to the
new pinned_columns= API.

Adds two TODO comments in _build_variant referencing facebook/Ax#5180:
- Skip SLSQP projection when the action dim is not in any linear
  constraint (perf win).
- Post-projection feasibility safety net (defensive).

Both are documented but deferred -- small wins on a non-bottleneck
path, revisit if profiling shows _build_variant is hot.
is_pruning_applicable only blocks pruning when NChooseK features
*overlap* with an InterpointConstraint, NonlinearConstraint, or
ProductConstraint. If those constraints exist on non-NChooseK
features, the gate passes and pruning runs -- but the QP projection
inside prune_nchoosek only sees Linear{Equality,Inequality}
constraints (filtered by get_linear_constraints), so features
participating in the other constraint types may drift during
projection and silently break those constraints.

Fix: in BotorchOptimizer._prune_if_applicable, extend pinned_columns
to include every feature participating in those three unhandled
constraint types. The candidate carries values satisfying these
constraints at row entry (upstream optimizer respected them); pinning
preserves them by inertia.

Same defensive principle as the categorical/discrete fix in 8906fa0 --
anything pruning's QP cannot reason about, freeze at the per-row
value.

Adds a regression test asserting an InterpointEqualityConstraint
feature survives pruning unchanged.
BotorchOptimizer.validate_domain already rejected LSR-BO combined
with non-LinearConstraint via the existing
'LSR-BO only supported for linear constraints' check; since
NChooseKConstraint is not a LinearConstraint subclass, that branch
already caught NChooseK + LSR-BO at strategy-construction time.
The only remaining case the validator missed was semi-continuous
features (`allow_zero=True` with `bounds[0] > 0`) -- an input
property, not a constraint.

Extend validate_local_search_config to also reject semi-continuous
features. With both cases now caught at data-model validation, the
runtime NotImplementedError in BotorchOptimizer._optimize becomes
redundant and is removed.

User benefit: fail at strategy construction with a ValueError, not
at runtime in `ask()` with a NotImplementedError. Earlier signal,
discoverable in the data model.

Adds three validator tests: NChooseK + LSR-BO rejection (regression
on the existing branch), semi-continuous + LSR-BO rejection (new
branch), and plain-continuous + LSR-BO passes (the supported case).
Copy link
Copy Markdown
Contributor Author

@jduerholt jduerholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing big, small improvements.

Comment thread bofire/strategies/predictives/botorch.py Outdated
Comment thread bofire/strategies/predictives/botorch.py Outdated
Comment thread bofire/benchmarks/benchmark.py Outdated
For more details, it is referred to https://www.merl.com/publications/docs/TR2023-057.pdf. Defaults to None.
relax_allow_zero (bool, optional): If True, semi-continuous continuous inputs
(`allow_zero=True` with positive lower bound) report a relaxed lower bound of 0.
Other input types ignore this flag. Defaults to False.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a good idea to propagate it everywhere? Could we not have this via kwargs or so to not have to introduce this relax_allow_zero flag everywhere?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could consider (in another PR) having some kind of context manager that can handle this kind of thing? Similar to gpytorch settings. So I would imagine it could be used as

with bofire.optimization_settings.relax_allow_zero(True):
    strategy.ask(1)

This would avoid passing it to all these functions. The functions can then access the value of bofire.optimization_settings.relax_allow_zero to determine whether to relax.

Comment thread bofire/data_models/features/continuous.py
Comment thread bofire/strategies/random.py Outdated
Comment thread bofire/utils/torch_tools.py
Comment thread bofire/utils/torch_tools.py
Comment thread tests/bofire/data_models/domain/test_domain.py
Comment thread tests/bofire/strategies/test_pruning_gate.py Outdated
jduerholt added 5 commits May 12, 2026 16:26
Polish, structural cleanups, and small API improvements from the
review pass. Touches the pruning module, the BotorchOptimizer, the
RandomStrategy, the feature data models, and the corresponding tests.

Module changes:
- ContinuousInput.is_semicontinuous property; used everywhere the
  `allow_zero and bounds[0] > 0` check was open-coded.
- Feature.get_bounds: `relax_allow_zero` moves into `**kwargs`; only
  ContinuousInput declares it explicitly. Other subclasses drop the
  meaningless argument.
- get_torch_bounds_from_domain: new `reference_experiment` kwarg for
  local bounds (LSR-BO); BotorchOptimizer._optimize uses it instead
  of inlined get_bounds.
- Inputs.get_number_of_categorical_combinations: new
  `include_semicontinuous` flag; BotorchOptimizer._determine_optimizer
  passes False when pruning is applicable (replaces the inlined
  divide-out logic).
- get_nonlinear_constraints: `exclude_nchoosek` flag dropped --
  callers pass `includes=[ProductInequalityConstraint, ...]` to
  exclude NChooseK instead.
- _nchoosek_pruning:
  - Dataclasses (ActionKind, Action, PruningContext, PruningState)
    docstrings rewritten with explicit attribute sections.
  - _violated_constraints renamed to _max_count_violated_constraints
    for symmetry with _min_count_violated_constraints.
  - Legacy variant-builder wrappers (_build_zero_variant,
    _build_active_variant, _build_activate_variant) deleted; tests
    migrated to _build_variant(kind=...).
  - Removed defensive multi-column NChooseK guard (already enforced
    by BoFire validators) and the q==0 short-circuit (loop is
    already a no-op).
  - TODO comment in _local_optacqf referencing batch-optimization
    perf win (Ax PR #5180-style bundling of per-variant optimize_acqf
    calls).
- BotorchOptimizer:
  - _prune_if_applicable renamed to _prune.
  - Pinning policy widened: pin every column not in
    `{continuous, un-fixed, NChooseK / semi-continuous / linear-
    constraint-touching} \ {Interpoint / Nonlinear / Product
    constraint members}`.
- RandomStrategy: removed _subset_lp_feasible LP rejection. The
  polytope sampler's interior-point solver is the authoritative
  feasibility check via the existing InfeasibilityError try/except.
- benchmarks/benchmark.py: docstring updated for global NChooseK
  application.

Test changes:
- Added test_continuous_input_is_semicontinuous and
  test_continuous_input_feature_get_bounds_relax_allow_zero in
  test_continuous.py.
- Merged tests/bofire/strategies/test_pruning_gate.py (367 lines, 3
  classes) into test_nchoosek_pruning.py; old file deleted.
- Removed stale comment in test_domain.py pointing at the now-merged
  test_pruning_gate.py.
- Migrated 11 _build_zero_variant / _build_active_variant /
  _build_activate_variant call sites to _build_variant(kind=...).

477 targeted tests pass. Full-tree sweep matches baseline (4 pre-
existing failures unchanged; +618 collected tests).
The semi-continuous enumeration in get_number_of_categorical_combinations
and get_categorical_combinations was checking `allow_zero and not
is_fixed()` -- this works in practice because the ContinuousInput
validator forbids `allow_zero=True with bounds[0]==0`, but the intent is
clearer when expressed via the is_semicontinuous property.

Same defensive principle as the pinning policy: a feature is
"conditional" (gets the 2x active/inactive factor in the combination
count) only when its feasible region is the disconnected union
`{0} ∪ [lb, ub]` -- i.e., when `is_semicontinuous` is True. The
explicit `bounds[0] > 0` check inside the property guards against any
future validator relaxation that would allow allow_zero with zero
lower bound.
@jduerholt jduerholt marked this pull request as ready for review May 13, 2026 06:59
@jduerholt
Copy link
Copy Markdown
Contributor Author

Hi @bertiqwerty, @LukasHebing @TobyBoyne,

this is the new way of handling NChooseKs that I was cooking for the last month. First benchmarks show very superior performance in comparison to the original implementation via nonlinear constraints. Furthermore, it is much faster.

It would be nice if you all can have a look at this and review it. It would be nice if we can land it soon as it is from my perspective a big enhancements, as it now also fully supports semicontinuous features in optimization etc.

There is also much room for further improvement as outlined in the markdown file, I especially like the beam search but we can do it later ;)

Are you also interested in the benchmark results?

Best,

Johannes

Copy link
Copy Markdown
Collaborator

@TobyBoyne TobyBoyne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, and great that we're seeing fast and strong performance!

Most of my comments are about coding style, which you may want to ignore - if the code works, it works! But given there are plans to continue working on this and developing it, I think there is value in keeping the code clean. There is a lot of code duplication, and a few things are written in quite unintuitive ways.

The one thing I really don't understand is all of the X_pending stuff. I can't see why we need to do anything non-standard here. I would have thought BONSAI is independent to pending points in the acqf optimization. If you have any further explanation, or there is something in the BONSAI paper that explains it, I would appreciate it :)

For more details, it is referred to https://www.merl.com/publications/docs/TR2023-057.pdf. Defaults to None.
relax_allow_zero (bool, optional): If True, semi-continuous continuous inputs
(`allow_zero=True` with positive lower bound) report a relaxed lower bound of 0.
Other input types ignore this flag. Defaults to False.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could consider (in another PR) having some kind of context manager that can handle this kind of thing? Similar to gpytorch settings. So I would imagine it could be used as

with bofire.optimization_settings.relax_allow_zero(True):
    strategy.ask(1)

This would avoid passing it to all these functions. The functions can then access the value of bofire.optimization_settings.relax_allow_zero to determine whether to relax.

Comment thread bofire/strategies/predictives/_nchoosek_pruning.py Outdated
Comment thread bofire/strategies/predictives/_nchoosek_pruning.py
Comment thread bofire/strategies/predictives/_nchoosek_pruning.py Outdated
Comment thread bofire/strategies/predictives/_nchoosek_pruning.py
Comment thread bofire/strategies/predictives/_nchoosek_pruning.py Outdated
Comment on lines +410 to +414
constraint_indices: Set[int] = set()
for feat_key in c.features:
constraint_indices.update(features2idx[feat_key])
if j_idx not in constraint_indices:
continue
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Nit) I would rewrite this whole block as below - to me, it's clearer what the code below is doing.

if j_idx not in map(features2idx.get, c.features):
   continue

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been also updated due to the new helpers to reduce code duplication.

Comment thread bofire/strategies/predictives/_nchoosek_pruning.py Outdated
Comment thread bofire/strategies/predictives/_nchoosek_pruning.py
Comment thread bofire/strategies/predictives/_nchoosek_pruning.py Outdated
@TobyBoyne
Copy link
Copy Markdown
Collaborator

Thanks for responding to the reviews @jduerholt, no more changes requested! And yes regarding the the acqf optimization, even if BoFire did support analytic acqfs they don't support q>1 anyway. I still think that function is unecessary, since the botorch acquisition function optimizers don't mind modifying the acqf objects in-place, but happy to leave it.

I would be very interested to see the benchmark results! What benchmarks are you using? I don't think I have seen many NChooseK constraint problems in the BO literature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants