Scale RandomStrategy NChooseK sampling and add allow_zero support #757

Merged
jduerholt merged 11 commits into main from refactor/random_nchoosek
Apr 29, 2026

Conversation

@jduerholt
Contributor

Motivation

Move the ideas and implementations regarding random sampling from #693 into a separate PR. This should substantially speed up every code path that combines random sampling with NChooseK constraints.

Have you read the Contributing Guidelines on pull requests?

Yes.

Have you updated CHANGELOG.md?

Not yet.

Test Plan

Unit tests.

Replace the up-front enumeration of valid NChooseK combinations in
RandomStrategy with on-demand uniform sampling: a new
Domain.sample_valid_nchoosek_features helper draws one set of active
feature keys per call by picking subset size k weighted by C(n, k)
within each constraint group, plus singleton groups for ContinuousInput
features with allow_zero=True. This makes random sampling viable for
domains where the combinatorial space is too large to enumerate (e.g.
n=30, max_count=6 ~= 594k combos), and adds first-class allow_zero
support outside of NChooseK. Overlapping NChooseKConstraints fall back
to per-combination rejection sampling.
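
The size-weighted draw described above can be sketched in plain Python. This is a minimal illustration for a single constraint group, not the code from this PR: the function name `sample_active_features` and its signature are hypothetical, and only the standard library is used.

```python
import math
import random
from typing import List, Optional, Sequence


def sample_active_features(
    features: Sequence[str],
    min_count: int,
    max_count: int,
    seed: Optional[int] = None,
) -> List[str]:
    """Draw one subset uniformly from all subsets with min_count..max_count members.

    Hypothetical helper for illustration; it handles one constraint group only.
    """
    rng = random.Random(seed)
    n = len(features)
    sizes = list(range(min_count, max_count + 1))
    # Weighting size k by C(n, k) makes every individual subset equally likely,
    # because there are exactly C(n, k) subsets of size k.
    weights = [math.comb(n, k) for k in sizes]
    k = rng.choices(sizes, weights=weights, k=1)[0]
    # Given k, every k-subset is drawn with equal probability.
    return sorted(rng.sample(list(features), k))


features = [f"x{i}" for i in range(30)]
active = sample_active_features(features, min_count=0, max_count=6, seed=42)

# Size of the space that up-front enumeration would have to materialize.
n_size_6 = math.comb(30, 6)                          # 593775 subsets of size exactly 6
n_total = sum(math.comb(30, k) for k in range(7))    # 768212 subsets of size 0..6
```

No list of combinations is ever built: the cost per draw depends on `max_count - min_count` and `k`, not on the number of valid subsets.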

Also adds a validator on NChooseKConstraint rejecting features with a
negative lower bound, and ports the get_free()/get_fixed() handling for
fixed categorical/discrete features in _sample_from_polytope.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
Contributor Author

@jduerholt jduerholt left a comment


First review for Claude.

Comment thread bofire/data_models/domain/domain.py Outdated
Comment thread bofire/strategies/random.py
Comment thread bofire/strategies/random.py Outdated
Comment thread bofire/strategies/random.py Outdated
Comment thread tests/bofire/data_models/domain/test_domain_nchoosek_combinatorics.py Outdated
Comment thread tests/bofire/data_models/domain/test_domain_nchoosek_combinatorics.py Outdated
Comment thread tests/bofire/strategies/test_random.py Outdated
jduerholt and others added 2 commits April 28, 2026 15:58
- sample_valid_nchoosek_features now takes seed: Optional[int] instead
  of a random.Random object, matching the convention used by
  Inputs.sample. The strategy passes self._get_seed().
- Drop the redundant feat.allow_zero = False reset before pinning a
  feature to bounds=[0, 0]; the ContinuousInput validator already
  exempts that case.
- Add explanatory comments around the NChooseK sampling branch in
  RandomStrategy._sample_with_nchooseks.
- Move the sample_valid_nchoosek_features tests from
  test_domain_nchoosek_combinatorics.py into test_domain.py and drop
  the redundant default-n test and the strategy scalability test.
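
The seed convention from the first bullet can be illustrated with a small stand-alone sketch. `ToyStrategy` and `sample_keys` are hypothetical stand-ins, not BoFire classes; only the idea of passing an integer seed (rather than a shared `random.Random` object) is taken from the commit message.

```python
import random
from typing import List, Optional


def sample_keys(keys: List[str], k: int, seed: Optional[int] = None) -> List[str]:
    # The helper owns its generator; callers only hand over an integer seed.
    rng = random.Random(seed)
    return rng.sample(keys, k)


class ToyStrategy:
    """Hypothetical stand-in for a strategy that derives per-call seeds."""

    def __init__(self, seed: int) -> None:
        self._rng = random.Random(seed)

    def _get_seed(self) -> int:
        # Each call yields a fresh but reproducible integer seed.
        return self._rng.randrange(2**32)

    def ask(self, keys: List[str], k: int) -> List[str]:
        return sample_keys(keys, k, seed=self._get_seed())


keys = ["a", "b", "c", "d", "e"]
first = ToyStrategy(seed=7).ask(keys, 2)
second = ToyStrategy(seed=7).ask(keys, 2)  # same strategy seed, same draw
```

Passing an integer keeps the helper free of shared mutable state, so it behaves the same whether it is called from the strategy or directly from a test.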

Co-Authored-By: Claude Opus 4.7 <[email protected]>
@jduerholt jduerholt requested a review from TobyBoyne April 28, 2026 14:10
@jduerholt
Contributor Author

@TobyBoyne: can you review this PR?

@TobyBoyne
Collaborator

@TobyBoyne: can you review this PR?

Sure thing, I'll review it now :)

Collaborator

@TobyBoyne TobyBoyne left a comment


All looks good! Only very minor comments; I think this is fine as-is!

Comment thread bofire/data_models/constraints/nchoosek.py Outdated
Comment thread bofire/data_models/domain/domain.py Outdated
Comment thread bofire/data_models/domain/domain.py Outdated
Comment thread bofire/data_models/domain/domain.py Outdated
Comment thread bofire/data_models/domain/domain.py Outdated
Comment thread bofire/strategies/random.py Outdated
Comment thread tests/bofire/strategies/test_random.py
jduerholt and others added 8 commits April 29, 2026 09:05
- Tighten NChooseKConstraint validator: feature must have bounds[0]==0
  or allow_zero=True; add invalid Domain spec.
- Move sample_valid_nchoosek_features from Domain to RandomStrategy as
  @staticmethod; remove legacy get_nchoosek_combinations and its tests.
- Add separate nchoosek_max_iters to RandomStrategy and pass it through.
- Drop redundant list() around con.features.
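
The tightened rule from the first bullet (a feature in an NChooseK constraint must be able to take the value zero) might look roughly like the following. This is a sketch with a hypothetical `ToyInput` dataclass and a hypothetical `validate_nchoosek_features` function; it does not use BoFire's data models and makes no claim about where the real check lives.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ToyInput:
    """Hypothetical stand-in for a continuous input feature."""

    key: str
    bounds: Tuple[float, float]
    allow_zero: bool = False


def validate_nchoosek_features(inputs: Dict[str, ToyInput], features: List[str]) -> None:
    """Raise if any constrained feature can never be switched off (set to zero)."""
    for key in features:
        feat = inputs[key]
        # Zero is reachable if the lower bound is exactly 0 or allow_zero is set.
        if feat.bounds[0] != 0 and not feat.allow_zero:
            raise ValueError(
                f"Feature {key!r} must have a lower bound of 0 or allow_zero=True."
            )


inputs = {
    "x1": ToyInput("x1", (0.0, 1.0)),
    "x2": ToyInput("x2", (0.1, 1.0), allow_zero=True),
    "x3": ToyInput("x3", (0.1, 1.0)),
}

validate_nchoosek_features(inputs, ["x1", "x2"])  # both can be zero, no error

try:
    validate_nchoosek_features(inputs, ["x1", "x3"])
    rejected = False
except ValueError:
    rejected = True  # x3 has lower bound 0.1 and allow_zero=False
```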

Co-Authored-By: Claude Opus 4.7 <[email protected]>
- Extract _get_zeroable_keys static helper used by both
  _sample_with_nchooseks and sample_valid_nchoosek_features.
- Hoist set(con.features) out of the rejection-sampling inner loop
  by precomputing con_feature_sets.
- Replace manual combinations dict with collections.Counter.
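
The last two bullets can be illustrated together. The sketch below assumes a simple proposal scheme (each key is switched on independently with probability 0.5) and a hypothetical `sample_by_rejection` function; the actual proposal and loop structure in `RandomStrategy` may differ. Only the precomputed `con_feature_sets` and the use of `collections.Counter` mirror the commit message.

```python
import random
from collections import Counter
from typing import FrozenSet, List, Optional, Sequence, Tuple

# Hypothetical constraint encoding: (features, min_count, max_count).
Constraint = Tuple[Sequence[str], int, int]


def sample_by_rejection(
    keys: Sequence[str],
    constraints: List[Constraint],
    max_iters: int = 1000,
    seed: Optional[int] = None,
) -> FrozenSet[str]:
    """Propose active sets until one satisfies every (possibly overlapping) constraint."""
    rng = random.Random(seed)
    # Build the feature sets once, outside the retry loop.
    con_feature_sets = [set(feats) for feats, _, _ in constraints]
    for _ in range(max_iters):
        # Assumed proposal: a uniformly random subset of all keys.
        active = {key for key in keys if rng.random() < 0.5}
        if all(
            lo <= len(active & feature_set) <= hi
            for feature_set, (_, lo, hi) in zip(con_feature_sets, constraints)
        ):
            return frozenset(active)
    raise RuntimeError(f"No valid combination found in {max_iters} iterations.")


keys = ["a", "b", "c", "d"]
constraints: List[Constraint] = [(["a", "b", "c"], 1, 2), (["b", "c", "d"], 0, 1)]

# Counter replaces a hand-rolled "if key in dict: += 1 else: = 1" tally.
counts = Counter(sample_by_rejection(keys, constraints, seed=s) for s in range(200))
```

Because the proposal is uniform over all subsets and invalid ones are simply discarded, accepted samples are uniform over the valid combinations; the price is wasted iterations when valid combinations are rare, which is what an iteration cap such as `nchoosek_max_iters` guards against.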

Co-Authored-By: Claude Opus 4.7 <[email protected]>
@jduerholt jduerholt merged commit 18bf324 into main Apr 29, 2026
12 checks passed
@jduerholt jduerholt deleted the refactor/random_nchoosek branch April 29, 2026 14:22