Scale RandomStrategy NChooseK sampling and add allow_zero support#757
Merged
Scale RandomStrategy NChooseK sampling and add allow_zero support#757
Conversation
Replace the up-front enumeration of valid NChooseK combinations in RandomStrategy with on-demand uniform sampling: a new Domain.sample_valid_nchoosek_features helper draws one set of active feature keys per call by picking subset size k weighted by C(n, k) within each constraint group, plus singleton groups for ContinuousInput features with allow_zero=True. This makes random sampling viable for domains where the combinatorial space is too large to enumerate (e.g. n=30, max_count=6 ~= 594k combos), and adds first-class allow_zero support outside of NChooseK. Overlapping NChooseKConstraints fall back to per-combination rejection sampling. Also adds a validator on NChooseKConstraint rejecting features with a negative lower bound, and ports the get_free()/get_fixed() handling for fixed categorical/discrete features in _sample_from_polytope. Co-Authored-By: Claude Opus 4.7 <[email protected]>
jduerholt
commented
Apr 28, 2026
Contributor
Author
jduerholt
left a comment
There was a problem hiding this comment.
First review for claude.
- sample_valid_nchoosek_features now takes seed: Optional[int] instead of a random.Random object, matching the convention used by Inputs.sample. The strategy passes self._get_seed(). - Drop the redundant feat.allow_zero = False reset before pinning a feature to bounds=[0, 0]; the ContinuousInput validator already exempts that case. - Add explanatory comments around the NChooseK sampling branch in RandomStrategy._sample_with_nchooseks. - Move the sample_valid_nchoosek_features tests from test_domain_nchoosek_combinatorics.py into test_domain.py and drop the redundant default-n test and the strategy scalability test. Co-Authored-By: Claude Opus 4.7 <[email protected]>
Co-Authored-By: Claude Opus 4.7 <[email protected]>
Contributor
Author
|
@TobyBoyne: can you review this PR? |
Collaborator
Sure thing, I'll review it now :) |
TobyBoyne
approved these changes
Apr 28, 2026
Collaborator
TobyBoyne
left a comment
There was a problem hiding this comment.
All looks good! Only very minor comments, I think this looks good as-is!
- Tighten NChooseKConstraint validator: feature must have bounds[0]==0 or allow_zero=True; add invalid Domain spec. - Move sample_valid_nchoosek_features from Domain to RandomStrategy as @staticmethod; remove legacy get_nchoosek_combinations and its tests. - Add separate nchoosek_max_iters to RandomStrategy and pass it through. - Drop redundant list() around con.features. Co-Authored-By: Claude Opus 4.7 <[email protected]>
Co-Authored-By: Claude Opus 4.7 <[email protected]>
- Extract _get_zeroable_keys static helper used by both _sample_with_nchooseks and sample_valid_nchoosek_features. - Hoist set(con.features) out of the rejection-sampling inner loop by precomputing con_feature_sets. - Replace manual combinations dict with collections.Counter. Co-Authored-By: Claude Opus 4.7 <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Motivation
Move ideas/implementations regarding random sampling from #693 into a seperate PR. This should massively speed up everything where random sampling and NChooseK constraints are involved.
Have you read the Contributing Guidelines on pull requests?
Yes.
Have you updated
CHANGELOG.md?Not yet.
Test Plan
Unit tests.