Nonlinear constraints to BoFire (integration with acqf + tests + tutorials)#740
MrzvskK wants to merge 29 commits into experimental-design:main
Conversation
Add files via upload
…_Reaction_Optimization.qmd Co-authored-by: Johannes P. Dürholt <johannespeter.duerholt@evonik.com>
- Added filter_candidates_by_constraints() to filter feasible points
- Added get_constraint_violations() to calculate violation amounts
- Exported new functions in api.py
- Added comprehensive tests
…torial runs smoothly
Hi @MrzvskK, does this help?
@LukasHebing, this PR adds functionality to the BotorchOptimizer to handle nonlinear constraints.
Ok, great :)

I fixed most of the issues, except for the last failing test; it has something to do with conda.
@jduerholt Hi, I fixed all the bugs; this should be ready to merge.
@jduerholt Hello, this is a quick reminder; I wanted to check whether you approve the changes.
Ah sorry, I did not see that the tests are now passing. I will review over the weekend or at the beginning of next week ;)
@jduerholt Kind reminder: did you have time to take a look at the PR?
Sorry, too much travelling. I will do it over the weekend or on Monday at the latest.
jduerholt left a comment
Hi @MrzvskK,
thank you very much for the PR. I went over the actual implementation (ignoring the tests) for now and provided inline feedback. Regarding the botorch based suggestions, I also created an issue there: meta-pytorch/botorch#3280
Best,
Johannes
        return hessian_expression

    -    def __call__(self, experiments: pd.DataFrame) -> pd.Series:
    +    def __call__(
From my perspective, this is highly botorch-dependent code, so I propose to move it to utils/torch_tools.py, where the other constraints are also prepared to be botorch-ready. Then we also do not need the option here to pass torch tensors into the function etc., and it is much cleaner to read. You can have a look there at how the ProductConstraint is built up.
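To illustrate the pattern this comment points to, here is a hedged sketch of preparing a tensor-ready constraint callable once, instead of letting the constraint accept both DataFrames and tensors. The names `to_callable` and `feature2idx` are illustrative, not BoFire's actual API, and plain sequences stand in for tensors to keep the sketch self-contained.

```python
# Hypothetical sketch of a torch_tools-style preparation step: bind a
# constraint expression to column indices once, so the optimizer can call
# it on raw numeric candidates without a DataFrame round-trip.
# `to_callable` and `feature2idx` are illustrative names, not BoFire's API.
def to_callable(expression_fn, feature_names, feature2idx):
    idx = [feature2idx[name] for name in feature_names]

    def fn(x):  # x: one candidate as a flat sequence of floats
        return expression_fn(*(x[i] for i in idx))

    return fn


# constraint a * b - 0.5 over features ("a", "b") in a 3-feature domain
c = to_callable(lambda a, b: a * b - 0.5, ("a", "b"), {"a": 0, "b": 2, "c": 1})
print(c([1.0, 9.9, 0.5]))  # a=1.0, b=0.5 -> 0.0
```

The point of the design is that feature-name resolution happens once at build time, not on every optimizer call.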
    type: Literal["NonlinearEqualityConstraint"] = "NonlinearEqualityConstraint"
    def is_fulfilled(self, experiments: pd.DataFrame, tol: float = 1e-6) -> pd.Series:
Honestly, I would not support them directly out of the box in bofire, as they are not naturally supported in botorch. If somebody wants to do something like this, they should do the breakdown into individual nonlinear constraints by hand.
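For reference, the manual breakdown mentioned here can be sketched as splitting g(x) = 0 into the two inequalities g(x) >= -tol and -g(x) >= -tol. `split_equality` is a hypothetical helper, not part of BoFire, and the `callable(x) >= 0` feasibility convention is the one botorch uses for nonlinear inequality constraints.

```python
# Hypothetical helper illustrating the manual breakdown of a nonlinear
# equality constraint g(x) = 0 into two inequality constraints, expressed
# in botorch's `callable(x) >= 0` feasibility convention.
def split_equality(g, tol=1e-6):
    def lower(x):
        return g(x) + tol   # feasible when g(x) >= -tol

    def upper(x):
        return -g(x) + tol  # feasible when g(x) <= tol

    return lower, upper


def g(x):  # enforce x0 + x1 = 1
    return x[0] + x[1] - 1.0


lo, hi = split_equality(g)
print(lo([0.4, 0.6]) >= 0 and hi([0.4, 0.6]) >= 0)  # feasible point
print(hi([0.9, 0.9]) >= 0)                          # violates the equality
```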
    constraints.NonlinearEqualityConstraint,
    ]:
    -    return False
    +    return True  # was False
I would only support the NonlinearInequalityConstraint and let the rest be handled by the user.
    if self.local_search_config is not None:
        if has_local_search_region(domain) is False:
    -        warnings.warn(
    +        logger.info(
Why the change from warning to logger? The logging mechanism in BoFire is not well implemented, so I would like to keep it as a warning for now. We need an overhaul there ...
Sorry, I forgot to revert it.
    )
    # Use object.__setattr__ to bypass Pydantic's frozen model behavior
    object.__setattr__(self, "batch_limit", 1)
    if self.n_restarts != 1:
I do not think that we need n_restarts == 1; it should also work with more restarts, but it will run them sequentially. So this can go, I think.
    -    return OptimizerEnum.OPTIMIZE_ACQF_MIXED
    +    return OptimizerEnum.OPTIMIZE_ACQF_MIXED_ALTERNATING

    def _get_nonlinear_constraint_setup(
Hmm, I do not understand why this is necessary. Currently we use our random strategy as the initial candidates generator, and the random strategy can already handle nonlinear constraints via rejection sampling. Why not keep it as is?
The motivation for _get_nonlinear_constraint_setup is that once we pass nonlinear_inequality_constraints to BoTorch, it validates feasibility in tensor space with its own convention/tolerances, and ICs produced via BoFire rejection sampling can still be rejected there. If we decide not to pass nonlinear constraints into BoTorch and rely purely on BoFire’s rejection sampling, then I agree we can drop most of this setup and keep the implementation simpler. I will double check the RandomStrategy, since it might be doing the same thing.
I will explore get_initial_conditions_generator.
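As a point of comparison, the rejection sampling being discussed can be sketched as below. This is an illustrative stand-alone version, not BoFire's RandomStrategy or get_initial_conditions_generator; it uses a `c(x) <= 0` feasibility convention and uniform draws within box bounds.

```python
import random

# Illustrative rejection sampler (not BoFire's RandomStrategy): draw uniform
# candidates within box bounds and keep those satisfying every nonlinear
# constraint in a `c(x) <= 0` feasibility convention.
def rejection_sample(constraints, bounds, n, max_tries=10_000, seed=0):
    rng = random.Random(seed)
    samples = []
    for _ in range(max_tries):
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        if all(c(x) <= 0 for c in constraints):
            samples.append(x)
            if len(samples) == n:
                break
    return samples


# keep points inside the unit disk: x0**2 + x1**2 - 1 <= 0
pts = rejection_sample(
    [lambda x: x[0] ** 2 + x[1] ** 2 - 1], [(-1, 1), (-1, 1)], n=5
)
print(len(pts))  # 5
```

The trade-off is the usual one: rejection sampling is simple and generic, but its cost grows with the infeasible fraction of the box.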
        domain, constraint=LinearEqualityConstraint
    )
    if len(nonlinear_constraints := get_nonlinear_constraints(domain)) == 0:
I also do not understand this; it should all be handled in the original implementation, using the random strategy as the generator for feasible candidates.
    def has_sufficient_experiments(
        self,
    ) -> bool:
        """Check if sufficient feasible experiments are available.
Hmm, I am not convinced that this is a good idea. From my perspective, there is no need for past/historical experiments that were just used to train the surrogate models to fulfill the constraints. Often we have use cases where new constraints are added over the course of a project, and this should keep working.
What was your intention in adding this?
I can revert this change.
    If the domain is compatible with polytope sampling, it uses the polytope sampling to generate
    candidate samples. Otherwise, it performs rejection sampling by repeatedly generating candidate
    samples until the desired number of valid samples is obtained.

    def _ask(self, candidate_count: int) -> pd.DataFrame:
Hmm, I do not get why this change was necessary, can you elaborate a bit?
    # The constraint should evaluate to <= 0
    def make_constraint_callable(c):
        def constraint_fn(x: Tensor) -> Tensor:
            # c.__call__ expects x with shape (batch_size, n_features)
I would move all the logic for handling torch tensors of different shapes here.
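A hedged sketch of the shape handling being discussed: normalize single-point and batched inputs before delegating to a per-point constraint. `batchify` is an illustrative name, and plain lists stand in for torch tensors to keep the sketch self-contained.

```python
# Illustrative shape normalization: the optimizer may pass one point of
# shape (d,) or a batch of shape (q, d); this wrapper handles both before
# delegating to a per-point callable. Plain lists stand in for tensors.
def batchify(constraint_fn):
    def wrapped(x):
        if x and isinstance(x[0], (int, float)):  # single point
            return constraint_fn(x)
        return [constraint_fn(row) for row in x]  # batch of points
    return wrapped


c = batchify(lambda x: 1.0 - x[0] - x[1])  # x0 + x1 <= 1 in `>= 0` convention
print(c([0.2, 0.3]))                # single point
print(c([[0.2, 0.3], [0.9, 0.9]]))  # batch of two points
```

Centralizing this in one wrapper keeps the individual constraint definitions free of shape-juggling, which is the point of the review comment.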
Motivation
This is an early attempt; I am sure there will be a few changes needed along the way, but it is something to start with. Nonlinear constraints (NonlinearInequalityConstraint and NonlinearEqualityConstraint) were defined in BoFire's domain model but were not integrated into the optimization pipeline. The constraints were extracted from the domain but never passed to BoTorch's optimize_acqf, causing the optimizer to ignore them and propose infeasible candidates.

This PR:
- acqf_optimization.py: properly pass constraints to BoTorch (the is_equality flag was inverted, so all constraints were treated as equality constraints)
- torch_tools.py: properly format constraints for BoTorch compatibility
- botorch.py: ensure proposed candidates satisfy all constraints

Have you read the Contributing Guidelines on pull requests?
Yes
Have you updated CHANGELOG.md?
I will do that later.
New Test Suite
Created tests/bofire/strategies/test_nonlinear_constraints.py with 16 comprehensive test cases.
Constraint Types Tested:
Optimization Scenarios:
Results:
Tutorial
Created docs/tutorials/basic_examples/nonlinear_constraints_basic_usage.py demonstrating:
Verification Steps
To verify the changes work: