Add BO termination conditions #724
Conversation
Pull request overview
Adds support for Bayesian Optimization termination conditions to allow BO runs to stop early based on convergence criteria (initially: the UCB–LCB simple-regret upper bound inspired by Makarova et al., 2022). This integrates termination evaluation into the existing run() benchmark loop and adds a small termination submodule with data models, evaluators, and mapping utilities.
Changes:
- Introduces termination-condition data models (MaxIterationsTermination, AlwaysContinue, UCBLCBRegretTermination, CombiTerminationCondition).
- Adds a termination evaluator (UCBLCBRegretEvaluator) and mapper utilities to determine which evaluators are needed.
- Extends the benchmark runner to optionally stop early and to return a richer RunResult including termination metadata (plus tests).
Reviewed changes
Copilot reviewed 11 out of 13 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| bofire/data_models/termination/termination.py | New termination-condition data models + should_terminate logic. |
| bofire/data_models/termination/api.py | Exports termination-condition data models. |
| bofire/data_models/termination/__init__.py | Package init (empty). |
| bofire/termination/evaluator.py | Implements evaluator computing UCB–LCB regret bound + noise estimate. |
| bofire/termination/mapper.py | Maps conditions to evaluators; recursively gathers needed evaluators. |
| bofire/termination/api.py | Exports evaluator + mapping helpers. |
| bofire/termination/__init__.py | Package init (empty). |
| bofire/runners/run.py | Adds termination_condition support, early-stop logic, and new RunResult return type. |
| bofire/runners/api.py | Re-exports RunResult from runners API. |
| tests/bofire/data_models/test_termination.py | Unit tests for termination condition data models + serialization. |
| tests/bofire/termination/test_evaluator.py | Unit + integration tests for evaluator and run() early termination behavior. |
| tests/bofire/termination/test_mapper.py | Tests for termination mapper and evaluator discovery. |
| tests/bofire/termination/__init__.py | Test package init. |
```python
        iteration: int,
        **kwargs,
    ) -> bool:
        return iteration >= self.max_iterations
```
iteration in run() is 0-based (see RunResult.final_iteration == n_iterations-1 in tests), but this condition terminates when iteration >= max_iterations. That makes max_iterations behave like a max iteration index rather than a max number of iterations (off-by-one for typical expectations). Consider using iteration + 1 >= max_iterations (or renaming the field / documenting the 0-based convention).
Suggested change:
```diff
-        return iteration >= self.max_iterations
+        return iteration + 1 >= self.max_iterations
```
```python
@dataclass
class RunResult:
    """Result of a single optimization run.

    Supports tuple-style access for backward compatibility with ``run()``,
    which historically returned ``List[Tuple[pd.DataFrame, pd.Series]]``.
    You can still write ``experiments, metric_values = result`` or
    ``result[0]`` / ``result[1]``.
    """

    experiments: pd.DataFrame
    metric_values: pd.Series
    terminated_early: bool = False
    final_iteration: int = 0
    termination_metrics: Dict[str, List[float]] = field(default_factory=dict)

    # Backward-compat: behave like (experiments, metric_values) tuple.
    def __iter__(self):
        return iter((self.experiments, self.metric_values))

    def __getitem__(self, idx):
        return (self.experiments, self.metric_values)[idx]
```
run() now returns List[RunResult] instead of the historical List[Tuple[pd.DataFrame, pd.Series]]. Even with tuple-style __iter__/__getitem__, this is still a public API/type-contract breaking change for callers that check types, serialize results, or rely on tuple methods. Consider either (a) keeping the tuple return and exposing termination info separately, or (b) using a NamedTuple / tuple subclass so isinstance(result, tuple) remains true, plus documenting the change in release notes.
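As a hedged illustration of option (b), a NamedTuple-based RunResult would keep isinstance(result, tuple) true while still carrying the new termination fields; the field names below mirror the dataclass in the diff, but this is a sketch for discussion, not the PR's actual implementation:

```python
from typing import Dict, List, NamedTuple, Optional

import pandas as pd


class RunResult(NamedTuple):
    experiments: pd.DataFrame
    metric_values: pd.Series
    terminated_early: bool = False
    final_iteration: int = 0
    termination_metrics: Optional[Dict[str, List[float]]] = None


result = RunResult(pd.DataFrame(), pd.Series(dtype=float))
assert isinstance(result, tuple)          # type checks against tuple keep passing
experiments, metric_values = result[:2]   # two-element unpacking now needs slicing
```

One trade-off: because a NamedTuple iterates over all five fields, the historical `experiments, metric_values = result` unpacking would require `result[:2]`, whereas the dataclass with a custom `__iter__` preserves it directly.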
```python
    Where:
    - UCB(x) = mu(x) + beta * sigma(x)   (Upper Confidence Bound)
    - LCB(x) = mu(x) - beta * sigma(x)   (Lower Confidence Bound)
```
The UCB/LCB formulas in this docstring use beta * sigma(x), but the evaluator implementation uses sqrt(beta) * sigma(x) (GP-UCB style). Please align the documentation with the implementation (or vice versa) to avoid users configuring beta incorrectly.
Suggested change:
```diff
-    Where:
-    - UCB(x) = mu(x) + beta * sigma(x)   (Upper Confidence Bound)
-    - LCB(x) = mu(x) - beta * sigma(x)   (Lower Confidence Bound)
+    Where (GP-UCB style):
+    - UCB(x) = mu(x) + sqrt(beta) * sigma(x)   (Upper Confidence Bound)
+    - LCB(x) = mu(x) - sqrt(beta) * sigma(x)   (Lower Confidence Bound)
```
```python
        iteration: int,
        **kwargs,
    ) -> bool:
        if iteration < self.min_iterations:
```
Similar to MaxIterationsTermination, iteration is currently treated as 0-based by the runner, but the min_iterations gate uses iteration < min_iterations. This means min_iterations=5 won’t start checking until the 6th BO step. Consider switching to iteration + 1 < min_iterations (or clearly documenting that iteration is a 0-based index).
Suggested change:
```diff
-        if iteration < self.min_iterations:
+        # `iteration` is 0-based, but `min_iterations` represents a count of completed
+        # optimization steps. Convert to 1-based before applying the threshold so that,
+        # e.g., `min_iterations=5` starts checking on the 5th BO step, not the 6th.
+        if iteration + 1 < self.min_iterations:
```
```python
        if evaluator is not None:
            evaluators.append(evaluator)

    return evaluators
```
get_required_evaluators can return duplicate evaluator instances when the same evaluator type appears multiple times in a (nested) CombiTerminationCondition. In the runner this will redundantly recompute metrics and also append multiple values per iteration under the same key in termination_metrics. Consider deduplicating by evaluator type (or by condition identity) before returning the list.
Suggested change:
```diff
-    return evaluators
+    # Deduplicate evaluators by their type to avoid redundant computations
+    unique_evaluators = []
+    seen_types = set()
+    for evaluator in evaluators:
+        evaluator_type = type(evaluator)
+        if evaluator_type in seen_types:
+            continue
+        seen_types.add(evaluator_type)
+        unique_evaluators.append(evaluator)
+    return unique_evaluators
```
```python
        # --- min UCB over evaluated points ---
        evaluated_inputs = experiments[strategy.domain.inputs.get_keys()]
        transformed_evaluated = strategy.domain.inputs.transform(
            evaluated_inputs,
            strategy.input_preprocessing_specs,
        )
        X_evaluated = torch.from_numpy(transformed_evaluated.values).to(**tkwargs)

        with torch.no_grad():
            posterior_evaluated = strategy.model.posterior(X_evaluated)
            mean_evaluated = posterior_evaluated.mean.squeeze(-1)
            std_evaluated = posterior_evaluated.variance.squeeze(-1).sqrt()

        ucb_evaluated = mean_evaluated + sqrt_beta * std_evaluated
        min_ucb_evaluated = float(ucb_evaluated.min().item())
```
There is no test coverage for using UCBLCBRegretTermination / UCBLCBRegretEvaluator on a maximization objective (where the bound logic should differ). Adding at least one test benchmark/domain with MaximizeObjective (or a simple synthetic domain) would prevent regressions and would have caught the current minimization-only implementation.
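A minimal, framework-independent sketch of what such a test could assert, using hand-picked posterior values and the maximization convention of the regret bound (max UCB over candidates minus max LCB over evaluated points); the numbers and names are illustrative and deliberately avoid BoFire's API:

```python
import numpy as np


def test_regret_bound_uses_maximization_convention():
    # Hand-picked posterior means/stds at candidate and evaluated points.
    mean_cand = np.array([0.0, 1.0, 2.0])
    std_cand = np.array([0.1, 0.1, 0.1])
    mean_eval = np.array([0.0, 1.0])
    std_eval = np.array([0.1, 0.1])
    sqrt_beta = 2.0

    # Maximization: f(x*) <= max UCB over candidates, f(x_hat) >= max LCB over evaluated.
    max_bound = np.max(mean_cand + sqrt_beta * std_cand) - np.max(mean_eval - sqrt_beta * std_eval)
    # Minimization convention (what a minimization-only implementation computes).
    min_bound = np.min(mean_eval + sqrt_beta * std_eval) - np.min(mean_cand - sqrt_beta * std_cand)

    assert np.isclose(max_bound, 1.4)   # 2.2 - 0.8
    assert np.isclose(min_bound, 0.4)   # 0.2 - (-0.2)
    assert not np.isclose(max_bound, min_bound)
```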
```diff
@@ -1,2 +1,2 @@
 from bofire.runners.hyperoptimize import hyperoptimize
-from bofire.runners.run import run
+from bofire.runners.run import RunResult, run
```
Import of 'RunResult' is not used.
Import of 'run' is not used.
```python
from bofire.runners.run import RunResult, run
```
```python
            if hasattr(acqf, "beta"):
                return acqf.beta
        except AttributeError:
            pass
```
'except' clause does nothing but pass and there is no explanatory comment.
Suggested change:
```diff
-            pass
+            # If the strategy has no acquisition_function (or it raises AttributeError),
+            # fall back to the default beta value.
+            return self.fallback_beta
```
Hi @wgst, thanks for the PR and sorry for the late answer. @bertiqwerty will provide a thorough and more detailed review. One general comment from my side: I think it would be best to incorporate the termination criteria not into the runner but into the strategy itself; we have the StepwiseStrategy for this.

Best,
Johannes
```python
        return iteration >= self.max_iterations


class AlwaysContinue(TerminationCondition, EvaluatableTermination):
```
This one also already exists in the StepwiseStrategy as a condition.
Thanks @wgst and @jduerholt. I agree with Johannes. Please use the StepwiseStrategy.
bertiqwerty left a comment:
Please refactor into the StepwiseStrategy.
…LCB regret bound
- Move termination logic from runner into StepwiseStrategy conditions
- Add UCBLCBRegretBoundCondition with 4 threshold modes (gp_noise, range, cv, manual)
- Add UCBLCBRegretEvaluator with configurable beta (GP-UCB, Srinivas et al. 2010)
- Add top-q filtering support for regret bound computation
- Extract _min_ucb_evaluated() method for clarity
- Add OptimizationComplete exception for clean termination signaling
- Remove old termination data models, mapper, and tests (replaced by conditions)
- Add comprehensive test suite (test_condition.py, test_evaluator.py)
- All 49 tests passing
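The commit above names four threshold modes (gp_noise, range, cv, manual) without spelling them out. The following is a hypothetical sketch of how such modes could compute the stopping threshold; the formulas, parameter names, and factor handling are assumptions for illustration and may differ from the PR's actual implementation:

```python
from typing import Optional

import numpy as np


def compute_threshold(mode: str, y: np.ndarray, noise_var: Optional[float] = None,
                      manual_value: Optional[float] = None, factor: float = 1.0) -> float:
    """Illustrative epsilon_BO computation for the four named modes (hypothetical)."""
    if mode == "gp_noise":   # scale with the GP's estimated observation noise
        return factor * float(np.sqrt(noise_var))
    if mode == "range":      # fraction of the observed objective range
        return factor * float(np.max(y) - np.min(y))
    if mode == "cv":         # coefficient-of-variation style, relative to the mean magnitude
        return factor * float(np.std(y) / max(abs(float(np.mean(y))), 1e-12))
    if mode == "manual":     # user-provided absolute threshold
        return float(manual_value)
    raise ValueError(f"unknown threshold mode: {mode}")
```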
…evaluators
Option A refactoring: both UCBLCBRegretEvaluator and ExpMinRegretGapEvaluator now support an lcb_method parameter to choose between "sample" (default) and "optimize" LCB computation over the domain.

Changes:
- Convert _ucb_lcb_regret_bound from static to instance method
- Add lcb_method parameter to ExpMinRegretGapEvaluator (was Makarova-only)
- Move _min_lcb_optimize to base TerminationEvaluator class for code reuse
- Unify LCB computation logic in both evaluators via _ucb_lcb_regret_bound
- Add threshold computation helpers module (thresholds.py)
- Update tests to verify both methods work correctly

Both evaluators maintain backward compatibility (default "sample" method). The optimize method uses the strategy's acqf_optimizer for a more efficient LCB search.
…_min_lcb_optimize to base class
Convert _ucb_lcb_regret_bound from static to instance method to access self.lcb_method. Move _min_lcb_optimize to the TerminationEvaluator base class for code reuse by both the Makarova and Ishibashi evaluators. Both evaluators now flexibly support "sample" (default) and "optimize" methods for LCB computation.
Refactor UCBLCBRegretBoundCondition and ExpMinRegretGapCondition to use new threshold computation helpers from thresholds module. Update data model exports.
Update benchmark functions and exports to support termination condition testing.
Add tests for new lcb_method parameter in both Makarova and Ishibashi evaluators. Verify both "sample" and "optimize" methods work correctly. Update condition specs and evaluator tests to reflect refactored code structure.
Motivation
BO loops in BoFire currently run for a fixed number of iterations. In practice, especially for chemical experiments where each iteration is very costly, we need a way to stop the optimization early to reduce costs while still obtaining a converged result.
This PR is a first attempt at implementing different termination criteria. The first method added here is from Makarova, A. et al. (2022). The idea is to compute an upper bound on the simple regret of the current best recommendation using the GP posterior; once this bound falls below a threshold ε_BO (derived from the observation noise variance), we can be confident that any further improvement will be smaller than the noise, and the optimization can stop.
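A minimal numeric sketch of this stopping check (minimization convention), assuming arrays of GP posterior means and standard deviations at candidate points and at already-evaluated points, an exploration parameter beta, and a noise-derived threshold eps_bo; all names are illustrative, not the PR's actual API:

```python
import numpy as np


def regret_bound_says_stop(mu_cand, sigma_cand, mu_eval, sigma_eval, beta, eps_bo):
    sqrt_beta = np.sqrt(beta)
    # Best guaranteed value among evaluated points: min UCB over evaluated x.
    min_ucb_eval = np.min(mu_eval + sqrt_beta * sigma_eval)
    # Optimistic bound on the global optimum: min LCB over the candidate set.
    min_lcb_cand = np.min(mu_cand - sqrt_beta * sigma_cand)
    # Upper bound on the simple regret of the current best recommendation.
    regret_bound = min_ucb_eval - min_lcb_cand
    return regret_bound <= eps_bo
```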
Two other termination methods and the issues with implementing them
EMSR Gap: Ishibashi, H. et al. (2023). This method requires computing an Expected Minimum Simple Regret gap between consecutive iterations, which needs per-iteration model snapshots or knowledge of which experiments belong to which iteration. I have not found any such tracking functionality within BoFire, which makes this method hard to implement at the moment.
Probabilistic Regret Bound (PRB): Wilson, J. et al. (2024). This is a more probabilistic method in which a statistically valid stopping rule is constructed from Matheron-rule pathwise trajectory samples (Thompson-style posterior samples via pathwise conditioning). Their implementation is based on TensorFlow and BO packages built around it, and some crucial functionality is not available in BoTorch.
Both methods could be good candidates for future work. Adding the method by Ishibashi et al. should be fairly easy once the tracking architecture is settled; the method by Wilson et al. could be a bit more problematic, but let's see.
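To make the missing tracking concrete, here is a purely hypothetical sketch of the per-iteration bookkeeping the Ishibashi et al. criterion would need; nothing like this exists in BoFire today, and every name below is invented for illustration:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class IterationSnapshot:
    iteration: int                 # 0-based BO step index
    experiment_indices: List[int]  # rows of the experiments frame proposed at this step
    model_state: Dict[str, Any]    # serialized surrogate hyperparameters / state


@dataclass
class RunHistory:
    snapshots: List[IterationSnapshot] = field(default_factory=list)

    def record(self, iteration: int, experiment_indices: List[int],
               model_state: Dict[str, Any]) -> None:
        self.snapshots.append(IterationSnapshot(iteration, experiment_indices, model_state))
```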
Have you read the Contributing Guidelines on pull requests?
Yes
Test Plan
Added tests:
- Termination condition data models and their serialization (test_termination.py)
- Evaluator and run() early-termination behavior, unit + integration (test_evaluator.py)
- Termination mapper and evaluator discovery (test_mapper.py)

Best,
Wojciech