Add DiscreteBatchConstraint (#765)
```diff
@@ -7,6 +7,8 @@
 from functools import reduce
 from typing import TYPE_CHECKING, Any, ClassVar, cast

+import numpy as np
+import numpy.typing as npt
 import pandas as pd
 from attrs import define, field
 from attrs.validators import in_, min_len
```
```diff
@@ -424,6 +426,72 @@ def _get_invalid(self, df: pd.DataFrame, /) -> pd.Index:
         return df.index[mask_bad]


+@define
+class DiscreteBatchConstraint(DiscreteConstraint):
+    """Constraint ensuring recommendations in a batch share certain parameter values.
+
+    When this constraint is active, the recommender internally subsets the
+    candidate set (one subset for each unique value of the constrained
+    parameter), obtains a full batch recommendation from each subset, and
+    returns the batch with the highest joint acquisition value.
+
+    This constraint is not supported by all recommenders. It is not applied during
+    search space creation (all parameter values remain in the search space).
+
+    Example:
+        If parameter ``Temperature`` has values ``[50, 100, 150]`` and a batch of
+        10 is requested, the recommender will generate three candidate batches
+        (one all-50, one all-100, one all-150) and return the best one.
+
+    Notes:
+        This constraint adds computational overhead, since the single optimization
+        is replaced by individual optimizations over several subsets. If multiple
+        subset-generating constraints are active, the cost can increase drastically
+        due to the combinatorial explosion of subset intersections.
+    """
+
+    # Class variables
+    eval_during_creation: ClassVar[bool] = False
+    eval_during_modeling: ClassVar[bool] = True
+    numerical_only: ClassVar[bool] = False
+
+    def __attrs_post_init__(self):
+        """Validate that exactly one parameter is specified."""
+        if len(self.parameters) != 1:
+            raise ValueError(
+                f"'{self.__class__.__name__}' requires exactly one parameter, "
+                f"but {len(self.parameters)} were provided: {self.parameters}."
+            )
```
Review thread on lines +458 to +463 (the exactly-one-parameter validation):

Collaborator: Can we get rid of this restriction? My first naive guess would be that it should not be too hard?

Author: This is unnecessary […].

Collaborator: Really? To me, it's rather the current design that is super unintuitive. We have an object that takes a list of parameters but then requires exactly one of them. Where do you see potential confusion? I can only see one way how to interpret a call like […].

Author: So there is technical debt here causing this confusion; it is just […]. But it has now driven me into this weird design where the constraint has […]. I would much prefer A:

[snippet for option A not preserved; see the illustrative sketch after this thread]

with the alternative being B:

[snippet for option B not preserved]

I prefer A because the filtering logic is entirely orthogonal, so expressing it as independent constraints makes sense to me. I already made the plan to change this for the dependencies constraint as well in #670, and so far I have not received any differing opinion. I could live with both variants, but it should be made consistent with the dependencies constraint, so let me know. But please do not make your call based on the […].

Collaborator: Or alternative C:

[snippet for option C not preserved]

I don't know why the above should not be possible, and it would be the most natural option for me since: […]. While you could of course achieve the same with three or just one constraint, why take away the possibility to structure it for no reason? But I think a third opinion is a good idea here: @AVHopp, @kalama-ai?

Author: I think allowing that (option C) would be the worst of both worlds: degeneracy in configurations should be avoided, especially when avoiding it is easy (like in this case), AND it would be inconsistent with the cardinality constraints, where only one set of parameters is specified instead of potentially multiple sets (the set of parameters over which a cardinality constraint holds corresponds to the single parameter over which a batch constraint holds).
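Since the inline snippets in the thread above were not preserved, here is a purely illustrative sketch of the shape of the three options. The parameter names are made up, and the multi-parameter constructor calls in options B and C are hypothetical (under the code as merged they would raise the validation error; they show the API the discussion is arguing about), so none of this is the authors' original code:

```python
# Illustrative only; assumes DiscreteBatchConstraint from the diff above is
# importable. "Temperature", "Solvent", "Pressure" are made-up parameter names.

# Option A: one single-parameter constraint per batch-constrained parameter;
# each piece of filtering logic is an independent constraint.
constraints_a = [
    DiscreteBatchConstraint(parameters=["Temperature"]),
    DiscreteBatchConstraint(parameters=["Solvent"]),
]

# Option B: a single constraint listing all batch-constrained parameters
# (hypothetical multi-parameter construction).
constraints_b = [DiscreteBatchConstraint(parameters=["Temperature", "Solvent"])]

# Option C: allow both, i.e. users may mix grouped and single parameters
# (hypothetical multi-parameter construction).
constraints_c = [
    DiscreteBatchConstraint(parameters=["Temperature", "Solvent"]),
    DiscreteBatchConstraint(parameters=["Pressure"]),
]
```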
```diff
+    @override
+    def _get_invalid(self, df: pd.DataFrame, /) -> pd.Index:
+        # Always returns an empty index: this constraint operates at the batch
+        # level, not the row level. Individual rows are never invalid; the
+        # constraint is enforced at recommendation time by partitioning the
+        # candidates into subsets.
+        return pd.Index([])
```
Review thread on `subset_masks`:

Collaborator: The […]

Author: There is this consuming code somewhere down the line: […].
```diff
+    def subset_masks(
+        self, candidates_exp: pd.DataFrame, /
+    ) -> list[npt.NDArray[np.bool_]]:
+        """Return Boolean masks defining the subsets for this constraint.
+
+        Each mask selects the rows in ``candidates_exp`` that belong to one
+        subset, i.e. share the same value for the constrained parameter.
+
+        Args:
+            candidates_exp: The experimental representation of candidate points.
+
+        Returns:
+            A list of Boolean masks, one per unique value of the constrained
+            parameter.
+        """
+        param = self.parameters[0]
+        return [
+            (candidates_exp[param] == v).values for v in candidates_exp[param].unique()
+        ]
+
+
 @define
 class DiscreteCardinalityConstraint(CardinalityConstraint, DiscreteConstraint):
     """Class for discrete cardinality constraints."""
```
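To make the masking behavior concrete, the following self-contained demo runs the same list-comprehension logic as `subset_masks` on a made-up candidate frame matching the docstring's ``Temperature`` example:

```python
import pandas as pd

# Made-up experimental representation with three temperature levels.
candidates_exp = pd.DataFrame(
    {
        "Temperature": [50, 100, 150, 50, 100, 150],
        "Concentration": [0.1, 0.1, 0.1, 0.2, 0.2, 0.2],
    }
)

# Same logic as subset_masks: one Boolean mask per unique parameter value.
param = "Temperature"
masks = [
    (candidates_exp[param] == v).values for v in candidates_exp[param].unique()
]

for value, mask in zip(candidates_exp[param].unique(), masks):
    print(value, mask)
# 50 [ True False False  True False False]
# 100 [False  True False False  True False]
# 150 [False False  True False False  True]
```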
```diff
@@ -466,6 +534,7 @@ def _get_invalid(self, df: pd.DataFrame, /) -> pd.Index:
     DiscreteCustomConstraint,
     DiscretePermutationInvarianceConstraint,
     DiscreteDependenciesConstraint,
+    DiscreteBatchConstraint,
 )

 # Prevent (de-)serialization of custom constraints
```
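The class docstring describes how these masks are meant to be consumed: one full batch optimization per subset, with the winner chosen by joint acquisition value. The consuming recommender code lives elsewhere in the PR; the loop below is only a schematic sketch of that description, where `recommend_batch` and `joint_acquisition_value` are hypothetical stand-ins rather than functions from this codebase:

```python
import numpy as np
import pandas as pd


def recommend_with_batch_constraint(
    candidates_exp: pd.DataFrame,
    masks: list[np.ndarray],
    batch_size: int,
    recommend_batch,          # hypothetical: optimizes one batch within a subset
    joint_acquisition_value,  # hypothetical: scores a whole batch jointly
) -> pd.DataFrame:
    """Schematic: run one optimization per subset, keep the best-scoring batch."""
    best_batch, best_score = None, -np.inf
    for mask in masks:
        subset = candidates_exp[mask]
        if len(subset) < batch_size:
            continue  # subset too small to yield a full batch
        batch = recommend_batch(subset, batch_size)
        score = joint_acquisition_value(batch)
        if score > best_score:
            best_batch, best_score = batch, score
    return best_batch
```

With several subset-generating constraints active, such a loop would instead run over the intersections of their masks, whose count is the product of the per-constraint subset counts; this is the combinatorial explosion the docstring's Notes section warns about.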