
Refactor Inducing Point Selection to Use Allocator Classes for Enhanced Flexibility (#377) #435

Open · wants to merge 12 commits into base: main
Conversation

yalsaffar (Contributor)

Description

This PR proposes a solution to issue #378. Although the implementation is fully functional and tested, it is presented as an overall proposal, open to discussion and potential refinement.

Key Changes

  • Refactor of select_inducing_points Function:

    • Previous Implementation: Accepted method as a string ("pivoted_chol", "kmeans++", "auto", "sobol") and used conditional logic to select inducing points.
    • New Implementation: Now accepts an InducingPointAllocator instance, with AutoAllocator as the default if allocator is None. This approach allows users to directly pass allocator instances, aligning with the issue’s goal to enable flexible use of custom allocators like GreedyImprovementReduction.
  • New inducing_point_allocators.py File:

    • Introduces classes SobolAllocator, KMeansAllocator, and AutoAllocator, all implementing the InducingPointAllocator interface from BoTorch. This modularizes the allocator logic, moving it out of select_inducing_points while following the established base class structure.
  • Modifications to Models and Example Files:

    • Updated gp_classification.py, monotonic_projection_gp.py, monotonic_rejection_gp.py, semi_p.py, and example_problems.py to handle allocator class instances rather than string-based methods, improving overall consistency.
    • Added imports for the new allocator classes in __init__.py so they are accessible across the codebase.
  • Updated Tests:

    • Adjusted tests in test_semi_p.py, test_utils.py, and test_config.py to work with allocator classes instead of the previous string-based structure.
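The signature change described above can be sketched roughly like this (simplified, hypothetical stand-ins; not the actual AEPsych classes or signatures):

```python
import torch


class AutoAllocator:
    """Illustrative stand-in for the real AutoAllocator."""

    def allocate_inducing_points(
        self, inputs: torch.Tensor, num_inducing: int
    ) -> torch.Tensor:
        # Placeholder strategy: take the first num_inducing rows.
        return inputs[:num_inducing]


def select_inducing_points(inputs, inducing_size, allocator=None):
    # New style: accept an allocator instance instead of a method string,
    # defaulting to AutoAllocator when none is given.
    if allocator is None:
        allocator = AutoAllocator()
    return allocator.allocate_inducing_points(inputs, inducing_size)


points = select_inducing_points(torch.randn(100, 2), inducing_size=10)
```

Callers with a custom allocator (e.g. a BoTorch GreedyImprovementReduction instance) would pass it via the allocator argument instead of a string.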

Additional Notes

This PR preserves most of the existing logic in select_inducing_points to keep changes minimal. Further work is needed to confirm compatibility with additional BoTorch allocators and to support advanced configurations using from_config for custom allocator setups. I'd love to hear your feedback on the overall approach before moving forward with these refinements.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Nov 4, 2024
@yalsaffar yalsaffar changed the title Proposal: Refactor Inducing Point Selection to Use Allocator Classes for Enhanced Flexibility (#377) Refactor Inducing Point Selection to Use Allocator Classes for Enhanced Flexibility (#377) Nov 4, 2024
@facebook-github-bot (Contributor)

@JasonKChow has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@crasanders (Contributor) left a comment


Looks good other than a few nits. Our internal tests indicated merge conflicts, so you probably need to rebase.


def allocate_inducing_points(
    self,
    bounds: Optional[torch.Tensor],
Contributor:

Why mark it as Optional if you have to assert that it's not None anyway?

ValueError: If `bounds` is not provided.
"""
# Ensure bounds are provided
if bounds is None:
Contributor:

Per comment above, would this check be necessary if bounds weren't optional?

Contributor Author:

I totally agree with that, but I've encountered an issue: even though the code asserts that bounds is provided for SobolAllocator, mypy still doesn't recognize the narrowing. There are a few workarounds:

One option is to add bounds=bounds  # type: ignore; another is to use a dummy fallback like bounds=bounds if bounds is not None else torch.empty(0). Alternatively, bounds can stay optional in SobolAllocator, with another assert there.

If there are other approaches to overcome this error, please let me know.
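For reference, mypy does narrow an Optional type when the None check and the use live in the same scope; a minimal sketch below (the cross-function case described above is harder and may still need one of the listed workarounds; the helper name here is hypothetical):

```python
from typing import Optional

import torch


def double_bounds(bounds: Optional[torch.Tensor]) -> torch.Tensor:
    # Hypothetical helper, just to illustrate mypy's narrowing behavior.
    if bounds is None:
        raise ValueError("bounds must be provided")
    # After the raise, mypy treats `bounds` as torch.Tensor in this scope,
    # so no `# type: ignore` or dummy fallback is needed.
    return bounds * 2


out = double_bounds(torch.tensor([[0.0, 1.0]]))
```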

inducing_point_method = config.get(
    classname, "inducing_point_method", fallback="auto"
)
inducing_point_method = config.getobj(
    classname, "inducing_point_method", fallback=AutoAllocator()
)
Contributor:

I think you need to change this to

inducing_point_method_class = config.getobj(
    classname, "inducing_point_method", fallback=AutoAllocator
)
inducing_point_method = inducing_point_method_class()


def allocate_inducing_points(
    self,
    bounds: torch.Tensor,
Contributor:

What if instead of adding bounds as an argument here, we add it to the constructor, and when we create the inducing points we use self.bounds?
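The constructor-based variant suggested here might look like this sketch (a hypothetical simplified class using torch's built-in SobolEngine, not the real BoTorch allocator):

```python
import torch


class SobolAllocator:
    """Hypothetical sketch: bounds are supplied once at construction."""

    def __init__(self, bounds: torch.Tensor) -> None:
        # bounds has shape (2, d): row 0 = lower bounds, row 1 = upper bounds.
        self.bounds = bounds

    def allocate_inducing_points(self, num_points: int) -> torch.Tensor:
        lb, ub = self.bounds[0], self.bounds[1]
        engine = torch.quasirandom.SobolEngine(dimension=self.bounds.shape[1])
        samples = engine.draw(num_points)  # quasi-random points in [0, 1)^d
        return lb + (ub - lb) * samples  # rescale into the box


alloc = SobolAllocator(torch.tensor([[0.0, 0.0], [1.0, 2.0]]))
pts = alloc.allocate_inducing_points(8)
```

With bounds stored on the instance, allocate_inducing_points no longer needs an Optional argument, which also sidesteps the mypy narrowing issue discussed above.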

@@ -1017,7 +1018,7 @@ def test_semip_config(self):
self.assertTrue(model.dim == 2)
self.assertTrue(model.inducing_size == 10)
self.assertTrue(model.stim_dim == 1)
self.assertTrue(model.inducing_point_method == "sobol")
self.assertTrue(model.inducing_point_method == SobolAllocator)
Contributor:

Let's make a new file for InducingPointAllocator tests and check that each class can be correctly read from a config with its arguments (if any).

return inducing_points

@classmethod
def from_config(cls, config: Config) -> "SobolAllocator":
Contributor:

I thought about this more, and I think we should have the InducingPointAllocators read from configs themselves rather than receive inputs from the model. Practically, this means we need to register all of our InducingPointAllocators to work with configs, as well as the ones from botorch (see how we do this for botorch's acquisition functions). In the model's from_config, we'll get the InducingPointAllocator class and check whether it has its own from_config method. If it does, we call it; if it doesn't, we can still try to instantiate it without any arguments. We could add logic to infer constructor arguments the way we do for acquisition functions, but that code is pretty messy, so I don't know if that's a good idea.

To maintain backwards compatibility, we can check if inducing_point_method is a string, and if it is, pass that string to select_inducing_points, but add a deprecation warning to that function that advises that you use the InducingPointAllocator classes directly.
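The dispatch logic described here could be sketched as follows (names and signatures are illustrative, not the actual AEPsych code):

```python
import warnings


class KMeansAllocator:
    """Stand-in allocator with its own from_config."""

    @classmethod
    def from_config(cls, config=None):
        # Hypothetical: a real implementation would pull constructor
        # arguments out of the relevant config section here.
        return cls()


def resolve_inducing_point_method(spec, config=None):
    # Legacy path: a string is passed through, with a deprecation warning.
    if isinstance(spec, str):
        warnings.warn(
            "Passing inducing_point_method as a string is deprecated; "
            "pass an InducingPointAllocator class instead.",
            DeprecationWarning,
        )
        return spec
    # New path: prefer the class's own from_config when it exists,
    # otherwise fall back to a zero-argument constructor.
    if hasattr(spec, "from_config"):
        return spec.from_config(config)
    return spec()


alloc = resolve_inducing_point_method(KMeansAllocator)
```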

@yalsaffar (Contributor Author)

Changes

  1. Config-Driven Initialization for Allocators:

    • Updated inducing_point_allocators to read parameters directly from the configuration file. Custom allocators (SobolAllocator, KMeansAllocator, AutoAllocator) now have a from_config method to allow instantiation with config-defined parameters.
    • For BoTorch allocators without a from_config method, the system attempts instantiation without additional parameters (similar to acquisition functions).
  2. Enhanced Model from_config Handling:

    • Models now check if an allocator class has a from_config method and use it if available, streamlining configuration-based initialization.
    • Maintains backward compatibility: if inducing_point_method is provided as a string, it falls back to using the legacy approach in the select_inducing_points method.
  3. Updated select_inducing_points Function:

    • select_inducing_points now supports both config-based allocator classes and string-based legacy methods ("sobol", "auto", etc.). When a legacy method is used, a DeprecationWarning is raised, guiding users to switch to the allocator classes.
  4. Testing Updates:

    • Updated tests in test_semi_p_config within test_config to work with the new allocator structure.
    • Added test_points_allocators.py to validate:
      • Config-driven instantiation of allocators.
      • Bounds handling.
      • Legacy method support with deprecation warnings.
      • Compatibility with BoTorch allocators (GreedyVarianceReduction, GreedyImprovementReduction).
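The deprecation-warning behavior called out in the testing updates can be exercised with a pattern like this (a stand-in function, not the real select_inducing_points):

```python
import unittest
import warnings


def select_inducing_points(method):
    """Stand-in: warns when the legacy string API is used."""
    if isinstance(method, str):
        warnings.warn(
            "String inducing point methods are deprecated; "
            "use an InducingPointAllocator instead.",
            DeprecationWarning,
        )
    return method


class TestLegacyMethods(unittest.TestCase):
    def test_string_method_warns(self):
        # assertWarns fails the test if no DeprecationWarning is raised.
        with self.assertWarns(DeprecationWarning):
            select_inducing_points("sobol")


result = unittest.TextTestRunner().run(
    unittest.defaultTestLoader.loadTestsFromTestCase(TestLegacyMethods)
)
```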

@JasonKChow (Contributor)

A new style for defining initializations from config just went in. Moving forward, we want all objects initialized from configs to use the new ConfigurableMixin. Can you update to match the new spec?

ConfigurableMixin got updated here:
https://github.com/facebookresearch/aepsych/blob/main/aepsych/config.py#L430

You can see how I implemented it in parameters here:
https://github.com/facebookresearch/aepsych/blob/main/aepsych/transforms/parameters.py#L98
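The general shape of that mixin pattern might look like the sketch below. All names and signatures here are illustrative assumptions, not the actual aepsych API; a plain dict stands in for the real Config object:

```python
class ConfigurableMixin:
    """Illustrative sketch: subclasses declare how to read their own
    constructor options from a config, and a shared from_config uses it."""

    @classmethod
    def get_config_options(cls, config, name=None):
        raise NotImplementedError

    @classmethod
    def from_config(cls, config, name=None):
        # Shared entry point: translate a config section into kwargs.
        return cls(**cls.get_config_options(config, name))


class KMeansAllocator(ConfigurableMixin):
    def __init__(self, num_clusters: int = 10) -> None:
        self.num_clusters = num_clusters

    @classmethod
    def get_config_options(cls, config, name=None):
        # Hypothetical section lookup; a dict stands in for the Config class.
        section = name or cls.__name__
        return {"num_clusters": config.get(section, {}).get("num_clusters", 10)}


alloc = KMeansAllocator.from_config({"KMeansAllocator": {"num_clusters": 5}})
```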

@facebook-github-bot (Contributor)

@JasonKChow has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
"""
if name is None:
    name = cls.__name__
bounds = config.gettensor(name, "bounds")
Contributor:

I think we usually don't have bounds explicitly in the config, so we should construct them from lb/ub.
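Constructing the (2, d) bounds tensor from separate lb/ub entries is a one-liner with torch.stack; a sketch with hypothetical values:

```python
import torch

# Hypothetical lb/ub values as they might be read from a config.
lb = torch.tensor([0.0, -1.0])
ub = torch.tensor([1.0, 2.0])

# Stack into the (2, d) layout: first row lower bounds, second row upper.
bounds = torch.stack([lb, ub])
```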

@JasonKChow (Contributor), Nov 13, 2024:

I think a utility function went in recently that'll produce 2D bounds. It should be in the root utils.py.

(Or maybe it's still in transforms, since the refactor PR hasn't been merged.)

Contributor:

Still a problem. Find the get_bounds function in transform and use it here. Ensure a test covers any case where this would break if get_bounds moved.
