Skip to content

Issue 2174 proposed fix#2759

Open
CyanideDragon wants to merge 3 commits into
ash-project:mainfrom
CyanideDragon:main
Open

Issue 2174 proposed fix#2759
CyanideDragon wants to merge 3 commits into
ash-project:mainfrom
CyanideDragon:main

Conversation

@CyanideDragon

Copy link
Copy Markdown

Contributor checklist

Leave anything that you believe does not apply unchecked.

  • I accept the AI Policy, or AI was not used in the creation of this PR.
  • Bug fixes include regression tests
  • Chores
  • Documentation changes
  • Features include unit/acceptance tests
  • Refactoring
  • Update dependencies

Addressing issue 2174: before_action hooks cannot add loads or calculations during read actions

The source of the issues was two separate bugs within the pipeline timing.

Bug 1 was having post-read using a stale query. It saw the outer query and primarily used the outer query. The new_query would be made, but the read action would never actually load the new query and would be silently dropped.

Bug 2 related to splitting calculations too early. When calculations are split, they are separated into 3 their 3 buckets. However, this was done before the before_action. This caused an issue where if the hook adds a new calculation, it would land on on query.calculations but never get merged with the existing calculation lists.

To fix this, "hook just ran" and "execute query/post-read" paths to reprocess calculations added by the hook, refresh calculation context, re-run forbidden-field deselection, and finally append the new data-layer calculation. split_and_load_calculations are used earlier in the pipeline to ensure the data is not dropped. Take only new_entries in calculations_in_query, hydrate them, and then append to data_layer_calculations with proper authorization on expressions to allow hook-added calculations without re-processing calculations handled pre-hook.

Next was adding the post-read hook-added loads without replacing the entire query. Required some helper functions to not break normal reads. Some post-read pipeline was cleaned up; loads, calculations, and field authorization were split from pagination, to better compare hook changes and what was ran against the DB.

Tests were added to prove running and to test against regression.

@zachdaniel

Copy link
Copy Markdown
Contributor

This seems a bit complex of a change honestly. I would have expected that to do this we would essentially need to move some logic from after the before_action hooks to before. Additionally, attempting to deduplicate loads using something like this:

https://github.com/ash-project/ash/pull/2759/changes#diff-b7b7a5528d3683c2b41b27b34431108ab641369372ec60ac1c908fa81beb04ffR3763

Will almost certainly bite us at some point, vs just checking if a given thing is loaded or splitting the loads somewhere earlier etc.

@CyanideDragon

Copy link
Copy Markdown
Author

@zachdaniel

After going over my notes and combing through the code again, I tried to make it without changing too much of the normal functionality. Please read below for my evaluations on what could be done. High risk is something I believe should be avoided outright. Low risk I believe should certainly be considered, but maybe too risky to implement in this PR and saved for a different one.

High risk:
run_before_action can be moved before authorize_query. This would make hook-added loads and calculations participate in authorization. The current method of hooks runs on a post-auth query. Any apps that rely on the current order would need to make their own changes.

High risk:
We could also move in the opposite direction with moving all calculation splitting after before_action, however, this would lead to a massive refactor. The model would be cleaner however. one split, one authorize pass. Everything that depends on calculation lists (hydrate, relationship_path_filters, authorize_sorts, etc.) would need to be moved to after hooks.

Fix:
I did also check on the deduplication. I could implement something like a before_action_load_diff. The deduplicate I realized, two loads with the same key can still differ in nested filters/selects. a dedupe by name could end up dropping the authorized version.

Low risk:
Inline or fold process_before_action calculations into apply_before_action_changes. It will be a thin wrapper. However, we would need to ensure that do not regress to the original bug of hook calc disappearing again, as the wrapper appends, not replaces. If map_size(query.calculations) == 0 then we skip work when the hook didn't add calcs. Removing this for the inline runs split_and_load_calculations unnecessarily. This is usually harmless, but we will get noise that could interact unexpectedly with a non-empty map. calculations.ex owns the split logic while read.ex orchestrates. This inline change would mean moving the body into read.ex. This brings an organizational risk where future calculation changes become easier to miss.

Low risk:
Inlining could also allow us to drop missing_pkeys?/2 helper. If we were to inline at line 3806 and delete the helper then this actually cause two duplications. we create two copies, delete the helper, and now we still have two copies. This can cause issues with wrong calc split behavior such as can_expression_calculation?. The helper would use query.resource and opts[:initial_data]. Inlining with the wrong query, such as outer vs post-hook, could miscompute expression-calc eligibility.

Low risk:
Extract shared "refresh calc contexts" helper. This removes duplicated add_calc_context mapping in 3 places. This one I would be most careful about, as the 3 blocks are not identical. (lines ~338-382, ~627-666, and ~3821-3858.) We can have another reintroduction of the original bug with stale queries. All three refresh calcs happen after updating the query because the hook may have changed tenant/context. Helper must take the post-hook query, not opts from before. This means that we must maintain the order of operations of refreshing calc contexts -> deselect_known_forbidden_fields in apply_before_action_changes. This helper needs to be structured to not reorder. Additionally, if we make it one helper with many operational kwargs, it can hide intentional difference between call sites.

My suggestion would be for me to go back and use a diff instead of a dedupe, and then the low-risk issues could be attempted as a separate PR issue.

@zachdaniel

Copy link
Copy Markdown
Contributor

Hello! I think that this implementation causes too many issues and that there isn't really a good way to do this. Instead what I think we should do is add a simple validation that query_after_before_action.load == query.load, i.e asserting that the load has been unchanged. We can then log a warning indicating that you cannot add load statements in before_action hooks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants