Skip to content

Schema-dump all user-created functions, not just 'public'#140

Closed
smudge wants to merge 3 commits into
teoljungberg:masterfrom
smudge:multi-schema-support
Closed

Schema-dump all user-created functions, not just 'public'#140
smudge wants to merge 3 commits into
teoljungberg:masterfrom
smudge:multi-schema-support

Conversation

@smudge
Copy link
Copy Markdown

@smudge smudge commented Dec 14, 2023

EDIT: see this comment for more recent changes to this PR!

Original Description:

For an application relying on schema.rb, this gem will dump only functions in the hardcoded public schema. This means that an application with its functions/tables in other pg schemas (and/or an application with a nondefault schema search path) may be able to create a function with create_function that does not survive the round-trip to schema.rb, in turn breaking the app after db:schema:load.

One workaround is to use structure.sql (which relies on pg_dump) to dump the functions in the other schemas. However, because the postgres search_path is accessible from within a query, this PR proposes using that value in this gem's query to dump all functions within the current search_path (and--see the last bullet below--actually matches the way structure.sql gets produced!). This allows the end-developer to continue using schema.rb while inserting functions into multiple PG schemas, and is backwards compatible with the prior = 'public' check (assuming public is in the search path, which it is by default).

Alternatives considered:

  • Filter functions not by namespace, but by owner, and only dump functions owned by the current user.
  • Look up grants and only dump (non-information-schema) functions granted to the current user (or to the PUBLIC role).
  • Use pg_dump (or Rails' code that executes pg_dump) and extract functions. However, the Rails implementation for structure.sql actually uses search_path to construct the --schema= argument for the command, which is how I came full circle back to using search_path in this PR. It actually matches what Rails does in its Ruby logic for producing structure.sql:
    args += search_path.split(",").map do |part|
      "--schema=#{part.strip}"
    end
    (source)

Comment thread lib/fx/adapters/postgres/functions.rb Outdated
Copy link
Copy Markdown
Owner

@teoljungberg teoljungberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late review, holidays and vacations came between.

Thank you for contributing!

Comment thread lib/fx/adapters/postgres/functions.rb Outdated
Comment on lines +47 to +48
"name" => schema_aware_name(function),
"definition" => schema_aware_definition(function)
Copy link
Copy Markdown
Author

@smudge smudge Jan 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the nudge @teoljungberg! I'm also a scenic user and so I dug into how that gem works, and ended up proposing this change on their end (which so far seems to be well-received, though I cannot say whether it will ultimately merge): scenic-views/scenic#401

In resolving some discrepancies within that gem, I ended up relying on the way that a core Rails contribution enabled schema-handling for enums: rails/rails#45740

I think it would make a ton of sense for both fx and scenic to behave similar to this core enum approach, which is roughly as follows:

  1. Query for all user-created structures that are accessible to the current PG user (regardless of search_path)
  2. If the structure's schema does not match current_schema(), include the fully-qualified name in schema.rb
  3. If the structure's schema does match current_schema(), do not include the schema in the generated schema.rb name.

Points 2 and 3 there are really crucial. This allows a schema.rb to be shared across contexts where the "default" search path might vary (and is entirely dependent on external/config factors -- it is not guaranteed to be public, and some PAAS providers do not even place this in the end-developer's control), while still allowing developers to explicitly specify non-default schemas and have those survive the round-trip to schema.rb.

Fx::Function.new(result)
end

def schema_aware_name(function)
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered moving both of these "schema-aware" mappers to Fx::Function, but it also complicates the tests for that class in ways that didn't feel necessary. (I figured I would leave it up to you @teoljungberg, as to whether that class should be a "dumb" wrapper or a "smart" translator.)

end
end

def schema_aware_definition(function)
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason this method is necessary is because PG itself is not schema-aware when returning the function definition, and will, for example, include the fully-qualified name (public.something) even if current_schema() matches the function's schema.

This is probably the correct behavior on PG's part, but it means that for schema.rb purposes, we should strip it. (See here for the reasoning -- this how Rails handles enum names in a way that enables a single schema.rb to be valid & shareable in many contexts that may not be configured to have the same default PG schema.)

@smudge smudge changed the title Schema-dump all functions within search_path, not just 'public' Schema-dump all user-created functions, not just 'public' Jan 19, 2024
@smudge smudge requested a review from teoljungberg January 19, 2024 23:15
LEFT JOIN pg_aggregate pa
ON pa.aggfnoid = pp.oid
WHERE pn.nspname = 'public' AND pd.objid IS NULL
WHERE pn.nspname NOT IN ('pg_catalog', 'information_schema')
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some strategies go as far as ignoring everything in pg_*. I'm considering making that change still.

(Unfortunately I have not found any other way to isolate user-created functions, and there are many examples that take a similar approach to what I have done here.)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smudge I think you can use this (I stupidly branched and created a PR before I saw this one)

@teoljungberg
Copy link
Copy Markdown
Owner

Solved by #168

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants