reject structure synchronization from an incompatible upstream version #487

Open
danolivo wants to merge 4 commits into main from spoc-571

@danolivo danolivo commented Jun 1, 2026

Problem

When a subscription is created with synchronize_structure = true, spock copies the provider's schema by shelling out to pg_dump/pg_restore. Those binaries are pinned to the subscriber's own major version (via get_pg_executable()), and pg_dump cannot read a server of a newer major version than itself — it aborts with "aborting because of server version mismatch".

Until now that incompatibility surfaced badly:

  • At sync time it appeared only as an opaque pg_dump exit code wrapped in "could not execute pg_dump", and only after spock had already created a replication slot and exported a snapshot on the provider.
  • Because the condition is permanent, the apply worker would ERROR, die, and be restarted by the launcher — reconnecting to the publisher and failing again on every cycle, indefinitely.

What this PR does

Detects the version mismatch up front by mirroring pg_dump's own _check_database_version() test against the live connection, and enforces it at two points:

  • sub_create (foreground): rejects the operation immediately with a clear ERROR naming the provider's PostgreSQL version, before any slot/snapshot is created. This is the path most users hit.
  • Apply worker (backstop): for subscriptions that reach init another way, the worker disables the subscription (rather than ERROR-ing into a restart loop) and exits FATAL with an actionable hint. Disabling is what actually breaks the loop — the severity keyword alone would not.
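
Why disabling, rather than the severity keyword, ends the loop can be illustrated with a toy model. This is only a sketch: `ToySubscription`, `toy_worker_run()` and `toy_launcher_attempts()` are hypothetical names invented for illustration and are not spock's launcher or worker code. It assumes the launcher restarts workers only for enabled subscriptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of a subscription as a launcher sees it. */
typedef struct ToySubscription
{
    bool enabled;            /* workers are started only for enabled subs */
    bool upstream_supported; /* permanent property of the provider version */
} ToySubscription;

/*
 * One worker run. Returns true if init succeeded. With disable_on_mismatch
 * set, a permanent mismatch disables the subscription before the worker
 * exits; that state change is what stops further restarts.
 */
static bool
toy_worker_run(ToySubscription *sub, bool disable_on_mismatch)
{
    if (sub->upstream_supported)
        return true;
    if (disable_on_mismatch)
        sub->enabled = false;
    return false;            /* worker exits, whatever the severity */
}

/*
 * Toy launcher: restart the worker while the subscription stays enabled,
 * up to max_cycles. Returns the number of provider connection attempts.
 */
int
toy_launcher_attempts(ToySubscription *sub, bool disable_on_mismatch,
                      int max_cycles)
{
    int attempts = 0;

    assert(max_cycles >= 0);
    while (sub->enabled && attempts < max_cycles)
    {
        attempts++;
        if (toy_worker_run(sub, disable_on_mismatch))
            break;
    }
    return attempts;
}
```

In this model, an unsupported provider with `disable_on_mismatch = false` consumes every allowed cycle, while with `disable_on_mismatch = true` the worker connects once and the subscription is left disabled.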

Both call sites share a single predicate, upstream_version_supports_structure_sync(), so the "must track pg_dump's rule" logic lives in one place. An unknown (zero) version — a dead connection or a non-PostgreSQL endpoint — is treated as unsupported (fail closed).
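
A minimal sketch of the rule described above, written against plain integers so it can be read without libpq. The function name `version_supports_structure_sync()` and its two-integer signature are assumptions made for illustration; the PR's real predicate takes a `PGconn *` and its body is not shown on this page. The version encoding (major * 10000 + minor for PostgreSQL 10 and later) is the one returned by `PQserverVersion()`.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Leading component of a PQserverVersion()-style integer. For PostgreSQL 10
 * and later this is the major version; for older servers it is only the
 * first digit group (9 for 9.6), which is still enough for the comparison
 * below because the local build is assumed to be 10 or later.
 */
static int
major_of(int version_num)
{
    return version_num / 10000;
}

/*
 * Hypothetical pure-integer core of the check: an unknown (zero) remote
 * version fails closed, and a remote major newer than the local build is
 * rejected because the local pg_dump cannot read it.
 */
bool
version_supports_structure_sync(int remote_version_num, int local_version_num)
{
    assert(local_version_num >= 100000);

    if (remote_version_num == 0)
        return false;        /* dead connection or non-PostgreSQL endpoint */

    return major_of(remote_version_num) <= major_of(local_version_num);
}
```

With a local build of 170002 (17.2), a 16.x or 17.x provider passes, while an 18.0 provider or an unknown version is rejected.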

Data-only syncs use COPY, which has no such version restriction, and are deliberately left untouched: the check is scoped to structure-bearing sync kinds only.
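
The scoping can be sketched as a gate that consults the version result only for sync kinds that run pg_dump. The enum and function names below are hypothetical and chosen for illustration; spock's own sync-kind constants are not shown on this page.

```c
#include <stdbool.h>

/* Hypothetical sync kinds, for illustration only. */
typedef enum ToySyncKind
{
    TOY_SYNC_DATA,           /* data only, copied with COPY */
    TOY_SYNC_STRUCTURE,      /* schema only, via pg_dump/pg_restore */
    TOY_SYNC_FULL            /* schema and data */
} ToySyncKind;

/* True when the sync kind shells out to pg_dump/pg_restore. */
bool
toy_sync_needs_pg_dump(ToySyncKind kind)
{
    return kind == TOY_SYNC_STRUCTURE || kind == TOY_SYNC_FULL;
}

/*
 * The version result matters only for structure-bearing kinds; a data-only
 * sync proceeds regardless of the upstream version.
 */
bool
toy_sync_allowed(ToySyncKind kind, bool upstream_supported)
{
    if (!toy_sync_needs_pg_dump(kind))
        return true;
    return upstream_supported;
}
```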

danolivo and others added 3 commits June 1, 2026 16:46
A walsender reports server_version like any other backend, so a
logical-replication connection does return a non-zero PQserverVersion().
Two comments claimed the opposite to justify using the SQL connection
for version detection.  The conclusion (use the SQL connection) is still
right -- it is held open anyway for the slot-reclaim and progress
queries -- but the stated reason was wrong.  Fix the comments only; no
behaviour change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

A structure sync shells out to pg_dump/pg_restore, which get_pg_executable()
pins to the subscriber's own major version.  pg_dump cannot read a server of
a newer major version than itself and aborts with "server version mismatch";
previously that surfaced only as an opaque pg_dump exit code, after spock had
already created a replication slot and exported a snapshot.

Add upstream_version_supports_structure_sync(), mirroring pg_dump's own
_check_database_version() test, and enforce it in the apply worker before the
slot is created.  The mismatch is permanent, so a plain ERROR would make the
worker crash-loop and reconnect to the publisher on every restart; instead
disable the subscription -- as the nonrecoverable init steps already do -- and
FATAL with an actionable hint.  An unknown (zero) version is treated as
unsupported, failing closed.  Data-only syncs use COPY, which has no such
restriction, and are left untouched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Run the same version check in spock.create_subscription() when
synchronize_structure is requested, so the user gets an immediate, clear
error at DDL time instead of a background apply worker that disables itself
moments later.  Reuses upstream_version_supports_structure_sync() on the
connection already opened to fetch the provider's node info, and reports the
provider's version in the message.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@danolivo danolivo self-assigned this Jun 1, 2026
@danolivo danolivo added the bug Something isn't working label Jun 1, 2026

coderabbitai Bot commented Jun 1, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 62a7bbe1-df00-4960-a889-ea3c842b8f7f

📥 Commits

Reviewing files that changed from the base of the PR and between 6446bc4 and fa3e982.

📒 Files selected for processing (1)
  • src/spock_sync.c
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/spock_sync.c

Walkthrough

This PR adds PostgreSQL version compatibility checks for structure synchronization in Spock. It introduces a helper function to validate upstream version support and deploys version checks at subscription creation and sync initialization. Minor inline comment updates clarify which SQL connection is used for version detection.

Changes

Structure Synchronization Version Compatibility

| Layer / File(s) | Summary |
| --- | --- |
| Version compatibility helper function (include/spock_sync.h, src/spock_sync.c) | New upstream_version_supports_structure_sync(PGconn *origin_conn) validates that the upstream server's major version is within this build's supported range for pg_dump/pg_restore structure sync. |
| Subscription creation validation (src/spock_functions.c) | spock_create_subscription now checks provider version support and rejects subscriptions with structure sync enabled when the provider version is incompatible, including the provider's server_version in the error. |
| Sync initialization guard (src/spock_sync.c) | spock_sync_subscription initialization adds an upfront upstream version compatibility check that disables the subscription and raises a FATAL ERRCODE_FEATURE_NOT_SUPPORTED when structure sync is requested but the upstream is unsupported. |
| Slot creation documentation clarifications (src/spock_sync.c) | Comments in ensure_replication_slot_snapshot() and spock_create_slot_and_read_progress() clarify that version detection is performed on the SQL connection used for subsequent slot/progress queries. |

Poem

🐰 I nibble on versions, sniff the upstream air,
If pg_dump can't carry structures, I stop with care.
Before a subscription leaps or a sync begins,
I check the server_version for matching twins.
Hop safe, schema steady — no surprises there! 🌿

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title directly and specifically describes the main change: rejecting structure synchronization when the upstream version is incompatible. |
| Description check | ✅ Passed | The description is directly related to the changeset, explaining the problem, solution, and affected code paths in detail. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Infer (1.2.0)
src/spock_sync.c

src/spock_sync.c:13:10: fatal error: 'postgres.h' file not found
13 | #include "postgres.h"
| ^~~~~~~~~~~~
1 error generated.
Error: the following clang command did not run successfully:
/opt/infer-linux-x86_64-v1.2.0/lib/infer/facebook-clang-plugins/clang/install/bin/clang-18
@/tmp/coderabbit-infer/fa3e98212dc0d05cd3ef8abac868de7f1645a38e-a6c6b1e749970934/tmp/clang_command_.tmp.1c7fda.txt
++Contents of '/tmp/coderabbit-infer/fa3e98212dc0d05cd3ef8abac868de7f1645a38e-a6c6b1e749970934/tmp/clang_command_.tmp.1c7fda.txt':
"-cc1" "-load"
"/opt/infer-linux-x86_64-v1.2.0/lib/infer/infer/bin/../../facebook-clang-plugins/libtooling/build/FacebookClangPlugin.dylib"
"-add-plugin" "BiniouASTExporter" "-plugin-arg-BiniouASTExporter" "-"
"-plugin-arg-BiniouASTExporter" "PREPEND_CURRENT_DIR=1"
"-plugin-arg-BiniouASTExporter" "MAX_STRING_SIZE=65535" "-cc1" "-triple"
"x86_64-unknown-linux-gnu" "-emit-obj" "-mrelax-all" "-disable-free"
"-clear-ast-

... [truncated 680 characters] ...

"/opt/infer-linux-x86_64-v1.2.0/lib/infer/facebook-clang-plugins/clang/install/lib/clang/18/include"
"-internal-isystem" "/usr/local/include" "-internal-isystem"
"/usr/lib/gcc/x86_64-linux-gnu/12/../../../../x86_64-linux-gnu/include"
"-internal-externc-isystem" "/usr/include/x86_64-linux-gnu"
"-internal-externc-isystem" "/include" "-internal-externc-isystem"
"/usr/include" "-Wno-ignored-optimization-argument" "-Wno-everything"
"-ferror-limit" "19" "-fgnuc-version=4.2.1" "-fskip-odr-check-in-gmf"
"-D__GCC_HAVE_DWARF2_CFI_ASM=1" "-o"
"/tmp/coderabbit-infer/a6c6b1e749970934/file.o" "-x" "c"
"src/spock_sync.c" "-O0" "-fno-builtin" "-include"
"/opt/infer-linux-x86_64-v1.2.0/lib/infer/infer/bin/../lib/clang_wrappers/global_defines.h"
"-Wno-everything"


codacy-production Bot commented Jun 1, 2026

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

🟢 Metrics 0 duplication

Metric Results
Duplication 0

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/spock_sync.c`:
- Around line 333-356: The function upstream_version_supports_structure_sync
currently only rejects remoteversion==0 and newer majors; change it to mirror
pg_dump's inclusive [minRemoteVersion, maxRemoteVersion] check by computing the
remote server version (using PQserverVersion(origin_conn)) and rejecting if
remoteversion == 0 or remoteversion < minRemoteVersion or remoteversion >
maxRemoteVersion (i.e. implement the same bounds logic used by
_check_database_version()/pg_dump), so update
upstream_version_supports_structure_sync to validate both the lower and upper
bounds before returning true.
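
The inclusive bounds check requested above could look roughly like the following sketch. It works on plain integers rather than a `PGconn *`; the function name and signature are hypothetical. The upper bound formula follows pg_dump's convention of accepting any minor release of its own major version. The lower bound is passed in as a parameter because it depends on the pg_dump release; the value 90200 used in the usage note is an assumption about recent releases and should be checked against the pg_dump source actually shipped.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical range check modelled on pg_dump's version test:
 * accept remote versions in [min_remote, max_remote], where max_remote is
 * the highest version number within the local build's major release.
 * A zero (unknown) remote version is rejected, failing closed.
 */
bool
remote_version_in_dump_range(int remote_version_num, int local_version_num,
                             int min_remote)
{
    int max_remote = (local_version_num / 100) * 100 + 99;

    assert(min_remote > 0 && min_remote <= max_remote);

    if (remote_version_num == 0)
        return false;

    return remote_version_num >= min_remote &&
           remote_version_num <= max_remote;
}
```

For a local build of 170002 and an assumed minimum of 90200, the accepted range is 90200 through 170099: a 9.1 provider (90100) and an 18.0 provider (180000) are both rejected, while 16.4 (160004) is accepted.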

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 77c36fc3-e628-4b56-a8ba-60ae1c5218b5

📥 Commits

Reviewing files that changed from the base of the PR and between 5345184 and 6446bc4.

📒 Files selected for processing (3)
  • include/spock_sync.h
  • src/spock_functions.c
  • src/spock_sync.c

Comment thread src/spock_sync.c
@danolivo danolivo requested a review from mason-sharp June 2, 2026 07:57