Skip to content

JAVA-6194 Add MongoSocksProxyException for CMAP backpressure labeling#1968

Open
nhachicha wants to merge 42 commits into
mongodb:backpressurefrom
nhachicha:nh/backpressure/socks5_exception
Open

JAVA-6194 Add MongoSocksProxyException for CMAP backpressure labeling#1968
nhachicha wants to merge 42 commits into
mongodb:backpressurefrom
nhachicha:nh/backpressure/socks5_exception

Conversation

@nhachicha
Copy link
Copy Markdown
Collaborator

nhachicha and others added 10 commits March 5, 2026 13:34
…ABEL` (mongodb#1926)

This commit only adds the labels, and does not fully implement the tickets specified below.

The reason there are four JAVA tickets specified is that the is a single specification commit that resolved the four corresponding DRIVERS tickets. All of these JAVA tickets have to be done together.

The relevant spec changes:
- https://github.com/mongodb/specifications/blame/ba14b6bdc1dc695aa9cc20ccf9378592da1b2329/source/client-backpressure/client-backpressure.md#L52-L80 - it's a subset of [DRIVERS-3239, DRIVERS-3411, DRIVERS-3370, DRIVERS-3412: Client backpressure (mongodb#1907)](mongodb/specifications@1125200)

JAVA-5956, JAVA-6117, JAVA-6113, JAVA-6119
- Deprioritize sharded clusters on any error, all other topologies only on SystemOverloadedError.
- Pass ClusterType to updateCandidate so onAttemptFailure can distinguish topology types.
- Add retryable reads prose tests 3.1 and 3.2.
- Change ServerSelectionSelectionTest to use BaseCluster server selection chain.

JAVA-6105
JAVA-6021
JAVA-6074
---------
Co-authored-by: Valentin Kovalenko <[email protected]>
Co-authored-by: Ross Lawley <[email protected]>
- Add enableOverloadRetargeting boolean option to MongoClientSettings and ConnectionString to allow
  the driver to route requests to a different replica set member on retries when the previously
  used server is overloaded
- Add prose test 3.3 to verify that overload errors are retried on the same server when retargeting
  is disabled

JAVA-6167
---------

Co-authored-by: Ross Lawley <[email protected]>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a dedicated MongoSocksProxyException to represent SOCKS5 proxy connection/handshake failures (with phase + optional RFC 1928 reply code), and updates the SOCKS implementation/tests to throw and assert this new exception type. This supports more precise classification/handling of SOCKS-related connection failures (per JAVA-6194).

Changes:

  • Added MongoSocksProxyException (with HandshakePhase and optional proxy reply code).
  • Updated SocksSocket and SocketStream to throw/propagate MongoSocksProxyException for SOCKS negotiation/auth/connect failures and proxy TCP connect failures.
  • Updated/added tests to validate the new exception type and its phase/reply-code tagging.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
driver-sync/src/test/functional/com/mongodb/client/Socks5ProseTest.java Updates prose test assertions to expect MongoSocksProxyException during SOCKS auth failures.
driver-core/src/test/unit/com/mongodb/internal/connection/SocksSocketTest.java Adds unit tests validating handshake phase tagging and CONNECT reply-code tagging.
driver-core/src/main/com/mongodb/MongoSocksProxyException.java Adds new public exception type carrying SOCKS handshake phase and optional RFC 1928 reply code.
driver-core/src/main/com/mongodb/internal/connection/SocksSocket.java Replaces some SOCKS protocol failures with MongoSocksProxyException and adds helper to build a ServerAddress.
driver-core/src/main/com/mongodb/internal/connection/SocketStream.java Wraps proxy-enabled IO failures as MongoSocksProxyException and rethrows SOCKS exceptions directly.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread driver-core/src/test/unit/com/mongodb/internal/connection/SocksSocketTest.java Outdated
Comment thread driver-core/src/test/unit/com/mongodb/internal/connection/SocksSocketTest.java Outdated
@stIncMale stIncMale force-pushed the backpressure branch 2 times, most recently from 2d1b2e1 to 11a7b73 Compare May 8, 2026 20:58
nhachicha added 10 commits May 16, 2026 18:38
- Delegate single-arg constructor to four-arg variant
- Clarify Javadoc that proxyReplyCode may be null
- Remove duplicate blank lines between constructors
MongoSocksProxyException extends RuntimeException and bypassed the
existing catch (SocketException) block, leaking the underlying proxy
TCP socket on every SOCKS5 protocol failure (negotiation/auth/relay).
Defensive: avoids any reverse-DNS risk in error paths if the unresolved
invariant on remoteAddress is ever weakened.
- Close inner socket on failure inside initializeSocketOverSocksProxy
  and initializeSslSocketOverSocksProxy so that failures before this.socket
  is assigned do not leak the underlying file descriptor.
- Hoist translateInterruptedException out of the two orElseThrow branches.
Server thread now blocks on the client closing the connection instead
of guessing how long the SOCKS exchange will take. Removes a CI flake
under load.
Port 1 (tcpmux) is unassigned on most systems but not guaranteed.
Binding then releasing an ephemeral port gives a reliably-closed port.
The previous fix used InputStream.transferTo (Java 9+) and
OutputStream.nullOutputStream (Java 11+), which CI on Java 8 surfaced
as NoSuchMethodError. The test source set is compiled by the Groovy
compiler (because src/test/unit contains both Java and Groovy), which
does not honor options.release.set(8) — Java 11 API calls slipped past
compile-time validation but failed at link time on Java 8.

Replace with a plain read-into-discard-buffer loop. Same semantics
(blocks until client closes), Java 8 source compatible.
ApiAliasAndCompanionSpec discovers every public subtype of MongoException
in the Java driver and asserts each one has a corresponding Scala alias.
MongoSocksProxyException is new in 5.8 and was missing from the wrapper.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Comment thread driver-core/src/main/com/mongodb/MongoSocksProxyException.java
Comment thread driver-core/src/main/com/mongodb/internal/connection/SocketStream.java Outdated
Comment thread driver-core/src/main/com/mongodb/MongoSocksProxyException.java
Comment thread driver-core/src/main/com/mongodb/MongoSocksProxyException.java
Comment thread driver-core/src/main/com/mongodb/internal/connection/SocksSocket.java Outdated
…eler

JAVA-6194: SOCKS5 failures during post-TCP phases (negotiation, auth,
CONNECT-relay) are configuration/protocol errors and must not carry
SystemOverloadedError or RetryableError labels. Failures during the
PROXY_TCP_CONNECT phase are plain TCP-level reach failures (proxy host
unreachable / overloaded) and continue to receive the labels like any
other socket-open failure. Removes the corresponding TODO.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

driver-core/src/main/com/mongodb/internal/connection/SocksSocket.java:147

  • SocksSocket.connect only closes the underlying socket on MongoSocksProxyException and SocketException. A TCP connect timeout (SocketTimeoutException) is an IOException (not a SocketException), so it can escape this method without closing the socket, which risks leaking an FD for callers that don’t explicitly close on failure. Consider closing on any failed connect attempt (e.g., catch IOException around the whole body or use a failure finally).
        } catch (SocketException socketException) {
            /*
             * The 'close()' call here has two purposes:
             *
             * 1. Enforces self-closing under RFC 1928 if METHOD is X'FF'.

The per-phase try-blocks in SocksSocket.connect each had an explicit
catch (MongoSocksProxyException e) { throw e; } before the
catch (IOException e) wrapper. Since MongoSocksProxyException is a
RuntimeException, not an IOException, the IOException catch would
never have absorbed it anyway — the explicit re-throws are dead code
that Java's type system already enforces.

Remove the three redundant catches and extend the explanatory comment
to make the propagation path explicit: a MongoSocksProxyException
thrown directly by a phase method falls through to the outer block
that closes the socket.

No behavior change; SocksSocketTest (11) and BackpressureErrorLabelerTest
(49) both pass.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

The defensive null check on getHandshakePhase() and its rationale comment
predated commit dee79f0, which added notNull("handshakePhase", ...) to
all four MongoSocksProxyException constructors. Combined with the field
being private final, the phase reference is guaranteed non-null for any
instance created through the public API, making the labeler-side check
unreachable in normal use.

Collapse the body of isExcludedSocksPostTcpPhase to the bare phase
comparison; the constructor contract is the authoritative non-null
guarantee. Deserialization-driven null is not defended against here for
the same reason it is not defended against elsewhere in the driver
hierarchy: if it were a real concern, readObject on the exception itself
is the correct enforcement point, not a downstream labeler.

SocksSocketTest (11) and BackpressureErrorLabelerTest (49) both pass.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

nhachicha added 3 commits May 20, 2026 01:27
Backpressure labels (SystemOverloadedError, RetryableError) signal that
the target mongod is overloaded and the driver should back off. The
prior policy excluded post-TCP SOCKS5 phases (NEGOTIATION /
AUTHENTICATION / CONNECT_RELAY) and labeled PROXY_TCP_CONNECT on the
"structurally identical to a socket-open failure" rationale (98483bb,
b3da503). That rule conflated failure mechanism with semantic
attribution: backpressure is about WHICH party is overloaded, not WHAT
kind of failure surfaces.

Re-frame: a SOCKS5 failure is mongod-attributable only when the proxy
acted as a reach-proxy for mongod and reported a transport-level outcome
that mirrors a direct-connection socket-open failure. That is exactly
CONNECT_RELAY with RFC 1928 reply codes 3 (network unreachable),
4 (host unreachable), or 5 (connection refused) — the SOCKS5 analogs of
NoRouteToHostException and ConnectException.

Policy changes:
- PROXY_TCP_CONNECT flips from labeled to NOT labeled — the proxy itself
  is unreachable; mongod was never reached.
- CONNECT_RELAY with codes 3, 4, or 5 flips from NOT labeled to LABELED
  — these mirror direct-connection mongod-overload signals.
- NEGOTIATION, AUTHENTICATION, CONNECT_RELAY with codes 1, 2, 6, 7, 8,
  and CONNECT_RELAY with null replyCode (I/O failure / unrecognised
  reply) remain NOT labeled — proxy-side or ambiguous failures with no
  mongod-side attribution.

Rename isExcludedSocksPostTcpPhase -> isNonMongodAttributableSocksFailure
to reflect the actual semantic and add reply-code-aware logic. Update
MongoSocksProxyException class Javadoc and the labeler method Javadoc
to describe the attribution rule.

Test recalibration:
- socksProxyPostTcpPhaseShouldNotBeLabeled split and renamed:
  socksProxyNonMongodAttributableShouldNotBeLabeled (9 cases) and
  socksProxyMongodAttributableShouldBeLabeled (3 cases for codes 3/4/5).
- Old socksProxyTcpConnectPhaseShouldBeLabeled removed; PROXY_TCP_CONNECT
  is now covered as the first case of the non-attributable parameterised
  test.

This inverts the explicit Round 2 decision documented in 98483bb
and the matching class-level Javadoc updated in b3da503.
When SocketStream.open() wraps an initial-connect IOException as a
MongoSocksProxyException with PROXY_TCP_CONNECT, the message text was
"Exception connecting to SOCKS5 proxy" without identifying which proxy.
Operators reading a log line could not tell whether the ServerAddress
attached to the exception was the proxy or the target mongod.

Append the proxy host:port to the message string so the proxy endpoint
is identifiable from the message alone. Keep the ServerAddress as the
target mongod — SDAM, the connection pool, and retry logic key off the
cluster-member identity, and substituting the proxy endpoint there would
break those consumers (per the discussion on PR mongodb#1968 thread
r3267511517: message-string enrichment accepted, ServerAddress
substitution rejected).
The throw at SocksSocket.checkServerReply used reply.message verbatim
(e.g. "Host is unreachable") which is indistinguishable in raw log
output from a non-proxy ICMP-driven ConnectException carrying the
same text. Prefix the message with "SOCKS5 CONNECT reply:" and append
the RFC 1928 reply code so operators can attribute the failure to the
SOCKS5 protocol layer from the message alone.

Phrasing chosen to remain distinguishable from the sibling IO-failure
path message ("SOCKS5 CONNECT relay failed: ..." at SocksSocket.java
line 130): "reply" signals a parsed proxy response; "relay failed"
signals an IO failure during the CONNECT exchange. Two textual styles
let operators tell the two CONNECT_RELAY sub-cases apart.

reply.replyNumber remains available via getProxyReplyCode() for
programmatic consumers; embedding it in the message is purely a
diagnostic convenience for log readers.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Comment thread driver-core/src/main/com/mongodb/internal/connection/SocksSocket.java Outdated
nhachicha added 2 commits May 20, 2026 03:30
After 78d6b01 (per-phase IOException wrapping) and 27417eb
(removal of redundant re-throws), the two outer catches in
SocksSocket.connect handle disjoint, well-defined cases — but the
comments hadn't caught up.

- catch (MongoSocksProxyException) — now the destination for every
  SOCKS5 protocol failure, including the RFC 1928 X'FF' "no acceptable
  method" self-close which performNegotiation throws as
  MongoSocksProxyException directly (not as an IOException). Note this
  in the comment.

- catch (IOException) — now only ever fires for IOExceptions from the
  initial proxy TCP connect (timeout.checkedRun above); inner-phase
  IOExceptions are caught and rewrapped by the per-phase wrappers and
  land in the MongoSocksProxyException catch above. The stale comment
  block here referenced the RFC 1928 X'FF' path, which no longer
  reaches this catch. Replace with an accurate description of what
  actually lands here, and the leak-prevention rationale (JDK
  auto-close on connect timeout is implementation-defined).
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Comment thread driver-core/src/test/unit/com/mongodb/internal/connection/SocksSocketTest.java Outdated
nhachicha added 3 commits May 20, 2026 05:19
The helper wrapped the SocksSocket try-with-resources in
catch (Exception ignored) {} followed by a trailing return null, which
silently absorbed any non-IOException, non-MongoSocksProxyException
failure (RuntimeException from SocksSocket internals, close()
failures, etc.) and returned null. assertProxy(null) then failed with
"Expected MongoSocksProxyException but got: null" instead of the
actual stack trace — masking real regressions.

Remove the catch and the trailing return null. Unexpected exceptions
now propagate with the original stack trace; the inner catch still
captures the expected MongoSocksProxyException / IOException as the
helper's return value. Move t.join(5000) into its own try with a
narrow InterruptedException catch that preserves the interrupt flag
without masking a primary exception that may already be unwinding.
Reviewers reading the four constructor overloads can mistake the
ordering for inconsistent because handshakePhase appears at a different
absolute position in each (third in the 3-arg ctor, fourth when cause
or proxyReplyCode is present, etc.). The ordering is actually rule-
driven: parent-class params first (message, address, optional cause),
then SOCKS-specific args (handshakePhase, optional proxyReplyCode).
Adding the rule to the class-level Javadoc closes the documentation
gap without restructuring the API — keeping parent-class symmetry on
the super(...) calls and avoiding a one-off divergence from the rest
of the MongoSocket*Exception hierarchy.
@nhachicha nhachicha requested a review from Copilot May 20, 2026 04:55
@nhachicha nhachicha marked this pull request as ready for review May 20, 2026 04:57
@nhachicha nhachicha requested a review from a team as a code owner May 20, 2026 04:57
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated no new comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants