fix(transport): release flight-size on stream abort (#4345) #4393

Merged
sanity merged 5 commits into main from fix/4345-flightsize
Jun 14, 2026

sanity commented Jun 13, 2026 (Collaborator)

Problem

Large multi-fragment GETs on lossy paths intermittently fail (stream assembly: no fragments received within inactivity timeout); the contract never caches and
a follow-up SUBSCRIBE is then rejected (issue #4345).

The remaining root cause this PR fixes: flight-size accounting strands credit
when a stream aborts.
Flight size is a single connection-wide counter. When an
outbound stream aborts (cwnd-wait timeout, upstream stall/error, or a mid-send
failure) its already-sent, unacked fragments stay counted in flight size until
each one is ACKed or ages out via MAX_PACKET_RETRANSMITS (~6s). FixedRate's
loss-pause caps cwnd at the frozen flight size, so a single lossy/aborted stream
starves every subsequent stream on that connection.
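
To make the starvation concrete, here is a toy model of a connection-wide flight-size counter. It is a sketch under stated assumptions, not the freenet controller: `ToyController` and its methods are hypothetical, the loss-pause rule is assumed to be `cwnd = min(cwnd, flight_size)`, and the 1130-byte fragment size is chosen only so that two fragments add up to the 2260B seen in the failing test.

```rust
/// Toy model of a connection-wide flight-size counter (all names hypothetical).
struct ToyController {
    cwnd: u64,        // congestion window, in bytes
    flight_size: u64, // bytes sent but not yet released
}

impl ToyController {
    /// A send is admitted only if it fits under the window.
    fn can_send(&self, bytes: u64) -> bool {
        self.flight_size + bytes <= self.cwnd
    }

    fn on_send(&mut self, bytes: u64) {
        self.flight_size += bytes;
    }

    /// ASSUMED loss-pause behaviour: cap cwnd at the current flight size.
    fn on_loss_pause(&mut self) {
        self.cwnd = self.cwnd.min(self.flight_size);
    }

    fn release_flightsize(&mut self, bytes: u64) {
        self.flight_size = self.flight_size.saturating_sub(bytes);
    }
}

/// Returns (stream B blocked before release, stream B admitted after release).
fn stranded_credit_demo() -> (bool, bool) {
    let mut c = ToyController { cwnd: 4000, flight_size: 0 };
    // Stream A sends two fragments, then aborts; no ACK ever arrives.
    c.on_send(1130);
    c.on_send(1130);
    c.on_loss_pause(); // cwnd is now frozen at 2260
    let blocked = !c.can_send(1130); // stream B cannot send anything
    c.release_flightsize(2260); // what the abort path does after this PR
    let admitted = c.can_send(1130);
    (blocked, admitted)
}
```

In this model `stranded_credit_demo()` returns `(true, true)`: the stranded 2260 bytes block stream B until they are released.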

Already merged (not redone here): #4353 (abandon never-ACKed packets), #4367
(op-layer retry), #4374 (a stalled stream fails the stream, not the connection).
This PR adds the atomic flight-size release on stream abort.

Approach

Built and reviewed in two stages on this branch; merged as one PR (shipping the
per-packet stream index without the abort wiring would add hot-path cost for no
benefit).

Stage 1 — mechanism (commits 1–2): tag each tracked packet with its owning
stream (PacketStream::Stream(id) | Control sentinel) via a packet_streams
map, and add SentPacketTracker::drop_stream(stream_id) -> u64 that removes the
stream's packets from ALL tracking structures and returns the exact byte total.
Removal from pending_receipts is what makes it double-decrement-safe: no later
ACK or abandon can release those bytes again.
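
The Stage 1 mechanism can be sketched as follows. This is a minimal illustration, not the real `SentPacketTracker`: `TagTracker`, the `PacketId`/`StreamId` aliases, and the two-map layout are assumptions, and the real tracker has further structures (resend queue, timestamps) that `drop_stream` also sweeps.

```rust
use std::collections::HashMap;

type PacketId = u32; // hypothetical alias
type StreamId = u64; // hypothetical alias

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum PacketStream {
    Stream(StreamId),
    Control,
}

#[derive(Default)]
struct TagTracker {
    pending_receipts: HashMap<PacketId, Vec<u8>>,
    packet_streams: HashMap<PacketId, PacketStream>,
}

impl TagTracker {
    fn report_sent(&mut self, id: PacketId, payload: Vec<u8>, owner: PacketStream) {
        self.pending_receipts.insert(id, payload);
        self.packet_streams.insert(id, owner);
    }

    /// ACK path: bytes to release, or 0 if the packet is no longer tracked.
    fn ack(&mut self, id: PacketId) -> u64 {
        self.packet_streams.remove(&id);
        self.pending_receipts.remove(&id).map_or(0, |p| p.len() as u64)
    }

    /// Removes every packet owned by `stream` and returns the exact byte total.
    fn drop_stream(&mut self, stream: StreamId) -> u64 {
        let ids: Vec<PacketId> = self
            .packet_streams
            .iter()
            .filter(|(_, owner)| **owner == PacketStream::Stream(stream))
            .map(|(id, _)| *id)
            .collect();
        let mut released = 0u64;
        for id in ids {
            self.packet_streams.remove(&id);
            if let Some(payload) = self.pending_receipts.remove(&id) {
                released += payload.len() as u64;
            }
        }
        released
    }
}

/// Returns (bytes dropped, late ACK, second drop, control-packet ACK).
fn drop_stream_demo() -> (u64, u64, u64, u64) {
    let mut t = TagTracker::default();
    t.report_sent(1, vec![0; 100], PacketStream::Stream(7));
    t.report_sent(2, vec![0; 200], PacketStream::Stream(7));
    t.report_sent(3, vec![0; 50], PacketStream::Control);
    let dropped = t.drop_stream(7); // both stream-7 packets
    let late_ack = t.ack(1); // already removed: releases nothing
    let again = t.drop_stream(7); // idempotent
    let control_ack = t.ack(3); // control packet untouched by the drop
    (dropped, late_ack, again, control_ack)
}
```

Here `drop_stream_demo()` returns `(300, 0, 0, 50)`. Because the release amount is derived from removing the `pending_receipts` entry, a second release for the same packet has nothing left to count.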

Stage 2 — wiring + the resend-gap race (commit 3):

  • The resend-gap race. send_stream/pipe_stream run in spawned tasks and
    share the Arc<Mutex<SentPacketTracker>> with the per-connection recv loop.
    The recv loop's resend cycle released the tracker lock across the UDP
    send_to().await: get_resend removed the packet from pending_receipts and
    the caller re-registered it only after the await. A drop_stream landing in
    that window saw no pending_receipts entry — released 0 bytes for a genuinely
    in-flight packet and stripped its stream tag — leaving the fragment pinned
    (partial defeat of the fix; safe direction, not a double-decrement).
    Fix: get_resend now KEEPS a resent packet in pending_receipts (clones
    the payload, refreshes its send-time in place, re-queues it) — the same
    keep-the-entry shape the TLP branch already used. The invariant becomes total:
    a packet is in pending_receipts iff it is in flight, so a concurrent
    drop_stream always sees and releases it. This does NOT touch flight size and
    does not reintroduce the forbidden decrement-on-RTO + re-add-on-resend pair —
    it is pure tracker bookkeeping. Resend re-registration becomes an idempotent
    in-place refresh; the recv loop is otherwise unchanged.
  • Abort wiring. New release_aborted_stream_flightsize() helper calls
    drop_stream under the tracker lock, drops the lock, then
    release_flightsize(returned) on the congestion controller — mirroring the
    ACK path's lock ordering (tracker lock never held across the controller call
    or across an await). Wired into all six stream-failing return sites in
    send_stream/pipe_stream. The two mid-send-failure sites additionally
    release the current fragment's on_send bytes (added just before a send that
    then failed, so the packet was never tracked).
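
The lock ordering described in the abort-wiring bullet can be sketched like this. It is a simplified stand-in: `AbortTracker` and `AbortController` are hypothetical types, the helper's real signature is not shown in this PR description, and the controller is modelled as a bare atomic counter.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

/// Stand-in tracker: stream id -> bytes currently tracked as in flight.
#[derive(Default)]
struct AbortTracker {
    in_flight_by_stream: HashMap<u64, u64>,
}

impl AbortTracker {
    fn drop_stream(&mut self, stream_id: u64) -> u64 {
        self.in_flight_by_stream.remove(&stream_id).unwrap_or(0)
    }
}

struct AbortController {
    flight_size: AtomicU64,
}

impl AbortController {
    fn release_flightsize(&self, bytes: u64) {
        // Saturating subtract: an over-release floors at 0 instead of wrapping.
        let _ = self
            .flight_size
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |cur| {
                Some(cur.saturating_sub(bytes))
            });
    }

    fn flightsize(&self) -> u64 {
        self.flight_size.load(Ordering::Acquire)
    }
}

/// Sweep under the tracker lock, drop the lock, then tell the controller.
fn release_aborted_stream_flightsize(
    tracker: &Arc<Mutex<AbortTracker>>,
    controller: &AbortController,
    stream_id: u64,
) -> u64 {
    let released = {
        let mut guard = tracker.lock().expect("tracker mutex poisoned");
        guard.drop_stream(stream_id)
    }; // guard is dropped here, before the controller call
    controller.release_flightsize(released);
    released
}

/// Returns (first release, second release, final flight size).
fn abort_wiring_demo() -> (u64, u64, u64) {
    let tracker = Arc::new(Mutex::new(AbortTracker::default()));
    tracker
        .lock()
        .expect("tracker mutex poisoned")
        .in_flight_by_stream
        .insert(7, 2260);
    let controller = AbortController { flight_size: AtomicU64::new(2260) };
    let first = release_aborted_stream_flightsize(&tracker, &controller, 7);
    let second = release_aborted_stream_flightsize(&tracker, &controller, 7);
    (first, second, controller.flightsize())
}
```

`abort_wiring_demo()` returns `(2260, 0, 0)`: the second call finds nothing to sweep, so calling the helper from more than one return site cannot release the same bytes twice.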

drop_stream returns the encrypted tracker-payload byte sum — exactly what an
ACK or Abandon for those same packets would have released — so the abort
substitutes cleanly for the ACKs that never come, consistent with the existing
release accounting.

Testing

  • cwnd_wait_timeout_releases_stranded_flightsize (integration): reproduces the
    core symptom — fragments fill flight size to ~cwnd, no ACKs ever arrive, the
    next fragment's cwnd wait times out; asserts flight size returns to 0 on
    abort. Verified to FAIL without the wiring (flight size pinned at 2260B).
  • cwnd_wait_timeout_zero_inflight_is_noop (integration): abort with nothing in
    flight is a clean no-op.
  • test_drop_stream_releases_packet_out_for_resend (unit): the exact resend-gap
    interleaving — Resend handed out, NOT re-registered, then drop_stream must
    still release the packet.
  • Stage-1 unit suite (11 tests) pins the byte total and the no-double-decrement
    invariant (ACK / abandon / interleaved streams / control packets / boundaries).
  • test_packet_lost updated to the new keep-the-entry-on-resend semantics.

cargo fmt, cargo clippy -p freenet --tests -- -D warnings, and
cargo test -p freenet --lib all green (3035 lib tests pass; 654 transport
tests pass; 0 failures, no new ignores).

Part of #4345 (the flight-size angle; #4345 tracks the multi-fragment-GET
symptom across several fixes, so this does not fully close it).

[AI-assisted - Claude]

sanity changed the title from "feat(transport): stream-id tagging + drop_stream in SentPacketTracker (#4345 stage 1)" to "fix(transport): release flight-size on stream abort (#4345)" Jun 13, 2026

github-actions commented (Contributor)

⚠️ Performance Benchmark Regressions Detected

Found 2 benchmark(s) with performance regressions:

  • streaming_buffer/latency/first_fragment_1kb: +71.718%
  • streaming_buffer/latency/first_fragment_full: +71.291%

⚠️ Important: This may be a false positive!

Common causes of false positives:

  1. Stale baseline: If recent PRs improved performance on main, this PR (which doesn't include those changes) will show as "regressed" when compared to the new baseline
  2. GitHub runner variance: Benchmarks run on shared ubuntu-latest runners with variable CPU contention
  3. Old baseline: The baseline might be from an older main commit if the cache restore used restore-keys fallback

To verify if this is a real regression:

  1. Check if recent commits on main touched transport or benchmark code
  2. Merge main into your branch and re-run benchmarks
  3. Review the baseline age in the "Download main branch baseline" step

This is informational only and does not block the PR.

View full benchmark results and summary

sanity left a comment (Collaborator, Author)

Comprehensive PR Review: #4393

Summary

  • PR: fix(transport): release flight-size on stream abort (#4345)
  • Type: fix (transport / flight-size accounting / retransmit path)
  • Review tier: Full (high-risk surface: concurrency + transport)
  • Reviewers run: freenet:code-first, freenet:testing, freenet:skeptical, freenet:big-picture + external Codex (non-Claude)
  • CI: deterministic checks green; heavy jobs (NAT/Windows) were pending at review time — gate merge on green at the current HEAD.

This is a strong, well-tested implementation that faithfully follows the #4345 blueprint and is careful about lock ordering and the double-decrement hazard. The unit suite for drop_stream is genuinely exhaustive, and both headline regression tests were empirically verified fails-without/passes-with (a reviewer reverted each fix in isolation). It is not mergeable as-is because of one real concurrency defect that an external model and two perspectives independently found.


Must Fix (Blocking)

1. Resend re-registration can resurrect a packet that drop_stream already removed → double-release of flight size.
crates/core/src/transport/peer_connection.rs:1361 (and :1376) call report_sent_packet(idx, packet) after the tracker lock is released across send_to().await. report_sent_packet_inner (sent_packet_tracker.rs:~312) falls through to the insert branch when the id is absent and re-inserts the packet tagged Control. Interleaving (both tasks share the same Arc<Mutex<SentPacketTracker>>; the abort runs in the spawned send_stream/pipe_stream task, the resend in the recv loop):

  1. recv loop get_resend() returns Resend(idx,…), keeps the entry, drops the lock, awaits send_to.
  2. abort task drop_stream(stream_id) removes idx and release_flightsizes its bytes.
  3. recv loop resumes → report_sent_packet(idx) → insert branch → re-inserts idx as a Control zombie.
  4. Later ACK/abandon of the zombie releases its bytes a second time (flight-size under-count) and violates the PR's own "in pending_receipts iff in flight" invariant.

This is new to this PR (drop_stream didn't exist before, so the insert-branch couldn't collide with a concurrent removal). Direction is safe (under-count → admits more sends, never re-pins; saturating_sub floors at 0), which is why it's not a stall regression — but it re-opens the exact double-decrement class the design is built to prevent, and the "released exactly once" comments at peer_connection.rs:1354-1360 are stated unconditionally and are false in the race.

Found independently by: external Codex [P1], skeptical [Med], big-picture [Med], and pre-review.

Fix: add a resurrection-safe refresh_sent_packet(idx, payload, token) for the recv-loop resend path that updates only if the entry is still present and is a no-op (drops the payload) if drop_stream/Abandon already removed it. Do not change report_sent_packet_with_token globally — control-packet first sends legitimately need the insert path. Add a regression test for the get_resend → drop_stream → report_sent_packet ordering (the mirror of the existing test_drop_stream_releases_packet_out_for_resend, which stops at the await and never models the re-registration step).
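
The interleaving and the proposed fix can be replayed single-threaded in a toy tracker. This is only a sketch: `ResendTracker` is hypothetical, the `token` parameter of the proposed `refresh_sent_packet` is omitted, and the abort is modelled as a plain removal rather than a full `drop_stream`.

```rust
use std::collections::HashMap;

#[derive(Default)]
struct ResendTracker {
    pending_receipts: HashMap<u32, Vec<u8>>,
}

impl ResendTracker {
    /// Insert-capable registration (first sends): creates the entry if absent.
    fn report_sent_packet(&mut self, id: u32, payload: Vec<u8>) {
        self.pending_receipts.insert(id, payload);
    }

    /// Update-only refresh: if the packet is gone, drop the payload and
    /// return false instead of re-inserting it.
    fn refresh_sent_packet(&mut self, id: u32, payload: Vec<u8>) -> bool {
        match self.pending_receipts.get_mut(&id) {
            Some(slot) => {
                *slot = payload;
                true
            }
            None => false,
        }
    }

    /// Shared removal used here for both the abort and the late ACK.
    fn remove(&mut self, id: u32) -> u64 {
        self.pending_receipts.remove(&id).map_or(0, |p| p.len() as u64)
    }
}

/// Replays get_resend -> abort -> re-registration -> late ACK and returns
/// the bytes that the late ACK would release.
fn late_ack_bytes(use_refresh: bool) -> u64 {
    let mut t = ResendTracker::default();
    t.report_sent_packet(1, vec![0; 100]);
    // Step 1: the recv loop takes a clone for the resend; the entry stays.
    let resend_copy = t.pending_receipts.get(&1).cloned().unwrap_or_default();
    // Step 2: the abort removes the packet and releases its 100 bytes.
    let _released_by_abort = t.remove(1);
    // Step 3: the recv loop resumes and re-registers the packet.
    if use_refresh {
        let _still_tracked = t.refresh_sent_packet(1, resend_copy);
    } else {
        t.report_sent_packet(1, resend_copy);
    }
    // Step 4: a late ACK arrives.
    t.remove(1)
}
```

With the insert-capable path, `late_ack_bytes(false)` is `100` (the second release of the same bytes); with the update-only path, `late_ack_bytes(true)` is `0`.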


Should Fix (Important)

  1. Test the other 5 abort sites' release. Only the send_stream cwnd-wait site asserts flightsize() == 0 (outbound_stream.rs:1342). The pipe_stream cwnd-wait/inactivity/upstream-error and both mid-send sites are tested for error-type only — deleting any of their release_aborted_stream_flightsize calls would fail no test. The controllers are already constructed in those tests; adding assert_eq!(controller.flightsize(), 0) is nearly free.
  2. Guard the mid-send double-release invariant. outbound_stream.rs:355-360 / :724-729 pair drop_stream with an explicit release_flightsize(packet_size), correct only because confirm_receipt = vec![] forces the single-packet path where a failed send never registers. A future change attaching receipts to fragment sends (the existing send_fragment multi-packet path) would silently create a double-release. Add a debug_assert/load-bearing comment that the failed fragment is absent from the tracker at the release site, plus a mid-send-failure test asserting flightsize() == 0 exactly once.
  3. Update the docs in this PR (repo rule: fix stale docs in the same PR). .claude/rules/transport.md:86-104 (flight-size release invariant) and docs/architecture/transport/README.md:247-250 list only ACK + abandon as release paths; add the new drop_stream-on-stream-abort path so a future agent doesn't reintroduce a double-decrement.
  4. Rebase before merge. Branch is 1 commit behind origin/main; the "−608 lines" nat_subscription* deletion in the diff is a stale-base artifact (added to main by #4390 after the branch point), not a real removal. Per the per-content review rule, re-confirm CI green after rebase.

Consider

  • Add a 2-stream simulation/unit test for the actual user-visible symptom (A aborts on cwnd-wait → B transfers on the same connection because flight size was released). The mechanism is unit-covered; this would encode the bug's purpose end-to-end.
  • Rustdoc on drop_stream: note the plaintext-in / on-wire-out flight-size skew is intentional and matches the ACK path (verified pre-existing + consistent), so nobody "fixes" one path and desyncs them.
  • One line at the abort sites: release_flightsize deliberately does not clear loss_pause_cwnd (only an ACK does); the symptom is still closed because flight size drops to ~0.
  • StreamId 100k-per-thread block overflow now also affects flight-size release (extends blast radius; practically unreachable).

Verified correct (skeptical + code-first)

on_timeout fires exactly once per RTO (distinct tracker vs controller methods); Karn's mark_retransmitted preserved; no duplicate resend_queue entries; total_packets_sent semantics change is benign (only a >0 gate); drop_stream sweeps all structures with ACK-after-drop / abandon-after-drop pinned as no-ops; lock never held across an await or the controller call; encrypted-vs-plaintext accounting is pre-existing and consistent; the modified test_packet_lost is a legitimate (stronger) semantics update, not a weakened assertion.

Verdict

Needs Changes — Re-review Required After Fix. One blocking concurrency defect (resurrection double-release, finding 1) plus Should-Fix test/doc gaps. The mechanism is otherwise sound and well-verified. HEAD reviewed: a3b2153e. Re-run the review on the fixed + rebased HEAD before merge (transport / concurrency = high-risk).

[AI-assisted - Claude]

sanity force-pushed the fix/4345-flightsize branch from a3b2153 to 6af7b9c June 13, 2026 20:57

sanity commented Jun 13, 2026 (Collaborator, Author)

Stage-2 review fixes pushed (HEAD 6af7b9c6) — ready for re-review

Addressed the blocking defect and all should-fixes from the multi-model review, then rebased onto current main.

MUST-FIX — resend re-registration resurrection double-release

Root cause confirmed: the recv loop re-registered via the insert-capable report_sent_packet, which resurrected a packet that a concurrent drop_stream had already removed+released (a Control-tagged zombie), so a later ACK/abandon released its bytes a second time.

  • Added SentPacketTracker::refresh_sent_packet(idx, payload, token) -> bool: update-only, no-op (returns false) if the packet was already removed. The recv loop's two resend re-registration sites now use it. report_sent_packet* stay insert-capable for first sends / test re-registration.
  • Corrected the false "released exactly once / flight size unchanged" recv-loop comments to "only when still tracked".
  • Regression test test_refresh_after_drop_stream_does_not_resurrect (mirror of the out-for-resend test): re-register AFTER drop_stream, assert no resurrection + a later ACK and a later abandon each release ZERO. Verified it FAILS with a resurrect-on-absent revert. Plus test_report_sent_packet_resurrects_dropped_packet_footgun pinning why the recv loop must not use report_sent_packet.

SHOULD-FIX

  • flightsize()==0 asserts added after the task returns to the other 5 abort-site tests (pipe cwnd-wait / inactivity / inbound-error + send_stream & pipe mid-send failure); added pipe_stream_mid_send_failure_releases_flightsize. Deleting any release call now fails a test.
  • debug_assert (via SentPacketTracker::contains_packet) at both mid-send sites that the failed fragment is absent from the tracker — guards a future multi-packet wiring from a silent double-release; load-bearing comment explains the single-packet (confirm_receipt=[]) precondition.
  • Documented the drop_stream-on-stream-abort release path + the refresh_sent_packet-not-report_sent_packet rule in .claude/rules/transport.md (flight-size invariant) and docs/architecture/transport/README.md.
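
The mid-send guard can be illustrated with a small model. This is an assumption-laden sketch rather than the code in outbound_stream.rs: `MidSendModel` is hypothetical, its local `contains_packet` only mirrors the name of `SentPacketTracker::contains_packet`, and the sweep stands in for `drop_stream`.

```rust
use std::collections::HashMap;

/// Toy stand-in for one stream's send state; every name here is hypothetical.
struct MidSendModel {
    tracked: HashMap<u32, u64>, // packet id -> bytes the tracker holds
    flight_size: u64,
}

impl MidSendModel {
    fn contains_packet(&self, id: u32) -> bool {
        self.tracked.contains_key(&id)
    }

    /// Sends one fragment; `send_ok == false` models a failed UDP send.
    /// On failure the whole stream is aborted and its flight size released.
    fn send_fragment(&mut self, id: u32, bytes: u64, send_ok: bool) -> Result<(), &'static str> {
        self.flight_size += bytes; // on_send runs before the send itself
        if !send_ok {
            // Guard: the failed fragment was never registered, so releasing its
            // bytes explicitly cannot overlap with what the sweep returns.
            debug_assert!(!self.contains_packet(id), "failed fragment is tracked");
            let swept: u64 = self.tracked.drain().map(|(_, b)| b).sum();
            self.flight_size = self.flight_size.saturating_sub(swept + bytes);
            return Err("send failed");
        }
        self.tracked.insert(id, bytes);
        Ok(())
    }
}

/// Returns (first send ok, second send failed, final flight size).
fn mid_send_failure_demo() -> (bool, bool, u64) {
    let mut m = MidSendModel { tracked: HashMap::new(), flight_size: 0 };
    let first_ok = m.send_fragment(1, 1130, true).is_ok();
    let second_failed = m.send_fragment(2, 1130, false).is_err();
    (first_ok, second_failed, m.flight_size)
}
```

`mid_send_failure_demo()` returns `(true, true, 0)`. If a future change registered the fragment before the send could fail, the `debug_assert!` would fire in debug builds instead of letting the explicit release double-count.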

Verification

Rebased onto main (clean, no conflicts; the stale −608 nat_subscription artifact is gone). cargo fmt --check, cargo clippy --tests -D warnings, and cargo test -p freenet --lib all green on the rebased HEAD (3038 lib / 657 transport, 0 failures, 17 pre-existing ignores).

Still DRAFT — not merging. Re-review welcome on 6af7b9c6.

[AI-assisted - Claude]

sanity and others added 5 commits June 13, 2026 18:59
Flight size is a single connection-wide counter, but SentPacketTracker's
pending receipts are keyed only by PacketId with opaque encrypted payloads —
nothing maps an in-flight packet's bytes back to the stream that owns them.
So when an outbound stream aborts (e.g. a cwnd-wait timeout), its in-flight
fragments keep pinning flight size until each one is ACKed or ages out via
MAX_PACKET_RETRANSMITS (~6s). On a lossy/aborted stream that stranded credit
starves every subsequent stream on the connection (issue #4345).

This is stage 1 of the fix: a pure, isolated MECHANISM with no behaviour
change and no wiring into the abort paths yet.

- Tag each tracked packet with an owning stream via a new PacketStream enum
  (Stream(StreamId) | Control sentinel). A new packet_streams map holds the
  tag and, unlike pending_receipts, survives the get_resend pop so resend
  re-registration preserves a fragment's stream. The tag is dropped in
  lockstep with pending_receipts on ACK and on abandon.
- Add report_sent_stream_packet for first sends of StreamFragments;
  report_sent_packet[_with_token] preserve an existing tag (resend
  re-registration) and otherwise default to Control.
- Add SentPacketTracker::drop_stream(stream_id) -> u64 that removes a stream's
  packets from ALL tracking structures and returns the total bytes removed.
  Because the packets are gone from pending_receipts, no later ACK or abandon
  can release their bytes again — drop_stream is double-decrement-safe.
- Thread the owning stream through packet_sending: the four StreamFragment
  send sites pass Stream(id); noop / short-message / handshake sites pass
  Control. Resend re-registration and handshake sends keep their existing
  Control-default entry points.

drop_stream is #[allow(dead_code)] in this stage; it is wired into the
outbound-stream abort paths in stage 2, after the double-decrement safety is
reviewed in isolation.

Refs #4345

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Pin the drop_stream mechanism and, above all, its no-double-decrement
invariant, in isolation before any abort-path wiring:

- exact byte total of the dropped stream's packets
- ACK after drop_stream is a no-op (no second flight-size release)
- abandon path after drop_stream is a no-op (dropped packets never resend)
- interleaved streams A/B: drop_stream(A) leaves B's accounting intact
- control (non-stream) packets are untouched by drop_stream
- unknown / already-drained stream id -> 0, idempotent, no panic
- boundaries: single packet, 500 packets, zero-length packet
- sweeps a packet that lives in BOTH pending and the resend machinery
  (and asserts resend re-registration preserves the stream tag)
- an already-abandoned packet is not re-released by a later drop_stream

All run on VirtualTime via the existing mock_sent_packet_tracker harness.

Refs #4345

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Stage 2 of the #4345 flight-size fix: wire drop_stream into the outbound
stream abort paths so an aborted stream's in-flight bytes are released
immediately, and close the resend-gap race that wiring exposed.

THE PROBLEM
Flight size is a single connection-wide counter. When an outbound stream
aborts (cwnd-wait timeout, upstream stall/error, mid-send failure) its
already-sent, unacked fragments stay counted in flight size until each one is
ACKed or ages out via MAX_PACKET_RETRANSMITS (~6s). FixedRate's loss-pause
caps cwnd at the frozen flight size, so a single aborted stream starves every
subsequent stream on the connection — the #4345 "cwnd wait timeout / no
fragments received" failure for large multi-fragment GETs on lossy paths.

THE RESEND-GAP RACE (the crux of stage-2 correctness)
send_stream / pipe_stream run in spawned tasks and share the
Arc<Mutex<SentPacketTracker>> with the per-connection recv loop. The recv
loop's resend cycle released the tracker lock across the UDP send_to().await:
get_resend removed the packet from pending_receipts, and the caller
re-registered it only after the await. A drop_stream landing in that window
saw no pending_receipts entry — released 0 bytes for a genuinely in-flight
packet AND stripped its stream tag — leaving the fragment pinned in flight
size (a partial defeat of the fix; safe direction, not a double-decrement).

THE FIX
- get_resend now KEEPS a resent packet in pending_receipts (clones the payload
  for the caller, refreshes its send-time in place, re-queues it) — the same
  keep-the-entry shape the TLP branch already used. The invariant becomes
  total: a packet is in pending_receipts iff it is in flight, so a concurrent
  drop_stream always sees and releases it. This does NOT touch flight size and
  does not reintroduce the forbidden decrement-on-RTO + re-add-on-resend pair
  (the congestion controller is untouched; this is pure tracker bookkeeping).
- Resend re-registration via report_sent_packet is now an idempotent in-place
  refresh for an already-tracked packet (no duplicate resend-queue entry, no
  total_packets_sent bump, stream tag preserved). The recv loop is otherwise
  unchanged.
- New release_aborted_stream_flightsize() helper calls drop_stream under the
  tracker lock, drops the lock, then release_flightsize(returned) on the
  congestion controller — mirroring the ACK path's lock ordering (tracker lock
  never held across the controller call, never across an await). Wired into all
  six stream-failing return sites in send_stream and pipe_stream. The two
  mid-send-failure sites additionally release the current fragment's on_send
  bytes, which were added to flight size just before a send that then failed so
  the packet was never tracked.
- drop_stream loses its stage-1 #[allow(dead_code)] now that it is called.

drop_stream returns the encrypted tracker-payload byte sum, matching exactly
what an ACK or Abandon for those same packets would have released — so the
abort substitutes cleanly for the ACKs that will never come, consistent with
the existing release accounting.

TESTS
- cwnd_wait_timeout_releases_stranded_flightsize: reproduces the core symptom —
  fragments fill flight size to ~cwnd, no ACKs, the next fragment's cwnd wait
  times out; asserts flight size returns to 0 on abort. Verified to FAIL
  without the wiring (flight size pinned at 2260B).
- cwnd_wait_timeout_zero_inflight_is_noop: abort with nothing in flight → no-op.
- test_drop_stream_releases_packet_out_for_resend: the exact resend-gap
  interleaving — Resend handed out, NOT re-registered, then drop_stream must
  still release the packet.
- test_packet_lost updated to the new keep-the-entry-on-resend semantics.

Part of #4345

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ase (#4345)

Addresses the stage-2 multi-model review.

MUST-FIX (blocking) — resend re-registration could resurrect a drop_stream'd
packet, double-releasing its flight-size bytes:

  1. recv loop get_resend() returns Resend(idx,…), keeps the entry, releases the
     tracker lock, awaits send_to().
  2. the spawned abort task runs drop_stream(stream_id), removing idx and
     release_flightsize-ing its bytes.
  3. recv loop resumes → report_sent_packet(idx) → report_sent_packet_inner finds
     idx ABSENT → INSERT branch → re-inserts idx tagged Control (a zombie).
  4. a later ACK/abandon of the zombie releases its bytes a SECOND time
     (flight-size under-count) and breaks "in pending_receipts iff in flight".

Fix: add SentPacketTracker::refresh_sent_packet(idx, payload, token) -> bool,
which updates the entry ONLY IF still present and is a no-op (drops the payload,
returns false) if drop_stream/Abandon already removed it. The recv loop's two
resend re-registration sites now use it instead of report_sent_packet. The
insert-capable report_sent_packet* are unchanged (control-packet first sends and
test re-registration legitimately need insert). Corrected the now-false
"released exactly once / flight size unchanged" recv-loop comments to
"only when the packet is still tracked".

Regression tests:
- test_refresh_after_drop_stream_does_not_resurrect: the MIRROR of
  test_drop_stream_releases_packet_out_for_resend — re-register AFTER drop_stream
  and assert (a) idx is NOT resurrected, (b) a later ACK and a later abandon each
  release ZERO. Verified to FAIL with a resurrect-on-absent revert.
- test_report_sent_packet_resurrects_dropped_packet_footgun: pins WHY the recv
  loop must not use report_sent_packet (documents the insert-resurrect footgun).

SHOULD-FIX:
- Added flightsize()==0 asserts after the task returns to the other 5 abort-site
  tests (pipe_stream cwnd-wait / inactivity / inbound-error + send_stream and
  pipe_stream mid-send failure), so deleting any release call now fails a test.
  Added pipe_stream_mid_send_failure_releases_flightsize.
- The two mid-send double-release sites are correct only because confirm_receipt
  is empty (single-packet path → a failed send never registers). Added a
  debug_assert (via SentPacketTracker::contains_packet) at both sites that the
  failed fragment is absent from the tracker, guarding a future multi-packet
  wiring from silently creating a double-release.
- Documented the drop_stream-on-stream-abort release path in
  .claude/rules/transport.md (flight-size release invariant) and
  docs/architecture/transport/README.md — both previously listed only ACK +
  abandon. Added the refresh_sent_packet-not-report_sent_packet rule.

Part of #4345

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…4345)

Final stage-2 review polish (no behavior change):

- drop_stream rustdoc: state explicitly that summing the ENCRYPTED stored
  payload length (not plaintext on_send bytes) is INTENTIONAL and matches the
  ACK and Abandon paths — all three release the same encrypted byte count for a
  given packet, so drop_stream releases exactly what those packets' ACK/Abandon
  would have. An external reviewer read this as an over-release bug; it is not.
  Note that the plaintext-add/encrypted-release asymmetry is a pre-existing
  convention across all three paths (tracked in #4402, out
  of scope here) and that a future change must not "fix" one path in isolation.

- recv-loop refresh_sent_packet sites: make the discarded bool explicit. The
  success site now binds it and trace-logs the deliberate no-op-on-drop (a
  resurrection-safe path, not an error); the error site uses `let _ =` with a
  comment. Makes the intent obvious to readers.

Part of #4345

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
sanity force-pushed the fix/4345-flightsize branch from 6af7b9c to de2689f June 14, 2026 00:03

sanity commented Jun 14, 2026 (Collaborator, Author)

Final review polish pushed (HEAD de2689fd) — re-rebased onto current main

Two non-behavioral items from the re-review, then a fresh rebase:

  1. drop_stream rustdoc now states explicitly that summing the ENCRYPTED stored-payload length (not plaintext on_send bytes) is intentional and matches the ACK and Abandon paths: drop_stream releases exactly what those packets' ACK/Abandon would have, so it is not an over-release. Notes that the plaintext-add/encrypted-release asymmetry is a pre-existing convention across all three paths (tracked in #4402, "Transport flight-size accounting mixes plaintext (on_send) and encrypted (release) byte counts", out of scope here) and that a future change must not "fix" one path in isolation.
  2. recv-loop refresh_sent_packet sites: the discarded bool is now explicit — the success site binds it and trace-logs the deliberate no-op-on-drop (resurrection-safe path, not an error); the error site uses let _ = with a comment.

Rebase: clean fast-forward rebase onto current origin/main (no conflicts; #4396/#4397/#4398 don't touch transport). git diff origin/main is still only the 5 transport/doc files. Branch is 5 ahead / 0 behind.

Verification on de2689fd: cargo fmt --check, cargo clippy -p freenet --tests -- -D warnings, and cargo test -p freenet --lib all green (3042 lib passed / 0 failed / 17 pre-existing ignores).

Still DRAFT — not merging. Ready for the clean external pass + CI-green check on this exact SHA.

[AI-assisted - Claude]

sanity marked this pull request as ready for review June 14, 2026 00:22
sanity enabled auto-merge June 14, 2026 00:22

github-actions commented (Contributor)

I have all the information I need. Here is my review:


Rule Review: No blocking issues; two minor style notes

Rules checked: transport.md, testing.md, code-style.md, git-workflow.md
Files reviewed: 5 (.claude/rules/transport.md, peer_connection.rs, outbound_stream.rs, sent_packet_tracker.rs, docs/architecture/transport/README.md)


Warnings

None.


Info

  • crates/core/src/transport/sent_packet_tracker.rs:810: .expect() used in production code (inside get_resend, the non-test resend branch). The invariant is sound (contains_key was checked 3 lines above and &mut self ensures no interleaving), but the code-style rule says "explicit match/if-let, never .unwrap()" for production paths. A match … { None => unreachable!() } or splitting into a get_mut with a documented unreachable! would satisfy the letter of the rule. (rule: code-style.md)

  • crates/core/src/transport/sent_packet_tracker.rs (test test_drop_stream_sweeps_retransmitted_and_pending, re-registration comment) — The inline comment says "Re-register exactly as production does (Control-default path)" but the actual path taken is the stream-tag-preserved path: report_sent_packet_with_token finds PacketStream::Stream(stream) in packet_streams (the packet is still in pending_receipts under the new keep-the-entry semantics) and report_sent_packet_inner hits the early-refresh branch, leaving the tag intact. The "Control-default" label is misleading; neither the default nor a new Control insert fires. The assertion below still passes because the tag is preserved — just via a different branch than the comment implies. Worth correcting to avoid future confusion. (rule: code-style.md)
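
For the first note, one shape that avoids both `.expect()` and `unreachable!()` is a single `if let` on `get_mut`. This is only a sketch of the pattern: `refresh_for_resend`, the `(payload, sent_at)` tuple layout, and the integer timestamp are hypothetical and do not reflect the actual body of `get_resend`.

```rust
use std::collections::HashMap;

/// Refreshes a tracked packet's send time in place and returns a clone of its
/// payload for the resend, or None if the packet is no longer tracked.
fn refresh_for_resend(
    pending: &mut HashMap<u32, (Vec<u8>, u64)>,
    id: u32,
    now: u64,
) -> Option<Vec<u8>> {
    // One lookup: no separate contains_key check followed by expect().
    if let Some((payload, sent_at)) = pending.get_mut(&id) {
        *sent_at = now;
        Some(payload.clone())
    } else {
        None
    }
}
```

The absent case becomes an ordinary `None` that the caller has to handle, so the invariant no longer depends on a check made a few lines earlier.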


Summary: The fix correctly implements the three-release-path flight-size invariant (ACK / abandon / drop_stream), the resurrection-safety guard (refresh_sent_packet vs. insert-capable report_sent_packet), and the keep-the-entry-across-resend invariant. Test coverage is thorough: the primary regression (cwnd_wait_timeout_releases_stranded_flightsize), zero-in-flight edge case, mid-send failure paths for both send_stream and pipe_stream, the resend-gap race, and the double-release no-op proof are all present. Documentation and rule files are updated in-step. No blocking issues.


Rule review against .claude/rules/. WARNING findings block merge.

sanity added this pull request to the merge queue Jun 14, 2026
Merged via the queue into main with commit 71bfe04 Jun 14, 2026
20 checks passed
sanity deleted the fix/4345-flightsize branch June 14, 2026 00:38
iduartgomez added a commit that referenced this pull request Jun 14, 2026
…an scaffolding

streaming-infrastructure.md had drifted from main on several counts and was
framed as a phase-by-phase IMPLEMENTATION PLAN even though streaming has
shipped and is active. This:

- Removes the stale plan scaffolding: the Implementation Status (Phase 1-6)
  table and the Phase 5 (Capability Negotiation) / Phase 6 (Rollout) NOT-STARTED
  sections. Streaming is live and threshold-gated, so a capability-negotiation
  + shadow-rollout plan no longer reflects reality. De-'Phase N's the component
  headings so the file reads as a design reference, not a roadmap.
- Drops the obsolete `streaming_enabled: bool` config (removed from production
  code); the sole gate is `streaming_threshold` (default 64KB) via
  `operations::should_use_streaming`. Documents that.
- Corrects the constants (ORPHAN_STREAM_TIMEOUT / STREAM_CLAIM_TIMEOUT 30s/10s
  -> 60s), the buffer type (OnceLock<Bytes> -> AtomicPtr<Bytes>), and adds the
  flight-size-release-on-abort (#4345/#4393) and OutboundStreamFailed (#4374)
  sections, with Phase-4 handler citations pointing at the real op_ctx_task.rs
  drivers (not the enum-def modules).
- README: drops the stale '(Phase 1)' tag from the streaming row.

Documents already-merged #4345 / #4393 / #4374 / #1454 / #4307. No behavior change.