Skip to content

shim/streaming: return Stream on first bridge direction completion#167

Closed
eginez wants to merge 4 commits intocontainerd:mainfrom
eginez:fix-stream-return-on-first-direction
Closed

shim/streaming: return Stream on first bridge direction completion#167
eginez wants to merge 4 commits intocontainerd:mainfrom
eginez:fix-stream-return-on-first-direction

Conversation

@eginez
Copy link
Copy Markdown

@eginez eginez commented Apr 28, 2026

Summary

The shim's TTRPC streaming Stream handler waits for both bridge
directions (ttrpc->vm and vm->ttrpc) to finish before returning.
When the VM closes its side first — the common case for short
transfers — the ttrpc->vm goroutine remains parked on srv.Recv,
the server stream never closes, and the daemon's ReceiveStream
never observes EOF.

  • Return on the first direction's completion (was: wait for both).
  • The other direction unblocks via deferred vmConn.Close and
    ttrpc's per-handler context cancellation.

Why

Testing

  • New TestStreamReturnsOnVMClose — VM closes its side; handler
    must release the server stream within 3s. Times out without the
    fix.
  • New TestMultipleStreamsDoNotAccumulate — five sequential
    streams, each released by the VM, each must close promptly.
  • go test ./plugins/shim/streaming/... clean. (Some ./integration/...
    tests require libkrun.dylib and were skipped locally; CI covers
    those.)

The Stream handler waited for both bridge directions to finish. When
the VM closes its side first, the ttrpc->vm goroutine stays blocked
on srv.Recv; the server stream never closes and the daemon's
ReceiveStream never sees EOF.

Return on the first direction. The other unblocks via deferred
vmConn.Close and ttrpc's per-handler context cancellation.

Signed-off-by: Esteban Ginez <175813+eginez@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 28, 2026 18:00
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates the shim’s TTRPC streaming Stream handler to return when either bridge direction completes, ensuring the server-side stream closes promptly (preventing daemon-side ReceiveStream hangs when the VM closes first).

Changes:

  • Change service.Stream to return after the first bridge direction completes (instead of waiting for both).
  • Add end-to-end tests validating stream closure on VM-side close and repeated sequential streams.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
plugins/shim/streaming/plugin.go Returns from Stream on first bridge completion to ensure the server stream closes and unblocks daemon ReceiveStream.
plugins/shim/streaming/plugin_test.go Adds regression/stress tests that reproduce the hang and validate prompt stream release.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread plugins/shim/streaming/plugin_test.go Outdated
Comment thread plugins/shim/streaming/plugin_test.go Outdated
Comment thread plugins/shim/streaming/plugin_test.go Outdated
Comment thread plugins/shim/streaming/plugin_test.go Outdated
- Drop unused ptypes import and the dead var _ = ptypes.Empty{} guard.
- Check MarshalAnyToProto error in TestMultipleStreamsDoNotAccumulate
  for consistency with TestStreamReturnsOnVMClose.
- Capture ttrpc.Server.Serve error and surface it on cleanup so test
  infra failures are not silently masked. Allow ErrServerClosed and
  net.ErrClosed as expected outcomes when the cleanup closes the
  server.

Signed-off-by: Esteban Ginez <175813+eginez@users.noreply.github.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread plugins/shim/streaming/plugin_test.go Outdated
Comment thread plugins/shim/streaming/plugin_test.go Outdated
Comment thread plugins/shim/streaming/plugin_test.go Outdated
- TestStreamReturnsOnVMClose: fail on non-EOF close errors instead of
  silently logging.
- TestMultipleStreamsDoNotAccumulate: assert each Recv returns an
  EOF-like close, not just any value.
- Use a per-iteration StreamInit ID to avoid masking ID-uniqueness
  bugs.

Signed-off-by: Esteban Ginez <175813+eginez@users.noreply.github.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread plugins/shim/streaming/plugin.go Outdated
@dmcgowan
Copy link
Copy Markdown
Member

I was able to implement some stress testing on the bidirection stream and this issue fixed one case, but regressed in another. I'll keep look into this one.

Signed-off-by: Esteban Ginez <175813+eginez@users.noreply.github.com>
@austinvazquez
Copy link
Copy Markdown
Member

Thanks @eginez for the work here. Superseded by #171

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants