refactor(blk): offload virtqueue drain to worker thread #95

Open
agicy wants to merge 1 commit into syswonder:main from agicy:refactor-blk-worker
Conversation

@agicy agicy commented May 26, 2026

Contributor

Summary

This PR refactors the virtio-blk worker threading model to eliminate the intermediate producer-consumer queue between the main thread and the I/O worker thread.

Before

Guest kicks virtqueue
       ↓
  main thread (epoll)
       ↓
  parse descriptors → push to procq (TAILQ, mutex-protected)
       ↓
  worker thread → pop from procq → perform I/O → update used ring → inject IRQ
  • The main thread parsed every descriptor chain and pushed blkp_req structs onto a cross-thread queue (procq).
  • The worker thread popped from the queue, performed disk I/O, and injected IRQs.
  • Both threads contended on procq's mutex for every single I/O request.
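For illustration, the old hand-off can be sketched as a mutex-protected TAILQ. This is a minimal sketch under stated assumptions, not the code this PR removes: `struct req`, `struct procq`, `procq_push`, `procq_pop`, and the `lock_ops` counter are hypothetical names, and the counter exists only to make the per-request locking cost visible.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a parsed request; not the real blkp_req. */
struct req {
    int id;
    TAILQ_ENTRY(req) link;
};

TAILQ_HEAD(req_list, req);

/* Hypothetical cross-thread queue shared by the main and worker threads. */
struct procq {
    struct req_list head;
    pthread_mutex_t mtx;
    unsigned long   lock_ops;   /* illustration only: counts lock acquisitions */
};

void procq_init(struct procq *q)
{
    assert(q != NULL);
    TAILQ_INIT(&q->head);
    pthread_mutex_init(&q->mtx, NULL);
    q->lock_ops = 0;
}

/* Producer side (main thread in the old model): one lock per request. */
int procq_push(struct procq *q, int id)
{
    struct req *r = malloc(sizeof(*r));

    if (r == NULL)
        return -1;
    r->id = id;

    pthread_mutex_lock(&q->mtx);
    q->lock_ops++;
    TAILQ_INSERT_TAIL(&q->head, r, link);
    pthread_mutex_unlock(&q->mtx);
    return 0;
}

/* Consumer side (worker thread in the old model): one more lock per request.
 * Returns the request id, or -1 if the queue is empty. */
int procq_pop(struct procq *q)
{
    struct req *r;
    int id = -1;

    pthread_mutex_lock(&q->mtx);
    q->lock_ops++;
    r = TAILQ_FIRST(&q->head);
    if (r != NULL)
        TAILQ_REMOVE(&q->head, r, link);
    pthread_mutex_unlock(&q->mtx);

    if (r != NULL) {
        id = r->id;
        free(r);
    }
    return id;
}
```

In this sketch, N requests cost 2N lock acquisitions (one push and one pop each), which is the per-request contention described above.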

After

Guest kicks virtqueue
       ↓
  main thread → signal condvar (no virtqueue access)
       ↓
  worker thread → drain virtqueue directly → perform I/O → update used ring → inject IRQ
  • The main thread only signals the worker via pthread_cond_signal — it never touches the virtqueue.
  • The worker thread owns the virtqueue exclusively: it parses descriptors, performs I/O, adjusts the used ring, and injects IRQs, all from a single thread.
  • Synchronization is reduced to one signal/wakeup per guest notification (batch), rather than per-request lock/unlock on the procq.
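The new model can be sketched roughly as follows. This is a simplified illustration, not the PR's implementation: `struct blk_worker`, `blk_worker_start`, `blk_worker_kick`, `blk_worker_stop`, and `drain_queue` are hypothetical names, and an atomic counter (`avail`) stands in for the guest-written available ring.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical worker state; field names are assumptions for this sketch. */
struct blk_worker {
    pthread_t       tid;
    pthread_mutex_t mtx;
    pthread_cond_t  cond;
    bool            kicked;     /* guest notified since the last drain */
    bool            stop;       /* shutdown requested */
    atomic_int      avail;      /* stand-in for pending virtqueue entries */
    int             completed;  /* touched only by the worker thread */
    int             wakeups;    /* number of drain batches */
};

/* Worker-only: process everything currently pending, with no lock held. */
static void drain_queue(struct blk_worker *w)
{
    while (atomic_load(&w->avail) > 0) {
        atomic_fetch_sub(&w->avail, 1);
        w->completed++;         /* real code would do I/O and update the used ring */
    }
    w->wakeups++;
}

static void *worker_main(void *arg)
{
    struct blk_worker *w = arg;

    assert(w != NULL);
    pthread_mutex_lock(&w->mtx);
    for (;;) {
        /* Re-check the flags in a loop to tolerate spurious wakeups. */
        while (!w->kicked && !w->stop)
            pthread_cond_wait(&w->cond, &w->mtx);
        if (!w->kicked)
            break;              /* stop requested and no kick pending */
        w->kicked = false;
        pthread_mutex_unlock(&w->mtx);
        drain_queue(w);         /* one batch per wakeup */
        pthread_mutex_lock(&w->mtx);
    }
    pthread_mutex_unlock(&w->mtx);
    return NULL;
}

int blk_worker_start(struct blk_worker *w)
{
    pthread_mutex_init(&w->mtx, NULL);
    pthread_cond_init(&w->cond, NULL);
    w->kicked = false;
    w->stop = false;
    atomic_init(&w->avail, 0);
    w->completed = 0;
    w->wakeups = 0;
    return pthread_create(&w->tid, NULL, worker_main, w);
}

/* Main thread: signal only, never touch the queue itself. */
void blk_worker_kick(struct blk_worker *w)
{
    pthread_mutex_lock(&w->mtx);
    w->kicked = true;
    pthread_cond_signal(&w->cond);
    pthread_mutex_unlock(&w->mtx);
}

void blk_worker_stop(struct blk_worker *w)
{
    pthread_mutex_lock(&w->mtx);
    w->stop = true;
    pthread_cond_signal(&w->cond);
    pthread_mutex_unlock(&w->mtx);
    pthread_join(w->tid, NULL);
}
```

The mutex here protects only the `kicked` and `stop` flags, so it is taken once per notification rather than once per request. Because the worker checks `kicked` before `stop`, a kick that arrives before shutdown is still drained.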

Performance

Test setup: ramdisk backend, fio + io_uring, 120s runtime, RK3588.

| Metric | Old | New | Change |
|---|---|---|---|
| IOPS (4k rand read, qd=64) | 89.0k | 93.7k | +5.3% |
| Throughput (1M seq read, qd=16) | 3831 MiB/s | 4026 MiB/s | +5.1% |
| P99 tail latency (4k rand, qd=64) | 906 us | 775 us | -14.5% |
| P99.99 tail latency (4k rand, qd=64) | 113.8 ms | 76.0 ms | -33.2% |

Bug Fixes

While refactoring, several pre-existing issues were fixed:

  • Crash on early close: Guard pthread_join so virtio_blk_close doesn't join an uninitialized thread handle if the worker was never started.
  • Incorrect written_len on read error: When preadv fails, written_len no longer carries -1 into the used-ring update (was reporting 0 bytes despite the status byte being written).
  • Missing written_len for GET_ID: Now reports correct data length via the used ring so the guest sees the full device ID string.
  • close(-1) on open failure: Removed spurious close(-1) in the error path when open() fails.
  • Block device size detection: When fstat returns st_size = 0 (real block devices), fall back to ioctl(BLKGETSIZE64).
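The last two fixes can be illustrated together. The following is a Linux-only sketch under stated assumptions, not the PR's code: `open_backing` is a hypothetical helper, and the `S_ISBLK` check before the ioctl is an added assumption rather than something the description states.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <linux/fs.h>   /* BLKGETSIZE64 */

/* Hypothetical helper: open a backing file or block device and report its
 * size in bytes. Returns the fd on success, -1 on failure. */
int open_backing(const char *path, uint64_t *size_out)
{
    struct stat st;
    uint64_t bytes;
    int fd;

    assert(path != NULL && size_out != NULL);

    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;              /* open() failed: there is no fd to close */

    if (fstat(fd, &st) < 0)
        goto fail;

    bytes = (uint64_t)st.st_size;
    if (bytes == 0 && S_ISBLK(st.st_mode)) {
        /* fstat reported st_size = 0 for a block device, so ask the
         * kernel for the device size instead. */
        if (ioctl(fd, BLKGETSIZE64, &bytes) < 0)
            goto fail;
    }

    *size_out = bytes;
    return fd;

fail:
    close(fd);                  /* only reached with a valid fd */
    return -1;
}
```

Keeping the `open()` failure path separate from the `fail:` label means `close()` is only ever called on a descriptor that was actually opened.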

@agicy agicy force-pushed the refactor-blk-worker branch from 47fa734 to 10eebf0 Compare June 9, 2026 12:44
@agicy agicy force-pushed the refactor-blk-worker branch from 10eebf0 to f90ae43 Compare June 9, 2026 12:54
@agicy agicy requested a review from li041 June 9, 2026 12:56
@agicy agicy marked this pull request as ready for review June 9, 2026 12:56