chore: add tracing instrumentation to mem_wal module #6430

Open
hamersaw wants to merge 1 commit into lance-format:main from hamersaw:chore/instrument-mem-wal

Contributor

@hamersaw hamersaw commented Apr 7, 2026

Summary

  • Adds #[instrument] attributes from the tracing crate to key functions across the mem_wal module
  • Covers write path (RegionWriter::open, put, close), flush path (MemTableFlusher::flush, flush_with_indexes), WAL operations, manifest store, memtable inserts, scanner/planner, point lookups, and vector search
  • Uses appropriate trace levels (info for high-level operations, debug for internals) with relevant fields (region_id, epoch, row counts, batch counts)

Test plan

  • cargo check passes — no functional changes, only attribute additions
  • Existing mem_wal tests continue to pass
  • Tracing output verified with RUST_LOG=debug showing instrumented spans

🤖 Generated with Claude Code

@github-actions github-actions bot added the chore label Apr 7, 2026
Add #[instrument] spans to key functions across the mem_wal subsystem
(write, flush, scan, index, manifest) for improved observability.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@hamersaw hamersaw force-pushed the chore/instrument-mem-wal branch from 626d2f1 to ee8353c Compare April 7, 2026 20:40
@codecov

codecov bot commented Apr 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Member

@westonpace westonpace left a comment

This is great.

Generally I try to always use skip_all, since skip(...) has bitten us on more than one occasion. The problem is that it is very easy to add a new argument and forget to add it to the skip list, and then you never notice until some very difficult-to-debug perf issue pops up.

That being said, with code being generated by LLMs now, it might not be so easy to "forget to add it to the skip list". However, given that you're already specifying fields manually, I wonder if it would be easy enough to just specify all fields manually (allow list style) and use skip_all everywhere?

Comment on lines +188 to 189
#[instrument(level = "info", skip(self), fields(has_filter = self.filter.is_some(), limit = self.limit, num_shards = self.shard_snapshots.len()))]
pub async fn try_into_stream(&self) -> Result<SendableRecordBatchStream> {
Member

It might be nice to generate something similar to the plan_run events that we have on the normal scanner (a tracing::info!(…) call).

Comment on lines 946 to 1264
@@ -1102,6 +1104,7 @@ impl ShardWriter {
/// Fencing is detected lazily during WAL flush via atomic writes.
/// If another writer has taken over, the WAL flush will fail with
/// `AlreadyExists`, indicating this writer has been fenced.
#[instrument(level = "info", skip(self, batches), fields(batch_count = batches.len(), shard_id = %self.config.shard_id))]
pub async fn put(&self, batches: Vec<RecordBatch>) -> Result<WriteResult> {
if batches.is_empty() {
return Err(Error::invalid_input("Cannot write empty batch list"));
@@ -1257,6 +1260,7 @@ impl ShardWriter {
/// Close the writer gracefully.
///
/// Flushes pending data and shuts down background tasks.
#[instrument(level = "info", skip(self), fields(shard_id = %self.config.shard_id, epoch = self.epoch))]
pub async fn close(self) -> Result<()> {
Member

I wonder if we want to give these traces names like sw_open, sw_put, sw_close? A lot of tracing tools will just show the span name unless you drill down, and open/put/close are going to look like regular filesystem calls.

2 participants