Rust decoder shouldn't assume root payload is at position 0#2468

Open
simathih wants to merge 1 commit into open-telemetry:main from simathih:main

@simathih simathih commented Mar 31, 2026

Change Summary

Rust decoder shouldn't assume root payload is at position 0.
We know what root payload we're expecting so we can update the code to fill in the appropriate Logs/Metrics/Traces construct and then check for root payload presence at the end.

What issue does this PR close?

#2363

  • Closes #NNN

How are these changes tested?

Are there any user-facing changes?

@simathih simathih requested a review from a team as a code owner March 31, 2026 06:30
linux-foundation-easycla bot commented Mar 31, 2026

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: simathih / name: Siddhartha Mathiharan (d51322f)

@github-actions github-actions bot added the rust Pull requests that update Rust code label Mar 31, 2026
@simathih simathih changed the title from "Rust decoder shouldn't assumes root payload is at position 0" to "Rust decoder shouldn't assume root payload is at position 0" Mar 31, 2026
codecov bot commented Mar 31, 2026

Codecov Report

❌ Patch coverage is 72.72727% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.17%. Comparing base (f8611f8) to head (d51322f).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2468      +/-   ##
==========================================
- Coverage   88.18%   88.17%   -0.01%     
==========================================
  Files         604      604              
  Lines      214589   214587       -2     
==========================================
- Hits       189232   189214      -18     
- Misses      24831    24847      +16     
  Partials      526      526              
Components Coverage Δ
otap-dataflow 90.11% <72.72%> (-0.02%) ⬇️
query_abstraction 80.61% <ø> (ø)
query_engine 90.74% <ø> (ø)
syslog_cef_receivers ∅ <ø> (∅)
otel-arrow-go 52.44% <ø> (ø)
quiver 91.94% <ø> (ø)

Contributor

lquerel commented Mar 31, 2026

@JakeDern could you review this PR? Thanks.

    Err(Error::RecordBatchNotFound {
        payload_type: expected,
    })
}
Member

nit - this can be simplified a bit to be more idiomatic Rust:

    if records.arrow_payloads.iter().any(|payload| {
        ArrowPayloadType::try_from(payload.r#type) == Ok(expected)
    }) {
        Ok(())
    } else {
        Err(Error::RecordBatchNotFound {
            payload_type: expected,
        })
    }

Member

lalitb commented Mar 31, 2026

Welcome, and thanks for the first PR.

  • Consider adding a regression test for the case where the root payload is not at index 0 in arrow_payloads, since that is the behavior this change is fixing.
  • Also, it looks like CI is currently failing on formatting, so please run cargo fmt.
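
A regression test along those lines could look roughly like the sketch below. The types here are simplified, hypothetical stand-ins for the real protobuf and pdata types (the numeric payload values are made up for illustration), and `check_payload_type_present` mirrors the helper discussed in this PR rather than reproducing its actual signature.

```rust
use std::convert::TryFrom;

// Hypothetical stand-in: the numeric values are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArrowPayloadType {
    ResourceAttrs,
    LogAttrs,
    Logs,
}

impl TryFrom<i32> for ArrowPayloadType {
    type Error = i32;

    fn try_from(value: i32) -> Result<Self, Self::Error> {
        match value {
            1 => Ok(Self::ResourceAttrs),
            2 => Ok(Self::LogAttrs),
            3 => Ok(Self::Logs),
            other => Err(other),
        }
    }
}

// Hypothetical stand-in for one entry of `arrow_payloads`.
pub struct ArrowPayload {
    pub r#type: i32,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Error {
    RecordBatchNotFound { payload_type: ArrowPayloadType },
}

/// Succeeds if `expected` appears anywhere in `payloads`, not only at index 0.
pub fn check_payload_type_present(
    payloads: &[ArrowPayload],
    expected: ArrowPayloadType,
) -> Result<(), Error> {
    if payloads
        .iter()
        .any(|payload| ArrowPayloadType::try_from(payload.r#type) == Ok(expected))
    {
        Ok(())
    } else {
        Err(Error::RecordBatchNotFound { payload_type: expected })
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn root_payload_not_at_index_zero() {
        // The root (Logs) payload is deliberately placed last.
        let payloads = vec![
            ArrowPayload { r#type: 1 },
            ArrowPayload { r#type: 2 },
            ArrowPayload { r#type: 3 },
        ];
        assert_eq!(
            check_payload_type_present(&payloads, ArrowPayloadType::Logs),
            Ok(())
        );
    }
}
```

A companion case with the root payload missing entirely would pin down the error path as well.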

self.proto_buffer.clear();
self.logs_proto_encoder
    .encode(&mut otap_batch, &mut self.proto_buffer)?;
check_payload_type_present(records, ArrowPayloadType::Logs)?;
Contributor

@JakeDern JakeDern Mar 31, 2026

I don't think we need a separate ahead-of-time check for the root payload type. Most things are checked by from_record_messages, which will fail if any of the records are invalid for the signal or have a misaligned schema.

It currently does not fail if there's no root record batch for the signal (perhaps something we should consider adding), so I think we just need to check, after we create the otap batch, that otap_batch.get(ArrowPayloadType::Logs).is_some().
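
To make the shape of that post-construction check concrete, here is a minimal sketch. `LogsBatch`, `RecordBatch`, and `require_root` are hypothetical stand-ins, not the real pdata types; the only thing taken from the comment above is the idea of calling `get(...)` on the already-built batch and testing `is_some()`.

```rust
// Hypothetical stand-ins; the real OTAP batch types live in the pdata crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArrowPayloadType {
    Logs = 0,
    LogAttrs = 1,
}

/// Placeholder for an Arrow record batch.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RecordBatch(pub &'static str);

/// One slot per payload type, so a lookup is a plain array index.
#[derive(Debug, Default)]
pub struct LogsBatch {
    slots: [Option<RecordBatch>; 2],
}

impl LogsBatch {
    pub fn set(&mut self, ty: ArrowPayloadType, batch: RecordBatch) {
        self.slots[ty as usize] = Some(batch);
    }

    pub fn get(&self, ty: ArrowPayloadType) -> Option<&RecordBatch> {
        self.slots[ty as usize].as_ref()
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum Error {
    RecordBatchNotFound { payload_type: ArrowPayloadType },
}

/// Checks for the root payload after the batch has been built,
/// instead of scanning the incoming payload list beforehand.
pub fn require_root(batch: &LogsBatch) -> Result<(), Error> {
    if batch.get(ArrowPayloadType::Logs).is_some() {
        Ok(())
    } else {
        Err(Error::RecordBatchNotFound {
            payload_type: ArrowPayloadType::Logs,
        })
    }
}
```

Because the store is indexed by payload type, the order in which payloads arrived no longer matters once the batch exists.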

Contributor

(Similar feedback goes for the other signals as well)

Contributor

I'm confused at this point. @JakeDern, are you suggesting that we modify from_record_messages() to check for the root payload itself?

Contributor

@jmacd My main feedback for this PR specifically is that we don't need to pre-scan the array for the Logs payload type. Instead, we can call from_record_messages::<Logs>(), which will screen out invalid payload types for the signal. Then it's a quick O(1) check for the Logs payload type using get(). That covers all the behavior we have today.

Beyond this PR, I'm also raising the point that we should enforce that the root payload is always present in a more central way. Modifying from_record_messages to check for something like T::root_payload_type() (which would have to be added on OtapBatchStore) is probably better than doing the check in the decoder code but isn't a full solution to the problem as we have other paths for construction and modification i.e. set and remove APIs.
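
As a rough illustration of the `T::root_payload_type()` idea, the sketch below uses a heavily simplified, hypothetical version of the trait and of `from_record_messages` (the real function takes decoded record messages, not a slice of payload types). The methods `accepts`, `insert`, and `contains` are invented for the example.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArrowPayloadType {
    Logs,
    LogAttrs,
    Spans,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Error {
    UnexpectedPayload { payload_type: ArrowPayloadType },
    RecordBatchNotFound { payload_type: ArrowPayloadType },
}

/// Hypothetical, simplified shape of the store trait with the proposed
/// `root_payload_type` method added.
pub trait OtapBatchStore: Default {
    fn root_payload_type() -> ArrowPayloadType;
    fn accepts(ty: ArrowPayloadType) -> bool;
    fn insert(&mut self, ty: ArrowPayloadType);
    fn contains(&self, ty: ArrowPayloadType) -> bool;
}

#[derive(Debug, Default)]
pub struct Logs {
    present: Vec<ArrowPayloadType>,
}

impl OtapBatchStore for Logs {
    fn root_payload_type() -> ArrowPayloadType {
        ArrowPayloadType::Logs
    }

    fn accepts(ty: ArrowPayloadType) -> bool {
        matches!(ty, ArrowPayloadType::Logs | ArrowPayloadType::LogAttrs)
    }

    fn insert(&mut self, ty: ArrowPayloadType) {
        self.present.push(ty);
    }

    fn contains(&self, ty: ArrowPayloadType) -> bool {
        self.present.contains(&ty)
    }
}

/// Simplified constructor: rejects payloads that are invalid for the signal,
/// then enforces that the signal's root payload is present.
pub fn from_record_messages<T: OtapBatchStore>(
    types: &[ArrowPayloadType],
) -> Result<T, Error> {
    let mut store = T::default();
    for &ty in types {
        if !T::accepts(ty) {
            return Err(Error::UnexpectedPayload { payload_type: ty });
        }
        store.insert(ty);
    }
    let root = T::root_payload_type();
    if store.contains(root) {
        Ok(store)
    } else {
        Err(Error::RecordBatchNotFound { payload_type: root })
    }
}
```

As noted above, this only covers one construction path; anything that mutates the store afterwards could still drop the root payload.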

Contributor

Also, something I did a while ago that I hope enables a unified path for this is splitting out the concept of a RawBatchStore, which is basically just "const size storage for a collection of record batches": https://github.com/open-telemetry/otel-arrow/blob/37df685b98ec6e7edc1f0ed84d4d00c72c371bf7/rust/otap-dataflow/crates/pdata/src/otap/raw_batch_store.rs.

With that we could lock down creation of Logs to just the TryFrom<RawLogsStore> implementation and enforce spec compliance at that boundary. Then from_record_messages would collect into a RawLogsStore and then just call the TryFrom implementation. But I'm sure we'll have to touch a lot of callers to lock that down 🙂
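
A minimal sketch of that boundary, assuming a hypothetical two-field `RawLogsStore` (the real raw store is fixed-size storage over all payload types) and an invented `MissingRootPayload` error:

```rust
use std::convert::TryFrom;

/// Placeholder for an Arrow record batch.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RecordBatch(pub &'static str);

/// Hypothetical raw store: plain storage with no invariants enforced.
#[derive(Debug, Default)]
pub struct RawLogsStore {
    pub root: Option<RecordBatch>,
    pub log_attrs: Option<RecordBatch>,
}

/// Validated logs batch. The fields are private so that code outside this
/// module has to go through `TryFrom<RawLogsStore>` to obtain one.
#[derive(Debug)]
pub struct Logs {
    root: RecordBatch,
    log_attrs: Option<RecordBatch>,
}

impl Logs {
    pub fn root(&self) -> &RecordBatch {
        &self.root
    }

    pub fn log_attrs(&self) -> Option<&RecordBatch> {
        self.log_attrs.as_ref()
    }
}

#[derive(Debug, PartialEq, Eq)]
pub struct MissingRootPayload;

impl TryFrom<RawLogsStore> for Logs {
    type Error = MissingRootPayload;

    fn try_from(raw: RawLogsStore) -> Result<Self, Self::Error> {
        // The root payload is required; everything else stays optional.
        let root = raw.root.ok_or(MissingRootPayload)?;
        Ok(Logs {
            root,
            log_attrs: raw.log_attrs,
        })
    }
}
```

With the root stored as a non-optional field, a `Logs` value without a root payload cannot be represented at all, which is the property the central check is after.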

Labels

rust Pull requests that update Rust code

5 participants