feat(grpc): Metadata API #2567

Open
arjan-bal wants to merge 19 commits into hyperium:master from arjan-bal:new-metadata

Conversation

Collaborator

@arjan-bal arjan-bal commented Mar 23, 2026

This PR introduces the MetadataMap struct, which will serve as a replacement for the tonic MetadataMap for gRPC.

Why

gRPC Rust requires a Metadata API that is consistent with the gRPC over HTTP/2 protocol and capable of supporting true binary metadata (gRFC G1) in the future.

While evaluating the feasibility of reusing tonic's MetadataMap, two major limitations were identified:

  1. Binary metadata is base64 encoded prior to insertion.
  2. Looser validation of header keys and values than the gRPC specification dictates.

Fixing these would involve major breaking changes. For more context, see tonic#2487.
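To make the second limitation concrete, here is a minimal sketch of the stricter checks, assuming the grammar from the gRPC over HTTP/2 protocol document (keys limited to digits, lowercase letters, `_`, `-` and `.`; ASCII values limited to bytes 0x20 through 0x7E). The function names are hypothetical and are not taken from this PR.

```rust
/// Header-Name: one or more of 0-9, a-z, '_', '-', '.'.
fn is_valid_key(key: &str) -> bool {
    !key.is_empty()
        && key
            .bytes()
            .all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'z' | b'_' | b'-' | b'.'))
}

/// ASCII-Value: one or more bytes in the range 0x20..=0x7E.
fn is_valid_ascii_value(value: &[u8]) -> bool {
    !value.is_empty() && value.iter().all(|b| (0x20..=0x7e).contains(b))
}

fn main() {
    assert!(is_valid_key("x-trace-id"));
    assert!(!is_valid_key("X-Trace-Id")); // uppercase is rejected
    assert!(is_valid_ascii_value(b"hello world"));
    assert!(!is_valid_ascii_value(b"tab\there")); // control bytes are rejected
}
```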

Solution

The MetadataMap implementation in this change is adapted from Tonic and modified in the following ways:

Core Architectural Changes:

  • Backed by a Vec: The internal http::HeaderMap is replaced with a Vec, inspired by gRPC Java. This makes operations like insertion faster by avoiding hashing overhead. While operations like deletion and get become O(n), they are typically faster in practice for small collections (fewer than 15 items) due to cache locality. Note: This does incur an additional allocation when converting the MetadataMap to an http::HeaderMap.
  • Unencoded Binary Data: All BinaryMetadataValue constructors now accept unencoded data.
  • O(1) Appends: append() no longer returns a bool, which allows it to remain O(1) when using a Vec.
  • Efficient Bulk Deletion: A map.retain operation is added, allowing users to remove multiple keys in a single pass. This helps overcome the O(n) per-deletion cost by allowing users to efficiently filter the map.
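The trade-offs above can be sketched with a toy Vec-backed map. `VecMap` and its methods are hypothetical stand-ins, not the PR's actual `MetadataMap`; the real type stores validated keys and values rather than plain strings and byte vectors.

```rust
/// Hypothetical, simplified stand-in for a Vec-backed metadata map.
#[derive(Default)]
struct VecMap {
    entries: Vec<(String, Vec<u8>)>,
}

impl VecMap {
    /// Amortized O(1): pushes without looking for an existing key.
    fn append(&mut self, key: &str, value: &[u8]) {
        self.entries.push((key.to_owned(), value.to_vec()));
    }

    /// O(n): drops existing values for `key`, then pushes the new one.
    fn insert(&mut self, key: &str, value: &[u8]) {
        self.entries.retain(|(k, _)| k != key);
        self.append(key, value);
    }

    /// O(n) linear scan; returns the first value stored for `key`.
    fn get(&self, key: &str) -> Option<&[u8]> {
        self.entries
            .iter()
            .find(|(k, _)| k == key)
            .map(|(_, v)| v.as_slice())
    }

    /// Single-pass bulk deletion: keeps entries for which `keep` returns true.
    fn retain(&mut self, mut keep: impl FnMut(&str, &[u8]) -> bool) {
        self.entries.retain(|(k, v)| keep(k.as_str(), v.as_slice()));
    }
}

fn main() {
    let mut map = VecMap::default();
    map.append("trace-id", b"abc");
    map.append("trace-id", b"def");
    map.insert("user", b"alice");
    // One pass removes every key with the given prefix.
    map.retain(|k, _| !k.starts_with("trace-"));
    assert_eq!(map.get("user"), Some(&b"alice"[..]));
    assert_eq!(map.get("trace-id"), None);
}
```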

API Removals & Simplifications:

  • Removed Mutable Iterators: The only mutable state in the metadata value is the is_sensitive flag, which most users are not expected to update.
  • Removed Entry API: The Entry types for in-place map insertion were removed. Other gRPC implementations do not support this, and it is not a requested feature.
  • APIs to remove all values for a key: Added the remove_all and remove_all_bin methods. To keep the API simple, these do not return an iterator of the removed values. Doing so would require Vec::extract_if, which introduces an unnameable return type (-> impl Iterator<Item = ...>) and forces users to either explicitly consume the iterator or suppress #[must_use] compiler warnings. Users who need to inspect values before removal can use the retain method instead. Dedicated extract_all and extract_all_bin methods can be added in the future if needed.
  • Removed keys_len(): There is no way to find the number of unique keys faster than O(n) in a vector, so users should rely on an iterator instead.
  • Removed Keys-only and Values-only Iterators: Most use-cases for keys or values can be addressed directly by the standard key-value iterator.
  • Removed ASCII Conversion: The function to convert an ASCII key to an ASCII value is removed, as it relied on an http::HeaderValue function to operate efficiently.
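As a rough illustration of the removal APIs described above, the sketch below works on a plain `Vec` of string pairs. The free functions `remove_all`, `take_all` and `unique_keys` are hypothetical; they only mirror the ideas (no returned iterator, inspect-then-remove through `retain`, counting unique keys through the key-value iterator).

```rust
use std::collections::HashSet;

type Entries = Vec<(String, String)>;

/// Removes every value for `key` in one pass and returns nothing.
fn remove_all(entries: &mut Entries, key: &str) {
    entries.retain(|(k, _)| k != key);
}

/// Inspect-then-remove through `retain`: collects the removed values.
fn take_all(entries: &mut Entries, key: &str) -> Vec<String> {
    let mut removed = Vec::new();
    entries.retain(|(k, v)| {
        if k == key {
            removed.push(v.clone());
            false
        } else {
            true
        }
    });
    removed
}

/// Counts unique keys in O(n) using the ordinary key-value iterator.
fn unique_keys(entries: &Entries) -> usize {
    entries
        .iter()
        .map(|(k, _)| k.as_str())
        .collect::<HashSet<_>>()
        .len()
}

fn main() {
    let mut entries: Entries = vec![
        ("a".into(), "1".into()),
        ("b".into(), "2".into()),
        ("a".into(), "3".into()),
    ];
    assert_eq!(unique_keys(&entries), 2);
    assert_eq!(take_all(&mut entries, "a"), vec!["1", "3"]);
    remove_all(&mut entries, "b");
    assert!(entries.is_empty());
}
```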

Performance

Benchmarks for common operations show that the new vector-backed metadata map performs comparably to Tonic's implementation, particularly excelling at smaller sizes and iteration.

Operation  Size  grpc_metadata_map  tonic_metadata_map
Insert     5     601.69 ns          704.61 ns
Insert     10    1.4833 µs          1.4685 µs
Insert     20    3.2531 µs          2.8948 µs
Append     5     648.26 ns          704.68 ns
Append     10    1.3886 µs          1.4441 µs
Append     20    2.8869 µs          2.8140 µs
Get        5     38.579 ns          62.770 ns
Get        10    135.19 ns          125.78 ns
Get        20    313.42 ns          257.20 ns
Iter       5     39.626 ns          48.951 ns
Iter       10    78.343 ns          96.746 ns
Iter       20    156.01 ns          191.90 ns

@dfawley dfawley self-assigned this Mar 23, 2026
@dfawley dfawley self-requested a review March 23, 2026 23:21
@LucioFranco LucioFranco self-requested a review March 24, 2026 16:47
/// [`MetadataMap`]: crate::metadata::MetadataMap
#[derive(Clone, Eq, PartialEq, Hash)]
#[repr(transparent)]
pub struct MetadataKey<VE: ValueEncoding> {
Collaborator

Isn't it better practice to omit the ValueEncoding constraint here since we aren't using the ValueEncoding type in the struct definition itself?

Collaborator Author

Yes, it is preferred to have the constraints on the impl blocks to avoid repetition everywhere the struct is used. Removed the constraint.
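A small sketch of the difference, using hypothetical simplified types (`Key`, `Ascii`, `Binary` and the `IS_BINARY` constant are illustrative, not the PR's definitions): the struct carries no bound, the constructor's impl block does, and code that merely passes a `Key<VE>` around needs no bound at all.

```rust
use std::marker::PhantomData;

trait ValueEncoding {
    const IS_BINARY: bool;
}

struct Ascii;
struct Binary;

impl ValueEncoding for Ascii {
    const IS_BINARY: bool = false;
}
impl ValueEncoding for Binary {
    const IS_BINARY: bool = true;
}

// No `VE: ValueEncoding` bound on the struct definition itself.
struct Key<VE> {
    name: String,
    _encoding: PhantomData<VE>,
}

// The bound lives on the impl block that actually uses the trait.
impl<VE: ValueEncoding> Key<VE> {
    fn new(name: &str) -> Option<Self> {
        let is_bin = name.ends_with("-bin");
        (is_bin == VE::IS_BINARY).then(|| Key {
            name: name.to_owned(),
            _encoding: PhantomData,
        })
    }
}

// Code that only stores or forwards a `Key<VE>` does not repeat the bound.
fn name_len<VE>(key: &Key<VE>) -> usize {
    key.name.len()
}

fn main() {
    let key = Key::<Binary>::new("trace-bin").expect("valid binary key");
    assert_eq!(name_len(&key), 9);
    assert!(Key::<Ascii>::new("trace-bin").is_none());
}
```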

Collaborator

There are a few other places where you can make the same change -- a couple things that previously needed the ValueEncoding constraint themselves to use MetadataKey, and MetadataValue

Collaborator Author

I've removed the ValueEncoding bound from all structs now.

}

macro_rules! from_integers {
($($name:ident: $t:ident => $max_len:expr),*) => {$(
Collaborator

I don't see where max_len is used?

Collaborator Author

Good catch, this was a copy/paste error. This macro originally comes from the http crate here, where it uses the itoa crate for efficient integer-to-string conversion. Tonic's MetadataMap version of this macro delegates to the HeaderMap implementation, which is why the max_len parameter ends up unused.

For gRPC, however, the value type isn't just a wrapper around http::HeaderValue, so we actually need the original macro implementation from the http crate. I used an inefficient workaround to get the code to compile initially. I've updated this now.
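For readers wondering what a `max_len` parameter is for in a macro like this, here is a minimal sketch under the assumption that it serves as a buffer capacity hint (the maximum number of decimal characters for each integer type). `AsciiValue` is a hypothetical stand-in, and the sketch uses standard formatting rather than the itoa crate.

```rust
use std::fmt::Write;

/// Hypothetical stand-in for an ASCII metadata value.
#[derive(Debug, PartialEq)]
struct AsciiValue(String);

macro_rules! from_integers {
    ($($t:ident => $max_len:expr),* $(,)?) => {$(
        impl From<$t> for AsciiValue {
            fn from(num: $t) -> AsciiValue {
                // `$max_len` pre-sizes the buffer so formatting never reallocates.
                let mut buf = String::with_capacity($max_len);
                // Digits and '-' are always valid visible ASCII.
                write!(buf, "{}", num).expect("writing to a String cannot fail");
                AsciiValue(buf)
            }
        }
    )*};
}

// Maximum decimal widths, including the sign for signed types.
from_integers!(u16 => 5, i16 => 6, u32 => 10, i32 => 11, u64 => 20, i64 => 20);

fn main() {
    assert_eq!(AsciiValue::from(42u16), AsciiValue("42".to_owned()));
    assert_eq!(AsciiValue::from(i64::MIN).0.len(), 20);
}
```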

Collaborator

Does this change any of your benchmark numbers?

Collaborator Author

It shouldn't since the benchmarks were not using integer values.

@dfawley dfawley assigned arjan-bal and unassigned dfawley Mar 26, 2026
@arjan-bal arjan-bal assigned dfawley and unassigned arjan-bal Mar 30, 2026
@dfawley dfawley assigned arjan-bal and unassigned dfawley Mar 30, 2026
Collaborator

@dfawley dfawley left a comment

LGTM modulo the ValueEncoding constraint changes.

@LucioFranco
Member

@arjan-bal any idea if we could just deprecate the current tonic one and replace it with this? Would there be any challenges there?

pub struct MetadataValue<VE: ValueEncoding> {
    // Note: There are unsafe transmutes that assume that the memory layout
    // of MetadataValue is identical to UnencodedHeaderValue.
    pub(crate) inner: UnencodedHeaderValue,
Member

@LucioFranco LucioFranco Mar 31, 2026

I think there has to be a better way to protect this invariant at a submodule boundary than guarding it with a comment. To me the smell here is 1) a comment protecting an invariant, 2) compounded by an item that is pub(crate), so this invariant must hold beyond this module boundary.

Collaborator Author

The #[repr(transparent)] attribute ensures at compile time that the struct has at most one non-zero-sized field, and guarantees that the struct has the same layout as that field.

I have replaced the pub(crate) field with a pub(crate) method. I looked into safer alternatives for casting &T to a newtype &Wrapper(T), but relying on repr(transparent) alongside the unsafe block appears to be the standard and idiomatic approach in Rust.
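The cast being discussed can be sketched as follows. `Raw` and `Value` are hypothetical stand-ins for `UnencodedHeaderValue` and `MetadataValue`; the sketch only covers the layout half of the invariant, not whether the bytes match the claimed encoding, which the caller still has to guarantee.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for the untyped header value.
struct Raw {
    data: Vec<u8>,
}

/// Newtype over `Raw`, tagged with an encoding marker type.
#[repr(transparent)]
struct Value<VE> {
    inner: Raw,
    _encoding: PhantomData<VE>,
}

impl<VE> Value<VE> {
    /// Reinterprets `&Raw` as `&Value<VE>` without copying.
    fn from_raw_ref(raw: &Raw) -> &Self {
        // SAFETY: `Value<VE>` is `#[repr(transparent)]` and `Raw` is its only
        // non-zero-sized field (`PhantomData` is a zero-sized, align-1 type),
        // so both types have identical layout and the reference stays valid
        // for the same lifetime.
        unsafe { &*(raw as *const Raw as *const Self) }
    }

    /// Access goes through a method, so the field itself can stay private.
    fn as_bytes(&self) -> &[u8] {
        &self.inner.data
    }
}

fn main() {
    let raw = Raw { data: b"abc".to_vec() };
    let value: &Value<()> = Value::from_raw_ref(&raw);
    assert_eq!(value.as_bytes(), b"abc");
}
```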


#[derive(Clone, PartialEq, Eq)]
pub struct UnencodedHeaderValue {
    pub(crate) data: Bytes,
Member

I would like us to use less pub(crate) and expose stuff via functions that are pub(crate) rather than struct items.

Collaborator Author

Replaced with a pub(crate) method.


impl fmt::Debug for UnencodedHeaderValue {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let mut hv = unsafe { HeaderValue::from_maybe_shared_unchecked(self.data.clone()) };
Member

Needs a safety comment on why this is safe to do

Collaborator Author

On second thought, this approach isn't completely safe. An UnencodedHeaderValue might contain raw binary data that isn't a valid http::HeaderValue, which will cause a panic during tests. To resolve this, I've replaced the unsafe HeaderValue parsing with a custom impl Debug copied directly from http::HeaderValue.
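A safe Debug implementation along these lines might look like the sketch below. It is written from memory of how http::HeaderValue formats values (quotes escaped, non-visible bytes printed as `\x..`), and `RawValue` with its fields is a hypothetical stand-in, so treat it as an approximation rather than the code in this PR.

```rust
use std::fmt;

/// Hypothetical stand-in for an unencoded header value.
struct RawValue {
    data: Vec<u8>,
    is_sensitive: bool,
}

impl fmt::Debug for RawValue {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        if self.is_sensitive {
            // Never print the contents of sensitive values.
            return f.write_str("Sensitive");
        }
        f.write_str("\"")?;
        for &b in &self.data {
            match b {
                // Escape quotes so the output stays unambiguous.
                b'"' => f.write_str("\\\"")?,
                // Visible ASCII and tab are printed as-is.
                0x20..=0x7e | b'\t' => write!(f, "{}", b as char)?,
                // Everything else is hex-escaped, so arbitrary binary is fine.
                _ => write!(f, "\\x{:x}", b)?,
            }
        }
        f.write_str("\"")
    }
}

fn main() {
    let value = RawValue { data: vec![b'o', b'k', 0xff], is_sensitive: false };
    assert_eq!(format!("{value:?}"), "\"ok\\xff\"");
}
```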

/// correct `Ascii` or `Binary` value encoding.
    #[inline]
    pub(crate) fn unchecked_from_header_value_ref(header_value: &UnencodedHeaderValue) -> &Self {
        unsafe { &*(header_value as *const UnencodedHeaderValue as *const Self) }
Member

I don't know, it's been a while, but I think there are safer ways to do transmutes like this?

Collaborator Author

I looked into safer alternatives for casting &T to a newtype &Wrapper(T), but relying on repr(transparent) alongside the unsafe block appears to be the standard and idiomatic approach in Rust.

}
}

#[test]
Member

These tests should be in a cfg(test) module so that they don't compile in during regular builds.

Collaborator Author

Nice catch! Fixed it. I went back and checked the tonic metadata, and it has the same problem:

#[test]
fn test_debug() {
    let cases = &[
        ("hello", "\"hello\""),
        ("hello \"world\"", "\"hello \\\"world\\\"\""),
        ("\u{7FFF}hello", "\"\\xe7\\xbf\\xbfhello\""),
    ];
    for &(value, expected) in cases {
        let val = AsciiMetadataValue::try_from(value.as_bytes()).unwrap();
        let actual = format!("{val:?}");
        assert_eq!(expected, actual);
    }
    let mut sensitive = AsciiMetadataValue::from_static("password");
    sensitive.set_sensitive(true);
    assert_eq!("Sensitive", format!("{sensitive:?}"));
}
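For comparison, the conventional layout is sketched below with a hypothetical helper, `is_binary_key`. Wrapping the tests in a `#[cfg(test)]` module also gates any test-only imports and helper functions, which a bare `#[test]` function at module level does not.

```rust
/// Hypothetical helper under test.
pub fn is_binary_key(key: &str) -> bool {
    key.ends_with("-bin")
}

// Everything in this module is compiled only when running `cargo test`.
#[cfg(test)]
mod key_tests {
    use super::*;

    #[test]
    fn detects_binary_suffix() {
        assert!(is_binary_key("trace-bin"));
        assert!(!is_binary_key("trace"));
    }
}

fn main() {
    assert!(is_binary_key("payload-bin"));
}
```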

self.headers.retain(|(name, value)| {
    let key_and_value = if !name.as_str().ends_with("-bin") {
        KeyAndValueRef::Ascii(
            MetadataKey::unchecked_from_header_name_ref(name),
Member

IIRC these call unsafe code, but the input invariant is not upheld here. That is, I would expect this block to also be unsafe, but it's not clear. This could let me violate the invariant by passing in something wrong, which is footgun-prone.

Collaborator Author

Key/value validation occurs before insertion into the map. Because this loop iterates over entries already present in the map, I believe we can safely assume the values are valid. While users can bypass validation by passing invalid values through the unsafe MetadataValue::from_shared_unchecked function, doing so is expected to result in undefined behavior.

For reference, tonic's MetadataMap implements very similar code here:

fn next(&mut self) -> Option<Self::Item> {
    self.inner.next().map(|item| {
        let (name, value) = item;
        if Ascii::is_valid_key(name.as_str()) {
            KeyAndValueRef::Ascii(
                MetadataKey::unchecked_from_header_name_ref(name),
                MetadataValue::unchecked_from_header_value_ref(value),
            )
        } else {
            KeyAndValueRef::Binary(
                MetadataKey::unchecked_from_header_name_ref(name),
                MetadataValue::unchecked_from_header_value_ref(value),
            )
        }
    })
}
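The argument that iteration can rely on validation done at insertion can be sketched with a toy map. All names here (`md::Map`, `append_ascii`, `append_bin`) are hypothetical; the point is that the entries are private, every write path validates, and the unchecked conversion during iteration documents that dependency in a SAFETY comment.

```rust
mod md {
    #[derive(Debug, PartialEq)]
    pub enum KeyAndValueRef<'a> {
        Ascii(&'a str, &'a str),
        Binary(&'a str, &'a [u8]),
    }

    #[derive(Default)]
    pub struct Map {
        // Private: entries can only be added through the validating methods.
        entries: Vec<(String, Vec<u8>)>,
    }

    impl Map {
        pub fn append_ascii(&mut self, key: &str, value: &str) -> Result<(), &'static str> {
            if key.ends_with("-bin") {
                return Err("ASCII keys must not end with -bin");
            }
            if !value.bytes().all(|b| (0x20..=0x7e).contains(&b)) {
                return Err("ASCII values must be printable ASCII");
            }
            self.entries.push((key.to_owned(), value.as_bytes().to_vec()));
            Ok(())
        }

        pub fn append_bin(&mut self, key: &str, value: &[u8]) -> Result<(), &'static str> {
            if !key.ends_with("-bin") {
                return Err("binary keys must end with -bin");
            }
            self.entries.push((key.to_owned(), value.to_vec()));
            Ok(())
        }

        pub fn iter(&self) -> impl Iterator<Item = KeyAndValueRef<'_>> {
            self.entries.iter().map(|(k, v)| {
                if k.ends_with("-bin") {
                    KeyAndValueRef::Binary(k.as_str(), v.as_slice())
                } else {
                    // SAFETY: entries without the "-bin" suffix can only be
                    // added by `append_ascii`, which checked that every byte
                    // is printable ASCII and therefore valid UTF-8.
                    KeyAndValueRef::Ascii(k.as_str(), unsafe {
                        std::str::from_utf8_unchecked(v)
                    })
                }
            })
        }
    }
}

fn main() {
    let mut map = md::Map::default();
    map.append_ascii("user", "alice").unwrap();
    map.append_bin("token-bin", &[0xff, 0x00]).unwrap();
    let items: Vec<_> = map.iter().collect();
    assert_eq!(items[0], md::KeyAndValueRef::Ascii("user", "alice"));
    assert_eq!(items[1], md::KeyAndValueRef::Binary("token-bin", &[0xff, 0x00]));
}
```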

}

impl<VE: ValueEncoding> AsMetadataKey<VE> for &MetadataKey<VE> {
#[doc(hidden)]
Member

why doc hidden and not sealed? I think there are also options to do kinda like a layered trait approach where you expose a public trait but implementations end up in a private internal trait

Collaborator Author

I have switched to the sealed trait pattern from Predrag's blog. I previously used the approach of defining methods on a private supertrait, but I realized it has a significant flaw.

Normally, calling a trait method requires the trait to be explicitly in scope. However, generic trait bounds implicitly bring the methods of all supertraits into scope for that generic type. This loophole allows downstream users to bypass module privacy and invoke internal methods. Here is a Playground link demonstrating how an external user can still access the add method using this trick.

I actually discovered this behavior while looking at tonic's MetadataMap. There is a call to a sealed method that intuitively shouldn't be possible, but the compiler permits it because of this exact generic trait bound loophole:

pub fn to_bytes(&self) -> Result<Bytes, InvalidMetadataValueBytes> {
    VE::decode(self.inner.as_bytes())
}
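One way to seal an individual method so that a generic bound does not expose it is to require an argument whose type cannot be named outside the defining module. The sketch below assumes this token-argument variant of the sealed trait pattern; the names `AsKey`, `key_bytes`, `Token` and `key_len` are hypothetical and may not match what the PR ended up using.

```rust
mod metadata {
    mod private {
        /// Token type: public, but unreachable from outside `metadata`.
        pub struct Token;
        /// Supertrait that prevents downstream implementations.
        pub trait Sealed {}
    }

    pub trait AsKey: private::Sealed {
        /// Internal method: callers must pass a `Token`, which code outside
        /// this module cannot construct or name, even with `K: AsKey` in scope.
        #[doc(hidden)]
        fn key_bytes(&self, _: private::Token) -> &[u8];
    }

    impl private::Sealed for &str {}

    impl AsKey for &str {
        fn key_bytes(&self, _: private::Token) -> &[u8] {
            self.as_bytes()
        }
    }

    /// Public entry point that uses the sealed method internally.
    pub fn key_len<K: AsKey>(key: K) -> usize {
        key.key_bytes(private::Token).len()
    }
}

fn main() {
    // Outside `metadata`, `"user-id".key_bytes(..)` cannot be called because
    // there is no way to produce a `private::Token` value.
    assert_eq!(metadata::key_len("user-id"), 7);
}
```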

Member

@LucioFranco LucioFranco left a comment

Overall looking really good. I think we need to address the safety hygiene or else this code can become very footgunny. Lmk if you want to work through that together.

@arjan-bal
Collaborator Author

> @arjan-bal any idea if we could just deprecate the current tonic one and replace it with this? Would there be any challenges there?

From my message on Discord:

I can think of two good approaches:

  1. Two independent structs
    We implement From/Into for converting Tonic MD <-> gRPC MD
    Pros: Decouples the structs. We can put the adapters in a separate crate (e.g., tonic-grpc-compat) to avoid circular dependencies.
    Cons: Slightly inefficient for binary metadata due to an extra base64 encode/decode, though this overhead disappears once the user fully migrates to gRPC APIs.

  2. Single Struct
    We change the Tonic MD in-place to store both the encoded and raw binary metadata side-by-side.
    Pros: Supports both the existing Tonic APIs (yielding refs to the encoded data) and the new gRPC APIs (yielding refs to the raw data) simultaneously.
    Migration path: We deprecate the APIs gRPC doesn't want to support. Once removed, we update the metadata to store only raw, unencoded binary data.
    Cons: Higher implementation complexity.

As discussed, I've added From/Into implementations for converting between Tonic and gRPC types in this PR.
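A rough sketch of what the first approach implies, showing where the extra base64 step comes from. `GrpcMd` and `TonicMd` are hypothetical simplified types, and the hand-rolled padded encoder exists only to keep the example free of external crates; the real conversions in this PR operate on the actual map types.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Minimal padded base64 encoder, used only for illustration.
fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        let b0 = u32::from(chunk[0]);
        let b1 = u32::from(chunk.get(1).copied().unwrap_or(0));
        let b2 = u32::from(chunk.get(2).copied().unwrap_or(0));
        let n = (b0 << 16) | (b1 << 8) | b2;
        let sextet = |shift: u32| ALPHABET[((n >> shift) & 63) as usize] as char;
        out.push(sextet(18));
        out.push(sextet(12));
        out.push(if chunk.len() > 1 { sextet(6) } else { '=' });
        out.push(if chunk.len() > 2 { sextet(0) } else { '=' });
    }
    out
}

/// Hypothetical gRPC-style map: binary values are stored unencoded.
struct GrpcMd {
    entries: Vec<(String, Vec<u8>)>,
}

/// Hypothetical Tonic-style map: every value is a header-safe string.
struct TonicMd {
    entries: Vec<(String, String)>,
}

impl From<GrpcMd> for TonicMd {
    fn from(md: GrpcMd) -> Self {
        let entries = md
            .entries
            .into_iter()
            .map(|(key, value)| {
                let encoded = if key.ends_with("-bin") {
                    // The extra encode step mentioned under "Cons".
                    base64_encode(&value)
                } else {
                    // ASCII values are assumed to be validated on insertion.
                    String::from_utf8_lossy(&value).into_owned()
                };
                (key, encoded)
            })
            .collect();
        TonicMd { entries }
    }
}

fn main() {
    let grpc = GrpcMd {
        entries: vec![
            ("token-bin".to_owned(), vec![0, 1, 2]),
            ("user".to_owned(), b"alice".to_vec()),
        ],
    };
    let tonic: TonicMd = grpc.into();
    assert_eq!(tonic.entries[0].1, "AAEC");
    assert_eq!(tonic.entries[1].1, "alice");
}
```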

@arjan-bal
Collaborator Author

#2585 should fix the failing check tests.

@arjan-bal arjan-bal assigned LucioFranco and unassigned arjan-bal Apr 8, 2026
@arjan-bal arjan-bal requested a review from LucioFranco April 8, 2026 17:48