Conversation
grpc/src/metadata/key.rs
Outdated
| /// [`MetadataMap`]: crate::metadata::MetadataMap | ||
| #[derive(Clone, Eq, PartialEq, Hash)] | ||
| #[repr(transparent)] | ||
| pub struct MetadataKey<VE: ValueEncoding> { |
There was a problem hiding this comment.
Isn't it better practice to omit the ValueEncoding constraint here since we aren't using the ValueEncoding type in the struct definition itself?
There was a problem hiding this comment.
Yes, it is preferred to have the constraints on the impl blocks to avoid repetition everywhere the struct is used. Removed the constraint.
There was a problem hiding this comment.
There are a few other places where you can make the same change -- a couple things that previously needed the ValueEncoding constraint themselves to use MetadataKey, and MetadataValue
There was a problem hiding this comment.
I've removed the ValueEncoding bound from all structs now.
| } | ||
|
|
||
| macro_rules! from_integers { | ||
| ($($name:ident: $t:ident => $max_len:expr),*) => {$( |
There was a problem hiding this comment.
I don't see where max_len is used?
There was a problem hiding this comment.
Good catch, this was a copy/paste error. This macro originally comes from the http crate here, where it uses the itoa crate for efficient integer-to-string conversion. Tonic's MetadataMap version of this macro delegates to the HeaderMap implementation, which is why the max_len parameter ends up unused.
For gRPC, however, the value type isn't just a wrapper around http::HeaderValue, so we actually need the original macro implementation from the http crate. I used an inefficient workaround to get the code to compile initially. I've updated this now.
There was a problem hiding this comment.
Does this change any of your benchmark numbers?
There was a problem hiding this comment.
It shouldn't since the benchmarks were not using integer values.
Co-authored-by: Doug Fawley <dfawley@google.com>
dfawley
left a comment
There was a problem hiding this comment.
LGTM modulo the ValueEncoding constraint changes.
|
@arjan-bal any idea if we could just deprecate the current tonic one and replace it with this? Would there be any challenges there? |
grpc/src/metadata/value.rs
Outdated
| pub struct MetadataValue<VE: ValueEncoding> { | ||
| // Note: There are unsafe transmutes that assume that the memory layout | ||
| // of MetadataValue is identical to UnencodedHeaderValue. | ||
| pub(crate) inner: UnencodedHeaderValue, |
There was a problem hiding this comment.
I think there has to be a better way to protect this invariant at a sub module boundary rather than a comment to guard this? To me the smell here is 1) comment protecting an invariant and 2) compounded with a item that is pub(crate) so this invariant must be held true beyond this module boundary.
There was a problem hiding this comment.
The #[repr(transparent)] macro ensures at compile time that the struct has exactly one non-zero-sized field.
I have replaced the pub(crate) field with a pub(crate) method. I looked into safer alternatives for casting &T to a newtype &Wrapper(T), but relying on repr(transparent) alongside the unsafe block appears to be the standard and idiomatic approach in Rust.
grpc/src/metadata/value.rs
Outdated
|
|
||
| #[derive(Clone, PartialEq, Eq)] | ||
| pub struct UnencodedHeaderValue { | ||
| pub(crate) data: Bytes, |
There was a problem hiding this comment.
I would like us to use less pub(crate) and expose stuff via functions that are pub(crate) rather than struct items.
There was a problem hiding this comment.
Replaced with a pub(crate) method.
grpc/src/metadata/value.rs
Outdated
|
|
||
| impl fmt::Debug for UnencodedHeaderValue { | ||
| fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { | ||
| let mut hv = unsafe { HeaderValue::from_maybe_shared_unchecked(self.data.clone()) }; |
There was a problem hiding this comment.
Needs a safety comment on why this is safe to do
There was a problem hiding this comment.
On second thought, this approach isn't completely safe. An UnencodedHeaderValue might contain raw binary data that isn't a valid http::HeaderValue, which will cause a panic during tests. To resolve this, I've replaced the unsafe HeaderValue parsing with a custom impl Debug copied directly from http::HeaderValue.
| /// correct `Ascii` or `Binary` value encoding. | ||
| #[inline] | ||
| pub(crate) fn unchecked_from_header_value_ref(header_value: &UnencodedHeaderValue) -> &Self { | ||
| unsafe { &*(header_value as *const UnencodedHeaderValue as *const Self) } |
There was a problem hiding this comment.
I dont know its been a while but I think there are safer ways to do transmutes like this?
There was a problem hiding this comment.
I looked into safer alternatives for casting &T to a newtype &Wrapper(T), but relying on repr(transparent) alongside the unsafe block appears to be the standard and idiomatic approach in Rust.
grpc/src/metadata/value.rs
Outdated
| } | ||
| } | ||
|
|
||
| #[test] |
There was a problem hiding this comment.
These tests should be in a cfg(test) module so that they dont compile in during regular builds
There was a problem hiding this comment.
Nice catch! Fixed it. I went back and checked the tonic metadata, and it has the same problem:
tonic/tonic/src/metadata/value.rs
Lines 790 to 807 in c1f07e7
| self.headers.retain(|(name, value)| { | ||
| let key_and_value = if !name.as_str().ends_with("-bin") { | ||
| KeyAndValueRef::Ascii( | ||
| MetadataKey::unchecked_from_header_name_ref(name), |
There was a problem hiding this comment.
iirc these call unsafe but the input invariant is not held up here. Aka I would expect this block to also be unsafe but its not clear. This could let me void the invariant by passing in something wrong which is footgun prone.
There was a problem hiding this comment.
Key/value validation occurs before insertion into the map. Because this loop iterates over entries already present in the map, I believe we can safely assume the values are valid. While users can bypass validation by passing invalid values through the unsafe MetadataValue::from_shared_unchecked function, doing so is expected to result in undefined behavior.
For reference, tonic's MetadataMap implements a very code here:
tonic/tonic/src/metadata/map.rs
Lines 1237 to 1252 in c1f07e7
| } | ||
|
|
||
| impl<VE: ValueEncoding> AsMetadataKey<VE> for &MetadataKey<VE> { | ||
| #[doc(hidden)] |
There was a problem hiding this comment.
why doc hidden and not sealed? I think there are also options to do kinda like a layered trait approach where you expose a public trait but implementations end up in a private internal trait
There was a problem hiding this comment.
I have switched to the sealed trait pattern from Predrag's blog. I previously used the approach of defining methods on a private supertrait, but I realized it has a significant flaw.
Normally, calling a trait method requires the trait to be explicitly in scope. However, generic trait bounds implicitly bring the methods of all supertraits into scope for that generic type. This loophole allows downstream users to bypass module privacy and invoke internal methods. Here is a Playground link demonstrating how an external user can still access the add method using this trick.
I actually discovered this behavior while looking at tonic's MetadataMap. There is a call to a sealed method that intuitively shouldn't be possible, but the compiler permits it because of this exact generic trait bound loophole:
tonic/tonic/src/metadata/value.rs
Lines 131 to 133 in c1f07e7
LucioFranco
left a comment
There was a problem hiding this comment.
Overall looking really good. I think we need to address the safety hygiene or else this code can become very footgunny. Lmk if you want to work through that together.
From my message on Discord:
As discussed, I've added |
|
#2585 should fix the failing |
This PR introduces the
MetadataMapstruct, which will serve as a replacement for the tonic MetadataMap for gRPC.Why
gRPC Rust requires a Metadata API that is consistent with the gRPC over HTTP/2 protocol and capable of supporting true binary metadata (gRFC G1) in the future.
While evaluating the feasibility of reusing tonic's
MetadataMap, two major limitations were identified:Fixing these would involve major breaking changes. For more context, see tonic#2487.
Solution
The
MetadataMapimplementation in this change is adapted from Tonic and modified in the following ways:Core Architectural Changes:
Vec: The internalhttp::HeaderMapis replaced with aVec, inspired by gRPC Java. This makes operations like insertion faster by avoiding hashing overhead. While operations like deletion andgetbecome O(n), they are typically faster in practice for small collections (fewer than 15 items) due to cache locality. Note: This does incur an additional allocation when converting theMetadataMapto anhttp::HeaderMap.BinaryMetadataValueconstructors now accept unencoded data.append()no longer returns abool, which allows it to remain O(1) when using aVec.map.retainoperation is added, allowing users to remove multiple keys in a single pass. This helps overcome the O(n) per-deletion cost by allowing users to efficiently filter the map.API Removals & Simplifications:
is_sensitiveflag, which most users are not expected to update.remove_allandremove_all_binmethods. To keep the API simple, these do not return an iterator of the removed values. Doing so would requireVec::extract_if, which introduces an unnameable return type (-> impl Iterator<Item = ...>) and forces users to either explicitly consume the iterator or suppress#[must_use]compiler warnings. Users who need to inspect values before removal can use theretainmethod instead. Dedicatedextract_allandextract_all_binmethods can be added in the future if needed.keys_len(): There is no way to find the number of unique keys faster than O(n) in a vector, so users should rely on an iterator instead.http::HeaderValuefunction to operate efficiently.Performance
Benchmarks for common operations show that the new vector-backed metadata map performs comparably to Tonic's implementation, particularly excelling at smaller sizes and iteration.
grpc_metadata_maptonic_metadata_map