Make HTTP a hidden implementation detail, don't expose it in the public API #90
Comments
Ping @carllerche, tower-grpc maintainer, and @jtattermusch, creator of the C# gRPC, who might be able to provide insight and/or say I'm wrong. |
We should survey what other libs (Go, C++, C#, ...) do in this case. If none of the primary libs provide access to the HTTP types, then it seems fine to encapsulate. That said, the decision to encapsulate HTTP is separate from providing an API for metadata. We should wrap the HeaderMap with a MetadataMap and provide metadata read / write APIs. Thoughts? |
Encapsulating HeaderMap in a MetadataMap sounds like a reasonable first step. For that we can provide means to convert to/from a HeaderMap. For tower_grpc::Response it may not make sense to store both the current http::Response and a MetadataMap because then those two objects can have conflicting values for headers. So there it's probably better to break it down and store a MetadataMap and whatever http::Response contains except headers. |
tower_grpc::Response does not need to contain the http::Response, it can stick to what it needs. |
Did some research on other gRPC libraries: For grpc-core (C/C++) I'm quite sure that no underlying HTTP types are exposed. I have used it myself and know that API fairly well. For the other ones I read the API docs and could not find any explicit exposure of HTTP except for when servers and clients are fist instantiated. Java specifically can be used with either Netty, okhttp or an in-memory backend so it really can't expose the underlying HTTP library to the user since there isn't a single one. Go has some methods with "Header" in their names but the docs describe them as means of getting custom metadata, not HTTP headers directly. Details: Go gRPC: Server side: Handlers get context.Context and the protobuf object. The Context has:
The API docs for Go gRPC https://godoc.org/google.golang.org/grpc does not mention HTTP2 except for:
C# gRPC API docs are at https://grpc.io/grpc/csharp/api/Grpc.Core.html Server side: Handler methods get the protobuf object and a ServerCallContext ServerCallContext https://grpc.io/grpc/csharp/api/Grpc.Core.ServerCallContext.html has:
So there are HTTP related things but I cannot find any direct HTTP references in the docs there. Java gRPC docs are at https://grpc.io/grpc-java/javadoc/ Server side: Handlers get the request proto message and an object to reply with. gRPC Java has multiple transports implemented: The main Netty based one, an Android client-only implementation based on okhttp and io.grpc.inprocess which is notable because it doesn’t use HTTP. |
Ok, I would be fine hiding To avoid having to dup the map, I would have: struct MetadataMap {
headers: http::HeaderMap,
} Then do the metadata conversion on the fly. |
Wow, thanks for the in-depth report, awesome! I also think it's fine to start to hide the HTTPness here. An alternative (which I'm not saying is better, just that we should consider) to having a |
@seanmonstar it seems like it would be useful to be able to pass the metadata map around, if it is on |
I see now that the HeaderMap interface is pretty big. It takes quite a bit of wrapping to wrap all the things, including the iters and all other little details. I agree with @carllerche in that I think there is merit in being able to pass around a MetadataMap without also passing around the I've started working on an implementation based on struct MetadataMap {
headers: http::HeaderMap,
} |
@per-gron It's totally fine to start incrementally and not expose the entire interface initially. |
|
It looks like the variants are:
A possible API: have |
@seanmonstar Fyi non-"-bin" fields can have non-ASCII too. The gRPC language interop test suite tests support for sending UTF-8 there, and Here is a potential approach for handling "-bin" fields:
Another option is to have two separate sets of accessor methods with zero overlap. Then, to enumerate all metadata the user needs to enumerate string and binary entries separately. A variant of this option is to also have separate A third option which is more similar to what other gRPC libs to but that maybe feels un-Rusty is to just ditch type safety and allow all entries to contain binary data and then fail when converting it to a HTTP HeaderMap object. @seanmonstar @carllerche do you have a preference for what to do here? To me it seems like the second option of segregating the map into effectively two separate parts is a nice way to go because that is probably how people will use it in practice anyway. |
@per-gron I am kind of leaning towards just doing runtime checks. Something like: pub struct MetadataValue {
is_bin: bool,
data: HeaderValue,
} Then have separate constructors: impl MetadataValue {
fn string(s: &str) -> MetadataValue { }
fn bytes(s: &[u8]) -> MetadataValue { }
fn as_str(&self) -> Option<&str> { }
} Some questions: If string values support UTF-8 values, how are they encoded in headers? Perhaps the binary repr is a valid header value? Can there be both a string and -bin entry for the same name? For example, can the metadata map contain both "foo" and "foo-bin" keys? |
string metadata values support all characters that HTTP headers can have, which I think is something like bytes >32 and != 127. gRPC doesn't encode them, they are just raw HTTP headers. As far as I can tell, nothing prevents having "foo" and "foo-bin". Other gRPC libraries require users to explicitly call the metadata keys "-bin" or non-"-bin". |
Hi there, I'm the maintainer of gRPC C# and contributor for other gRPC languages. I just wanted to confirm what @per-gron is saying:
to summarize:
btw, if there's a draft for the tower-grpc public API, gRPC team can help reviewing it. |
@jtattermusch Thanks for the thoughts, that makes sense. As per an API draft, tower-grpc's API is experimental. At some point (soon hopefully), the API will be nailed down a bit more. I will ping you at that point 👍 |
As per the discussion in tower-rs#90, it wraps http::HeaderMap to abstract away the fact that tower-grpc is HTTP based. At this time this class is a pure wrapper that does not actually implement any of the gRPC metadata specific stuff (such as binary fields), that will come later.
Phew, the metadata map interface turned out to be a lot bigger than I expected. I uploaded the PR hyperium/http#278 against hyperium http to be able to fully wrap it. I also uploaded #92 as a large step towards hiding HTTP types from Tower's public interface. This is an important step towards being able to handle binary metadata fields properly, although it's just a refactoring so far. |
@jtattermusch can you provide links to docs or something about gRPC specific upcasing/downcasing rules? I can't find much about it. tower-grpc currently uses hyperium http's header handling which is very careful about http2 conformance, including case sensitivity. |
@per-gron per the gRPC spec, all metadata keys need to be lower-case only: https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md (see "Header-Name"). There rest of the behavior comes from http2 spec: see summary here: https://blog.yaakov.online/http-2-header-casing/ (all header names must be converted to lower cases before sent on the wire) |
@jtattermusch thanks for the info! Hyperium already enforces that all header names are lower case, so I think we're good on that point. |
@carllerche Working on making MetadataMap actually automatically encode binary entries. At the moment I'm hacking on a type system based approach rather than your suggested runtime check approach, because 1) with the unsafe transmutes I am not able to store that additional flag with |
@per-gron I see what you mean. Could you post a sketch of the type system strategy? I would hate for you to spent a ton of time on a path and we opt to go a different route. Also, what is more common? String values or binary values? |
@carllerche I don't know if string or binary values are more common. My guess is that string values are more common and binary only used when you need to store richer data, in practice protobuf messages. I have a wip branch at https://github.com/per-gron/tower-grpc/blob/request-metadata-bin/src/metadata_map.rs in case you want to see the gory details but perhaps that can wait until you actually review that upcoming PR. Quick summary of the approach is: Have a trait To get this, you have There's a lot of details in the implementation of this approach that I'm a bit unsure about. For example, I don't really like that |
@per-gron Thanks for the summary. I will have to spend some time thinking about the problem before being able to provide additional thoughts... Regarding |
* Rename Request::headers to Request::metadata * Introduce tower_grpc::metadata::MetadataMap type As per the discussion in #90, it wraps http::HeaderMap to abstract away the fact that tower-grpc is HTTP based. At this time this class is a pure wrapper that does not actually implement any of the gRPC metadata specific stuff (such as binary fields), that will come later. * Address some review comments * Use unsafe transmutes to make MetadataMap <=> HeaderMap conversion faster * Address review comments * Make the interop tests build * Say why some code is commented out * Use unsafe transmutes for returning &MetadataKey from MetadataMap This makes the API more self-symmetrical and more similar to HeaderMap. * Use newly exported IterMut and ValuesMut from http to resolve TODOs * Add #[repr(transparent)] annotations to types that rely on that behavior
As per the discussion in tower-rs#90, it wraps http::HeaderMap to abstract away the fact that tower-grpc is HTTP based. At this time this class is a pure wrapper that does not actually implement any of the gRPC metadata specific stuff (such as binary fields), that will come later.
As per discussion in tower-rs#90. This commit also fixes a bug that made tower-grpc provide the status code in the error message field.
* Don't expose HTTP HeaderMap in tower_grpc::Error enum As per discussion in #90. This commit also fixes a bug that made tower-grpc provide the status code in the error message field. * Perform percent decoding on error messages As per gRPC spec. This commit also adds the special_status_message client interop test to validate this behavior.
@seanmonstar I think all the changes proposed in this ticket are done by now. Please reopen if I'm missing something. |
Right now, tower-grpc exposes to its users that it uses HTTP in a bunch of places. An example is the
tower_grpc::Error::Grpc
enum variant, which exposes thehttp::HeaderMap
type from hyperium/http. Another example is thetower_grpc::Response
type, which hasinto_http
andfrom_http
methods to convert to and fromhttp::Response
values.I think the tower-grpc API should be changed so that no HTTP specific types are visible to its users. There are both philosophical, practical and experience based reasons for this:
Reasons
Philosophical argument
gRPC is designed as a protocol that is agnostic to its underlying transport. The gRPC over HTTP2 specification document begins with "This document serves as a detailed description for an implementation of gRPC carried over HTTP2 framing."
It can be argued that it's unlikely that users of tower-grpc are going to ever want any other underlying transport, but we don't really know that. In addition, and this is perhaps the stronger reason, gRPC is a useful and easy to reason about subset of HTTP2, which on its own is very complex. By exposing raw HTTP2 constructs to users of tower-grpc, users are unnecessarily exposed to complexities of HTTP2, for little to no benefit.
Practical argument
The direct exposure of HTTP primitives, at least without helpers which don't exist today, makes it difficult or impossible to properly implement some of the gRPC language interop tests.
special_status_message
is about sending an error response with Unicode text, which gRPC over HTTP2 requires to be percent encoded. With tower-grpc's current design, this percent encoding has to be done by end user code, which is very error prone and not necessary. All other gRPC libraries handle this automatically. Even without the complication of the percent encoding, requiring the user to know that you should check the "grpc-message" HTTP header to get the error message is a very unintuitive API.Similarly, for the
custom_metadata
interop test, the user of the API has to manually do base64 encoding and decoding of custom metadata fields with a "-bin" suffix, which gRPC over HTTP2 requires that the gRPC library does for the user. In addition, tower-grpc does currently not support trailing custom metadata at all because thetower_grpc::Response
is just a wrapper aroundhttp::Response
, which happens to not store trailers (the http library seems to make it possible to read them it's just not stored in that struct).Placing the burden of percent encoding error messages, base64 encoding some headers, and even just requiring the knowledge that gRPC custom metadata is sent over headers, and what the header name for error messages is, is just not nice. I would even argue that, at least without helper functions, it violates the spec.
Experience based argument
To my knowledge, every widely used gRPC library in existence completely hides HTTP from their users. Custom metadata is transferred with headers but exposed under the name "custom metadata", etc. Also, the data type for storing an error response (the equivalent of
tower_grpc::Error
) stores only the status code, the error message and the binary error details, not arbitrary HTTP header data. See for example the C++ API:grpc::Status
Because all of the widely used gRPC libraries get away with not exposing low level HTTP2 details to their users, I think it's highly likely that the same will be true for tower-grpc. In fact, just being more similar to the other gRPC libraries is on its own a good reason for this because of the reduced learning curve. (There are ways that tower-grpc is different from other gRPC libraries because it's in Rust, but I don't think this is one of those.)
Proposed changes
Simplify
tower_grpc::Error
to only store the information that other gRPC libaries store. Don't store HTTP stuff, and don't store prost object or anything else. gRPC specifies how to map HTTP errors and other kinds of errors into this format. Although not strongly typed the gRPC Status type is powerful enough to encode any kind of error. As discussed in Refine Error type #6, doing this would make the API more friendly to use as well because at the moment a server side error is not the same type as a client side error, which causes pain when servers need to propagate errors.Similarly, hide HTTP types from the
tower_grpc::Request
andtower_grpc::Response
types. Only expose fields in terms of gRPC. This would make the API more similar to the other gRPC implementations, make it easier to understand because users won't have to know the details of the gRPC/HTTP2 mapping, and make it possible to make a more friendly API for custom metadata.The text was updated successfully, but these errors were encountered: