Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider further adjustments to the headers structure. #205

Open
Lukasa opened this issue Apr 22, 2016 · 0 comments
Open

Consider further adjustments to the headers structure. #205

Lukasa opened this issue Apr 22, 2016 · 0 comments

Comments

@Lukasa
Copy link
Member

Lukasa commented Apr 22, 2016

@rbtcollins and I have had a really interesting discussion on IRC about #194, and in particular about the API decision that was made in that issue. For those who don't remember, the end result was that we have a tuple-like datastructure (actually a tuple-subclass) that has a flag on it to indicate whether a header field should be indexable at the HPACK layer.

@rbtcollins has pointed out that this API is a bit prone to mis-use. In particular, the HeaderTuple quacks very much like a tuple, right the way down to its equality comparison (NeverIndexedHeaderTuple('name', 'value') == ('name', 'value')). This runs the risk of users who incautiously use this API failing to notice the distinction, and worse, failing to preserve it when working with the header fields.

Now, this is mostly a concern for intermediaries. hyper-h2 does some automatic never-indexing for some fields, and in most cases for clients/servers that will be enough (and we'll likely add a new API for hooking into that shortly). However, it's vital that intermediaries preserve the relevant semantics of these header fields through their manipulations, and that is something that they could easily get wrong if they're insufficiently cautious.

So we should consider whether or not we can replace this with a better API. The "headers as list-of-tuples" API is a really good one, partly because it preserves all the relevant data about the header fields, but also because it makes header fields immutable data structures, which solves a whole lot of pain points (and indeed we should ask ourselves whether hyper-h2 should convert the header iterable to a tuple, instead of a list, when emitting it to the user on events). We want to preserve this as much as possible.

Three alternative ideas have been proposed, the first two of which of which are backward-incompatible with the current headers representation and would require a SemVer Major version bump.

  1. Some kind of composite data structure. Essentially, we'd change headers from a HeaderTuple to a dictionary: a bit like {'indexable': True, 'field': field}, where field is a HeaderTuple or tuple.
  2. Wrap the header tuple, rather than subclass. Essentially, have HeaderField = namedtuple('HeaderField', ['indexable', 'field']).
  3. Adjust the equality logic of HeaderTuple to prevent the NeverIndexedHeaderField from comparing equal to a regular tuple.

My thoughts:

  • If we believe this is a problem that requires an API change, I don't think (3) goes far enough. If this information is important enough to make users think about it, (3) doesn't do that: in particular, it makes it all-to-easy for users to accidentally throw this information away (by unpacking and repacking the tuple). It's basically only marginally better than the current state, and doesn't save users from their most likely mistake.
  • (1) loses some of our immutability wins: dicts are mutable, and that's pretty sad. Additionally, there's no dot-access. On the other hand, it's very easily extensible.
  • Of the set, (2) is probably my favourite.
  • Regardless of what we do, revising this API is a big change: we'll affect two functions (send_headers, push_stream) and five events (RequestReceived, ResponseReceived, InformationalResponseReceived, TrailersReceived, PushedStreamReceived), including two events that are in the basic set of events that all implementations will want to handle. Essentially, this means that if we do (1) or (2) we'll definitely break every single implementation that currently exists.

At this point I'd like to solicit some feedback from the community. Is this a problem that's big enough to justify a fairly substantial API revision? If so, do you have a preferred API design? If not, why not?

I'm going to CC: @mhils @Kriechi @hawkowl @sigmavirus24 and @jimcarreer as the community reps on this. Feel free to also CC anyone else who might have a useful opinion here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant