Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Finalising the Transport API for 1.0. #1274

Closed
tomchristie opened this issue Sep 10, 2020 · 5 comments
Closed

Finalising the Transport API for 1.0. #1274

tomchristie opened this issue Sep 10, 2020 · 5 comments
Labels
user-experience Ensuring that users have a good experience using the library
Milestone

Comments

@tomchristie
Copy link
Member

tomchristie commented Sep 10, 2020

There's a few bits of thinking around the Transport API that we'd like to be completely set on before we roll with a 1.0 release.

  • We should consider splitting the sync/async interface as .request/.arequest. This would allow transport implementations that support both sync+async usage, which would be a really nice usability for some use-cases.
  • There's a very strong case to be made for .request being a context manager. See Should Transport.request() return a context manager? httpcore#145 - Needs some very careful investigation.
  • We need to do a bit of thinking around making custom transports easier to use. Eg from a POV of not having to specify defaults verbosely as soon as dropping down to that level. The docs at https://www.python-httpx.org/advanced/#custom-transports are a good example of where we could be making some usability improvements.
  • We ought to consider exposing something like httpx.HTTPTransport(...) rather than forcing users to drop into httpcore once they're configuring transports directly.
  • The work on exposing the content stream API is also somewhat related... See Exposing the content stream API #1253

To give a few examples of the effect of some of these decisions...

Expose low-level configuration options on the transport but not the client, without having gnarly extra configuration needed...

client = httpx.Client(transport=httpx.HTTPTransport(local_address="..."))

Mount a customised transport for one specific domain, without having gnarly extra configuration needed...

client = httpx.Client(..., mounts={"all://www.example.com": httpx.HTTPTransport(verify=False, retries=1)})

Transport API using .request and .arequest...

# A custom transport class, that always redirects to https URLs, and supports both sync and async API,
# meaning it can be used with either kind of client.
client = httpx.Client(..., mounts={"http": HTTPSRedirect()})
client = httpx.AsyncClient(..., mounts={"http": HTTPSRedirect()})
@tomchristie tomchristie added the user-experience Ensuring that users have a good experience using the library label Sep 10, 2020
@tomchristie tomchristie added this to the v1.0 milestone Sep 10, 2020
@tomchristie
Copy link
Member Author

tomchristie commented Sep 18, 2020

I've been doing some thinking around this, and I do think we'll want to re-jig the Transport API slightly.
Here's where I'm at...

1. Supporting both sync & async on a single class

I think we really want .request() and .arequest(). Being able to have classes such as a single httpx.MockTransport(handler) that can work with either case is a really good argument in favour of this. Additionally I can imagine midleware-like functionality such as HTTPSOnlyTransport, HSTSTransport that'd benefit from being able to work with both async and sync clients. Switching to .request/.arequest is also a really easy one for us to do.

The next set is a bit more complex to talk through. So...

2. Optional context / extensions

We're aware that we might at some point want to add functionality such as trailing headers #1149 or server push, that don't currently fit in to the existing API. The proposal for trailing headers was that when we got there we could essentially just bolt it on to the content streams as an additional method.

We also have some really nice functionality where we can point httpx at a WSGI or ASGI app. Really I'd like to replace starlette's requests-based test client with httpx. One area where the transport API doesn't fully support us here is that for templated responses, you'd really like to be able to inspect eg response.template == 'index.html' and response.context == {...}, rather than only having the raw rendered HTML to work with. The Transport API doesn't have any way of passing additional out-of-bounds information through it, so there's no great way to do this. (Even though the ASGI spec could end up with an extension that does support it at the ASGI layer - django/asgiref#135)

As a result, here's what I'm thinking...

def request(method, url, headers, stream, context):
    return (status_code, headers, stream, context)

Here, the context is used for transport context, that provides optional additional information.

Once you start digging there's actually quite a lot of areas where you might want to dig into additional extentions/context...

Response context

  • http_version - Optional.
  • reason_phrase - Optional. Not used for HTTP/2. Most clients will just want the status code, but if you want to expose exactly what comes back from the network, then this is useful. Feels rather nice that we can drop this out into an optional, since many transports won't care about providing it at all.
  • Server push
  • Trailing headers
  • Sendfile
  • Test info including template name, template context
  • Raw connection, for CONNECT/Upgrade requests

Request context

  • timeout - Feels a bit nicer having this in the context.
  • http_version/client_addr/mount_path - If the Transport API was being called into by a server rather than a client, we would want to populate these, in order to encompass the full set of info that WSGI/ASGI provides.

The raw connection one if particularly interesting, since it would allow us to support CONNECT & Upgrade & HTTP/2 bi-directional streaming #1150 without needing a new kind of method on the transport API in order to do so. Eg...

with transport.request(method, url, headers, stream, context) as response:
    status_code, headers, stream, context = response
    if status_code >= 400:
        ...  # request was declined

    # CONNECT requests, Upgrade requests, and HTTP/2 requests may provide
    # an interface onto the raw connection. For HTTP/2 the "connection" will represent
    # a single data stream. Once we have a connection we can perform
    # raw I/O on it.
    connection = context["connection"]
    while True:
        outgoing = input()
        print(f">>> {outgoing}")
        connection.send(outgoing.encode("utf-8")
        incoming = connection.recv()
        print(f"<<< {incoming}")

This allows third parties to build components such as websockets or gRPC on top of httpx. It might also allow us to revisit our proxy implementations, so that they're properly compose with the connection pool, rather than subclassing it and hooking into bit of otherwise private behaviour.

3. Transport API as a context manager

I think I'm fairly sold on #145. Note that 2 together with 3 mean that stream really can just by Iterable[bytes]/AsyncIterable[bytes], since we don't need a close() method on the byte stream, and we wouldn't ever be planning on stuffing extra methods onto the stream interface, such as .get_trailing_headers.

@tomchristie
Copy link
Member Author

I think we probably ought to get a bit more lenient on the allowable types.

For instance... h11 is really careful about it's types too, but allows either bytes or ascii-only strings on the inputs.

We could do the same thing and...

  • Allow either bytes or ascii-only strings for method
  • Allow either bytes or ascii-only strings for the headers key/values.
  • Allow URL to be either a raw parsed URL tuple or a str.

The nice thing together with httpx.HTTPTransport is that we end up in a place where we've got a really tightly defined low-level transport interface, but it's also really nice and user friendly.

We could??? even consider adding mandatory headers automatically for the case where headers = None...

  • Default headers=None to include an automatic Host header.
  • Default headers=None to include an automatic Transfer-Encoding: chunked if stream is not None.
# Make a request using the transport API directly.
transport = httpx.HTTPTransport()
with transport.request("GET", "https://www.example.com") as response:
    status_code, headers, stream, ext = response
    body = b"".join(stream)

@florimondmanca
Copy link
Member

@tomchristie Just to confirm something I'm not yet sure about…

Eventually, are we meaning to:

  • Have a single httpcore.HTTPTransport class.
  • Or make sure httpcore.SyncHTTPTransport and httpcore.AsyncHTTPTransport have a different request/close entrypoint (request + close, vs arequest + aclose) (which is done now) so that an httpx.HTTPTransport can be added, implementing both interfaces into a single class API?

In case 1/ this would lead us to reconsider a big bunch of things, such as our unasync approach, and the general way we'd expect people to build transports, not thinking about breaking API changes, so… Just wanted to clarify that aspect. :)

@tomchristie
Copy link
Member Author

I don't expect us to have a single httpcore.HTTPTransport class, nope.
It's more making sure that they have different entry points, yup, so that...

  • We can have a single httpx.MockTransport, that works for both cases.
  • Intermediaries such as CacheControl can have implementations that cover both cases.
  • We could potentially have a single httpx.HTTPTransport, that works for both cases, dispatching to whichever is required. (Tho I don't know if we want that for sure vs. httpx.HTTPTransport + httpx.AsyncHTTPTransport, or not.)

netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue Oct 1, 2020
0.11.1

Fixed
- Add await to async semaphore release() coroutine
- Drop incorrect curio classifier


0.11.0

The Transport API with 0.11.0 has a couple of significant changes.

Firstly we've moved changed the request interface in order to allow extensions, which will later enable us to support features
such as trailing headers, HTTP/2 server push, and CONNECT/Upgrade connections.

The interface changes from:

```python
def request(method, url, headers, stream, timeout):
    return (http_version, status_code, reason, headers, stream)
```

To instead including an optional dictionary of extensions on the request and response:

```python
def request(method, url, headers, stream, ext):
    return (status_code, headers, stream, ext)
```

Having an open-ended extensions point will allow us to add later support for various optional features, that wouldn't otherwise be supported without these API changes.

In particular:

* Trailing headers support.
* HTTP/2 Server Push
* sendfile.
* Exposing raw connection on CONNECT, Upgrade, HTTP/2 bi-di streaming.
* Exposing debug information out of the API, including template name, template context.

Currently extensions are limited to:

* request: `timeout` - Optional. Timeout dictionary.
* response: `http_version` - Optional. Include the HTTP version used on the response.
* response: `reason` - Optional. Include the reason phrase used on the response. Only valid with HTTP/1.*.

See encode/httpx#1274 (comment) for the history behind this.

Secondly, the async version of `request` is now namespaced as `arequest`.

This allows concrete transports to support both sync and async implementations on the same class.

Added
- Add curio support.
- Add anyio support, with `backend="anyio"`.

Changed
- Update the Transport API to use 'ext' for optional extensions.
- Update the Transport API to use `.request` and `.arequest` so implementations can support both sync and async.


0.10.2

Added
- Added Unix Domain Socket support.

Fixed
- Always include the port on proxy CONNECT requests.
- Fix `max_keepalive_connections` configuration.
- Fixes behaviour in HTTP/1.1 where server disconnects can be used to signal the end of the response body.

0.10.1
- Include `max_keepalive_connections` on `AsyncHTTPProxy`/`SyncHTTPProxy` classes.


0.10.0

The most notable change in the 0.10.0 release is that HTTP/2 support is now fully optional.

Use either `pip install httpcore` for HTTP/1.1 support only, or `pip install httpcore[http2]` for HTTP/1.1 and HTTP/2 support.

Added
- HTTP/2 support becomes optional.
- Add `local_address=...` support.
- Add `PlainByteStream`, `IteratorByteStream`, `AsyncIteratorByteStream`. The `AsyncByteSteam` and `SyncByteStream` classes are now pure interface classes.
- Add `LocalProtocolError`, `RemoteProtocolError` exceptions.
- Add `UnsupportedProtocol` exception.
- Add `.get_connection_info()` method.
- Add better TRACE logs.

Changed
- `max_keepalive` is deprecated in favour of `max_keepalive_connections`.

Fixed
- Improve handling of server disconnects.
@tomchristie
Copy link
Member Author

Closed via #1522

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
user-experience Ensuring that users have a good experience using the library
Projects
None yet
Development

No branches or pull requests

2 participants