Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transport API #1522

Merged
merged 26 commits into from Mar 24, 2021
Merged

Transport API #1522

merged 26 commits into from Mar 24, 2021

Conversation

tomchristie
Copy link
Member

@tomchristie tomchristie commented Mar 22, 2021

We have a little niggle in our API. Almost all the way through httpcore is treated as an implementation detail, that just happens to be used by the default transport implementation, httpx.HTTPTransport.

The one place where this isn't the case is in our "Transport API". In that case we're explicitly documenting "subclass httpcore.SyncHTTPTransport and implement it's API".

To make this clearer, here's how the "custom transports" docs look after this change...

(Note I've updated this description to also incorporate the follow-up in #1550)


Writing custom transports

A transport instance must implement the low-level Transport API, which deals with sending a single request, and returning a response. You should either subclass httpx.BaseTransport to implement a transport to use with Client, or subclass httpx.AsyncBaseTransport to implement a transport to use with AsyncClient.

At the layer of the transport API we're simply using plain primitives. No Request or Response models, no fancy URL or Header handling. This strict point of cut-off provides a clear design separation between the HTTPX API, and the low-level network handling.

See the handle_request and handle_async_request docstrings for more details on the specifics of the Transport API.

A complete example of a custom transport implementation would be:

import json
import httpx


class HelloWorldTransport(httpx.BaseTransport):
    """
    A mock transport that always returns a JSON "Hello, world!" response.
    """

    def handle_request(self, method, url, headers, stream, extensions):
        message = {"text": "Hello, world!"}
        content = json.dumps(message).encode("utf-8")
        stream = httpx.ByteStream(content)
        headers = [(b"content-type", b"application/json")]
        extensions = {}
        return 200, headers, stream, extensions

Which we can use in the same way:

>>> import httpx
>>> client = httpx.Client(transport=HelloWorldTransport())
>>> response = client.get("https://example.org/")
>>> response.json()
{"text": "Hello, world!"}

Examples of calling into a transport instance directly

Standard case:

with httpx.HTTPTransport() as transport:
    status_code, headers, stream, extensions = transport.handle_request(
        method=b'GET',
        url=(b'https', b'www.example.com', 443, b'/'),
        headers=[(b'Host', b'www.example.com')],
        stream=httpx.ByteStream(b""),
        extensions={}
    )
    body = stream.read()

Streaming case:

with httpx.HTTPTransport() as transport:
    status_code, headers, stream, extensions = transport.handle_request(
        method=b'GET',
        url=(b'https', b'www.example.com', 443, b'/'),
        headers=[(b'Host', b'www.example.com')],
        stream=httpx.ByteStream(b""),
        extensions={}
    )
    try:
        body = [part for part in stream]
    finally:
        stream.close()

Here's a checklist of differences between the previous interface, for developers of custom transports.

  • Transports should subclass httpx.BaseTransport or httpx.AsyncBaseTransport, not the previous httpcore classes.
  • The override method is now handle_request()/handle_async_request(), not request()/arequest(). I think it's far cleaner not to have a name clash vs. client.request().
  • The 'ext' argument is now named 'extensions'. Because.
  • The 'headers', 'stream', and 'extensions' arguments are no longer optional. Helps simplify implementations marginally.
  • Transports should raise exceptions from the httpx.TransportError hierarchy, instead of httpcore exceptions. Note that httpcore->httpx exception mapping is no longer provided by the client, but instead is handled more correctly within the HTTPTransport() implementation.
  • Streams are now subclasses of either httpx.SyncByteStream/httpx.AsyncByteStream. For simple cases the concrete implementation httpx.ByteStream() provides for returning some fixed byte content, supporting both sync and async interfaces.
  • We now use reason_phrase instead of reason as an extension key, because it then properly follows the HTTP spec naming.
  • The reason_phrase and http_version extensions are now bytes, instead of str, for consistency with other elements in the interface, and because, when present in the response they are bytes on the wire. (Note that HTTP/2 onwards does not include version or reason information.)

Follow-ups that we're likely to want, that we can treat as seperate PRs once this is in...

  • We'll want to issue an httpcore 0.13 version that also has an equivalent API update. Not strictly necessary, but it'd be neater.
  • We probably want to introduce a convenience method transport.send_request(method, url, ...) that provides a nice httpx-level API onto transports, and returns fully fledged Response instances. This would be implemented once on the base classes, and would allow users to test transport instances directly. Eg. r = t.send_request("GET", "https://www.example.com") (Also note the naming send_request()/handle_request(), makes sense together, and avoids clashes with any methods on the client)

@StephenBrown2
Copy link
Contributor

Looking real nice!

The 'ext' argument is now named 'extensions'. Because.

That bugged me too. Glad to see it.

Also in favor of the follow-ups, all seem reasonable and beneficial.

* Push httpcore exception wrapping out of client into transport

* Include close/aclose extensions in docstring

* Comment about the request property on RequestError exceptions
* Extensions reason_phrase and http_version as bytes

* Update BaseTransport docstring
@tomchristie
Copy link
Member Author

I think this one is ready to go now.

@florimondmanca not sure what your thoughts on context managed interface at this layer are, but even if we do want to discuss that further I think we'd be way better off doing that with this as the starting point. The following core points all apply just as well regardless of if handle_request is a regular function call, or if it is a context manager...

  • The Transport API should be an HTTPX interface.
  • Exceptions raised by transports should be HTTPX exceptions. The client shouldn't have any baked-in httpcore knowledge.
  • Let's use plain byte iterables for stream interfaces.
  • We'd like to call this handle_request in preference to request.
  • We don't need optional arguments at this interface layer.
  • Let's use "extensions" not "ext"
  • Let's use "reason_phrase" not "reason"
  • Specify "reason_phrase" and "http_version" as bytes for preserving raw off-the-wire info.

@tomchristie
Copy link
Member Author

tomchristie commented Mar 23, 2021

Presumably this proposed 0.18 API will be relevant to @lundberg (Because RESPX) and @johtso (Some work on an HTTP Caching transport, I think?)

@florimondmanca
Copy link
Member

florimondmanca commented Mar 23, 2021

@tomchristie Most of the points I can agree with, two questions:

  • What would be the status of HTTPCore at this point? Mostly an implementation that works well if wrapped around the HTTPX transport API?
  • Not sure I followed the rationale for handle_request. Is that just a naming nit, or is it chosen not to collide with request in the existing HTTPCore implementation? Or some other motivation?

@tomchristie
Copy link
Member Author

tomchristie commented Mar 23, 2021

There's not really any difference in httpcore except that from the perspective of HTTPX users it's now entirely an implementation detail. We're not leaking any httpcore specific stuff into the HTTPX Transport API anymore.

The handle_request makes sense to me in for two reasons:

  1. In order to properly differentiate it from the request method on the client.
  2. So that we can introduce a convenience response = t.send_request("GET", "http://www.example.com") method on the transport class. (A nice usability one for testing etc). A handle_request/send_request pair makes way more sense than a request/send_request pair.

@florimondmanca
Copy link
Member

Sounds good. :-)

Copy link
Member

@florimondmanca florimondmanca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very satisfying. 😄

Leaving some nits, and a somewhat important one about Iterator vs Iterable in the Transport API signature.

CHANGELOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
httpx/_transports/base.py Outdated Show resolved Hide resolved
httpx/_transports/base.py Outdated Show resolved Hide resolved
httpx/_transports/base.py Outdated Show resolved Hide resolved
tomchristie and others added 3 commits March 24, 2021 09:32
Co-authored-by: Florimond Manca <florimond.manca@gmail.com>
Co-authored-by: Florimond Manca <florimond.manca@gmail.com>
@johtso
Copy link
Contributor

johtso commented Mar 24, 2021

Looks great! Will have a play trying to use this in httpx-caching and see what the diff looks like :)

@johtso
Copy link
Contributor

johtso commented Mar 24, 2021

Streams are now plain typing.Iterable[byte]/typing.AsyncIterable[byte]. Transports that require a close callback can use the close/aclose extension.

So am I right in thinking there is no longer a concept of .closeing a stream?

@tomchristie
Copy link
Member Author

So there's a close or aclose callback that may be present in the extensions...

https://github.com/encode/httpx/blob/master/httpx/_transports/base.py#L89

For your codebase it looks to me like where you're currently using response.stream.close() you'll instead want close_callback = response.ext.get('close') then if close_callback is not None: close_callback().

(Or something similar)

@johtso
Copy link
Contributor

johtso commented Mar 24, 2021

Thanks @tomchristie! I just figured this out after I saw errors when trying to serialize extensions :)

@johtso
Copy link
Contributor

johtso commented Mar 24, 2021

So, that was fairly painless! https://github.com/johtso/httpx-caching/pull/4/files
I'm still using httpcore.PlainByteStream to wrap the response body in an async agnostic way when retrieved from the cache..

@lundberg
Copy link
Contributor

lundberg commented Apr 8, 2021

Now RESPX also has a PR lundberg/respx#142 respecting this new transport API!

@tomchristie
Copy link
Member Author

tomchristie commented Apr 8, 2021

@lundberg Fantastic! It'd be worth you (and @johtso) taking a look over #1550, which I'm rather keen on.

@balki
Copy link

balki commented May 1, 2021

This change silently breaks users of existing httpcore transport users by appearing to 'work' fine.
E.g. https://github.com/romis2012/httpx-socks/blob/v0.3.1/httpx_socks/_async_transport.py#L11
AsyncConnectionPool happens to have handle_async_request method which gets called instead of the arequest method.
This means the request does not go via the socks proxy but goes directly. Tor being a common usecase for socks proxy, with this change, users who thought they are connecting via tor are now exposed.

May be add a check like

if not isinstance(transport, httpx.AsyncBaseTransport): raise ....

@romis2012
Copy link

This change silently breaks users of existing httpcore transport users by appearing to 'work' fine.
E.g. https://github.com/romis2012/httpx-socks/blob/v0.3.1/httpx_socks/_async_transport.py#L11

This issue was fixed in httpx-socks v0.4.0

MarcAbonce added a commit to MarcAbonce/searx that referenced this pull request Jul 16, 2021
update httpx-socks to 0.4.0 to fix:
encode/httpx#1522 (comment)

remove not_evil which has been down for a while now:
https://old.reddit.com/r/onions/search/?q=not+evil&restrict_sr=on&t=year
MarcAbonce added a commit to MarcAbonce/searxng that referenced this pull request Jul 16, 2021
update httpx-socks to 0.4.0 to fix:
encode/httpx#1522 (comment)

remove not_evil which has been down for a while now:
https://old.reddit.com/r/onions/search/?q=not+evil&restrict_sr=on&t=year
MarcAbonce added a commit to MarcAbonce/searx that referenced this pull request Jul 25, 2021
downgrade httpx:
PR encode/httpx#1522
made some breaking changes in AsyncHTTPTransport that affect
our code in https://github.com/searx/searx/blob/master/searx/network/client.py

remove not_evil which has been down for a while now:
https://old.reddit.com/r/onions/search/?q=not+evil&restrict_sr=on&t=year
kvch pushed a commit to searx/searx that referenced this pull request Aug 2, 2021
downgrade httpx:
PR encode/httpx#1522
made some breaking changes in AsyncHTTPTransport that affect
our code in https://github.com/searx/searx/blob/master/searx/network/client.py

remove not_evil which has been down for a while now:
https://old.reddit.com/r/onions/search/?q=not+evil&restrict_sr=on&t=year
dalf added a commit to dalf/searxng that referenced this pull request Aug 17, 2021
adjust searx.network module to the new internal API
see encode/httpx#1522
dalf added a commit to dalf/searxng that referenced this pull request Aug 18, 2021
adjust searx.network module to the new internal API
see encode/httpx#1522
dalf added a commit to dalf/searxng that referenced this pull request Aug 24, 2021
adjust searx.network module to the new internal API
see encode/httpx#1522
dalf added a commit to dalf/searxng that referenced this pull request Sep 16, 2021
adjust searx.network module to the new internal API
see encode/httpx#1522
dalf added a commit to dalf/searxng that referenced this pull request Sep 17, 2021
adjust searx.network module to the new internal API
see encode/httpx#1522
dalf added a commit to dalf/searxng that referenced this pull request Sep 17, 2021
adjust searx.network module to the new internal API
see encode/httpx#1522
MarcAbonce added a commit to MarcAbonce/searx that referenced this pull request Sep 23, 2021
downgrade httpx:
PR encode/httpx#1522
made some breaking changes in AsyncHTTPTransport that affect
our code in https://github.com/searx/searx/blob/master/searx/network/client.py

remove not_evil which has been down for a while now:
https://old.reddit.com/r/onions/search/?q=not+evil&restrict_sr=on&t=year
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants