Custom JSON library #717

Open
victoraugustolls opened this issue Jan 3, 2020 · 32 comments · May be fixed by #3199
Labels
question Further information is requested

Comments

@victoraugustolls

victoraugustolls commented Jan 3, 2020

Hi!

Is there a way to use an alternative JSON library to decode the request? Like orjson for example?

Thanks!

@florimondmanca florimondmanca added the question Further information is requested label Jan 3, 2020
@tomchristie
Member

You'd need to do that explicitly. I think it'd look like this to encode the request...

httpx.post(headers={'Content-Type': 'application/json'}, data=orjson.dumps(...))

...and like this, to decode the response:

orjson.loads(response.text)
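
A minimal sketch of the explicit approach, with the encode and decode steps pulled into two helpers. The names `json_request_kwargs` and `decode_json` are made up for illustration, and the stdlib fallback is only there so the snippet runs without orjson installed. It passes the pre-encoded bytes as `content=`.

```python
import json

try:
    import orjson  # optional third-party dependency

    dumps, loads = orjson.dumps, orjson.loads
except ImportError:
    # Fallback so the sketch runs anywhere: mimic orjson's compact bytes output.
    def dumps(obj):
        return json.dumps(obj, separators=(",", ":")).encode("utf-8")

    loads = json.loads


def json_request_kwargs(payload):
    """Build keyword arguments for a request with a pre-encoded JSON body."""
    return {
        "content": dumps(payload),
        "headers": {"Content-Type": "application/json"},
    }


def decode_json(body):
    """Decode a response body (bytes or str) with the chosen library."""
    return loads(body)


kwargs = json_request_kwargs({"message": "Hello, world"})
# With httpx installed, this could be sent and decoded as, for example:
#   response = httpx.post("https://example.org", **kwargs)
#   data = decode_json(response.content)
```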

@florimondmanca
Member

florimondmanca commented Jan 3, 2020

Alternatively, for a more automated solution, you could probably get away with a sys.modules hack? 😅

Here's an example — it uses a wrapper module to add verification-only print statements, but you can skip it and just use sys.modules["json"] = orjson.

# spy_orjson.py
import orjson


def loads(text):
    print("It works! Loading...")
    return orjson.loads(text)


def dumps(text):
    print("It works! Dumping...")
    return orjson.dumps(text)


# main.py
import sys
import spy_orjson

sys.modules["json"] = spy_orjson

import httpx

request = httpx.Request("GET", "https://example.org")
content = b'{"message": "Hello, world"}'
response = httpx.Response(
    200, content=content, headers={"Content-Type": "application/json"}, request=request
)

print(response.json())

Output:

$ python main.py
It works! Loading...
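
For anyone trying this, here is a stdlib-only sketch of the mechanism and two of its caveats: only imports resolved after the patch see the replacement, and the replacement has to expose every attribute callers might use (for example `JSONDecodeError`). The `spy` module is a stand-in built on the real `json`, where orjson would otherwise go.

```python
import json as real_json
import sys
import types

calls = []


def spy_loads(text, **kwargs):
    calls.append("loads")
    return real_json.loads(text, **kwargs)


# Stand-in replacement module (hypothetical; orjson would take this role).
spy = types.ModuleType("json")
spy.loads = spy_loads
# Anything callers might touch has to exist on the replacement as well.
spy.dumps = real_json.dumps
spy.JSONDecodeError = real_json.JSONDecodeError

original = sys.modules["json"]
sys.modules["json"] = spy
try:
    import json as patched  # resolved through sys.modules, so this is the spy

    result = patched.loads('{"message": "Hello, world"}')
finally:
    sys.modules["json"] = original  # undo the patch
```

Here `patched` is the spy, while `real_json`, imported before the patch, still points at the standard library module. That ordering dependence is what makes the hack fragile.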

@victoraugustolls
Author

That's great! Thanks!!

@dmig

dmig commented Oct 2, 2020

@florimondmanca such an ugly hack...

Why not implement this feature? Looking at orjson and simdjson libraries, this may be used to improve performance a lot.

I'll try to implement this.

@tomchristie
Member

I'd be okay with us providing an easy way to patch this in, if it follows something similar to how requests allows for this... psf/requests#1595

@dmig-alarstudios

I'm looking at the code currently, don't see an easy way...

I'll probably create an httpx.jsonlib with loads and dumps, which can be overridden later. Not the cleanest solution, but it would allow using, e.g.:

  • orjson for loads and dumps (which returns bytes)
  • simdjson for loads and orjson for dumps
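
Since some encoders return `str` and others `bytes`, any pluggable `dumps` needs a normalization step. A rough sketch, where `encode_body` and `bytes_dumps` are hypothetical names and `bytes_dumps` stands in for a bytes-returning encoder such as `orjson.dumps`:

```python
import json
from typing import Any, Callable, Union

Dumps = Callable[[Any], Union[str, bytes]]


def encode_body(obj: Any, dumps: Dumps = json.dumps) -> bytes:
    """Encode obj with the given dumps and always return bytes."""
    raw = dumps(obj)
    return raw if isinstance(raw, bytes) else raw.encode("utf-8")


def bytes_dumps(obj: Any) -> bytes:
    # Stand-in for an encoder that already returns bytes.
    return json.dumps(obj).encode("utf-8")
```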

@Kludex
Sponsor Member

Kludex commented Dec 29, 2022

There were two PRs closed because they were stale, so I'm going to just reopen this one for us to have a conclusion.

What about adding a new parameter to the client? Something like json_lib? Was this discarded already?

import httpx
import orjson

httpx.Client(json_lib=orjson)

@zanieb
Contributor

zanieb commented Dec 29, 2022

Maybe it'd be best to be able to specify dumps/loads separately both for user control and to avoid doing getattr to get the dumps/loads methods? Perhaps:

httpx.Client(json_encoder=orjson.dumps, json_decoder=orjson.loads)

@islam-aymann

Maybe it'd be best to be able to specify dumps/loads separately both for user control and to avoid doing getattr to get the dumps/loads methods? Perhaps:

httpx.Client(json_encoder=orjson.dumps, json_decoder=orjson.loads)

It would be great to use the same names of Pydantic

httpx.Client(json_loads=orjson_loads, json_dumps=orjson_dumps)
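
The two shapes discussed here (one `json_lib` module versus separate callables) are not mutually exclusive. A small sketch, assuming a hypothetical `JSONCodec` holder that is not part of httpx: the module form is resolved once up front, and the callable form allows mixing libraries.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class JSONCodec:
    """Hypothetical pair of encode/decode callables."""

    dumps: Callable[[Any], Any]
    loads: Callable[[Any], Any]

    @classmethod
    def from_module(cls, module: Any) -> "JSONCodec":
        # json_lib style: the attribute lookup happens once, here.
        return cls(dumps=module.dumps, loads=module.loads)


# Module style, using the stdlib json module as the example library.
module_style = JSONCodec.from_module(json)

# Callable style: encoder and decoder are chosen independently.
mixed_style = JSONCodec(
    dumps=lambda obj: json.dumps(obj, sort_keys=True),
    loads=json.loads,
)
```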

@rikroe

rikroe commented Feb 22, 2023

If somebody else comes across this: to be compatible with mypy and to have correct typing, one has to use content instead of the data originally suggested:

httpx.post(headers={'Content-Type': 'application/json'}, content=orjson.dumps(...))

@xbeastx

xbeastx commented Aug 1, 2023

3 years later...
There is even a pull request implementing this: #1352.
Why was it closed?

@zanieb
Contributor

zanieb commented Aug 1, 2023

@xbeastx I think it's quite clearly articulated at #1352 (comment) why that pull request went stale.

There is some additional helpful context at #1730 (comment) and discussion at #1740

@tomchristie
Member

tomchristie commented Aug 3, 2023

I've not been sufficiently happy with any of the API proposals so far, and I've essentially been vetoing them.

Let me nudge something here that could be viable(?)...

client = httpx.Client(request_class=..., response_class=...)

I can explain why I (potentially) like that if needed. Perhaps the design sense will speak for itself.


Edit 8th Sept 2023:

That API would allow for this kind of customization...

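import httpx
import orjson

class APIRequest(httpx.Request):
    def __init__(self, *args, **kwargs):
        if 'json' in kwargs:
            # Encode with orjson and pass the bytes through as raw content.
            content = orjson.dumps(kwargs.pop('json'))
            headers = dict(kwargs.get('headers') or {})
            headers['Content-Type'] = 'application/json'
            headers['Content-Length'] = str(len(content))  # header values must be strings
            kwargs['content'] = content
            kwargs['headers'] = headers
        super().__init__(*args, **kwargs)

class APIResponse(httpx.Response):
    def json(self):
        return orjson.loads(self.content)

# Defined last so that the two classes it references already exist.
class APIClient(httpx.Client):
    request_class = APIRequest
    response_class = APIResponse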

@DeadWisdom

I'm hitting this issue as I type, and yikes, this is so complicated. 95% of use-cases would be solved if you could just do something like httpx.set_json_handlers(loads=orjson.loads, dumps=orjson.dumps). I'm not doing this on a per Client basis. If I'm using orjson, I'm using orjson everywhere. Also, if I need to be fancy, I can wrap it up in another function.

@tomchristie
Member

95% of use-cases would be solved if you could just do something like httpx.set_json_handlers(loads=orjson.loads, dumps=orjson.dumps)

I do see that. The issue with that approach is that you introduce subtly different JSON handling at a distance. Installing a new dependency in your project could end up altering the behaviour of an API client without that being visible anywhere obvious in the project codebase.
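
To make the "at a distance" concern concrete, here is a toy illustration. The registry, `set_json_handlers`, and `send` are hypothetical stand-ins and not httpx code. The point is only that a call made anywhere in the process changes what every later caller produces.

```python
import json

# Hypothetical module-level registry, as a global setter would require.
_handlers = {"dumps": json.dumps}


def set_json_handlers(dumps):
    _handlers["dumps"] = dumps


def send(payload):
    """Stand-in for a client call that encodes with the global handler."""
    return _handlers["dumps"](payload)


before = send({"b": 1, "a": 2})

# Somewhere else in the process, e.g. at import time of another package:
set_json_handlers(
    lambda obj: json.dumps(obj, sort_keys=True, separators=(",", ":"))
)

# Same call, same payload, different bytes on the wire.
after = send({"b": 1, "a": 2})
```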

I'm not doing this on a per Client basis.

Do you have more than one client instance across the codebase?

@dmig

dmig commented Sep 8, 2023

Do you have more than one client instance across the codebase?

This is a very normal situation in a microservice environment. It's one of the reasons this issue exists.

@tomchristie
Member

tomchristie commented Sep 8, 2023

This comment suggests an API that I wouldn't object to.

Once you've added that code you'd be able to use APIClient instead of httpx.Client everywhere throughout the project.

It's not exactly what some of y'all are requesting, but the critical sticking point here is this: I can't see myself doing anything other than veto'ing proposals that use a form of global state.

@zanieb
Contributor

zanieb commented Sep 8, 2023

I strongly agree that global state is not a good path forward for the library. I like the request_class and response_class approach — that would also help with some other issues like custom wrappers for response errors.

For those who want to configure the JSON library globally in your projects, it'd be trivial to subclass the httpx.Client as described or wrap client retrieval in a helper method.

@DeadWisdom

Do you have more than one client instance across the codebase?

Well yes, I'm doing with httpx.AsyncClient() as client all the time.

It'd be trivial to subclass the httpx.Client as described or wrap client retrieval in a helper method.

That's probably what I'll do, just wrap the client. It's not immediately obvious that this is what you should do, though. Maybe make it a recipe in the docs? At least until there is a settled solution.

Overall, I'll say this is a classic case of pragmatism vs purity and I'm not sure a convenience function is where you want to spend cycles achieving purity. But that's not for me to say, and I appreciate your hard work and trust you'll make the best decision. Thank you.

@illeatmyhat

+1. We have datetime.date objects in our JSON, and while there's nothing wrong with writing a JSON Encoder, it seems not very ideal to have to do
client.get(..., data=json.dumps(..., cls=MyEncoder)) every single time.
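
One way to avoid repeating the encoder at every call site, as a sketch: bind it once with `functools.partial` and route requests through a small helper. `DateEncoder` and `json_body` are hypothetical names used for illustration.

```python
import datetime
import functools
import json


class DateEncoder(json.JSONEncoder):
    """Serialize dates and datetimes as ISO 8601 strings."""

    def default(self, o):
        if isinstance(o, datetime.date):  # also covers datetime.datetime
            return o.isoformat()
        return super().default(o)


# Bind the encoder once instead of passing cls= everywhere.
dumps = functools.partial(json.dumps, cls=DateEncoder)


def json_body(payload):
    """Keyword arguments for a request carrying the encoded payload."""
    return {
        "content": dumps(payload).encode("utf-8"),
        "headers": {"Content-Type": "application/json"},
    }


kwargs = json_body({"day": datetime.date(2024, 2, 14)})
# e.g. client.post(url, **kwargs) with an httpx client
```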

@chbndrhnns

chbndrhnns commented Feb 14, 2024

My use case is sending pydantic models from my test to a webapp using the httpx client. I am currently doing a roundtrip conversion for each params or json argument to get rid of custom types.

@dmig

dmig commented Feb 14, 2024

@chbndrhnns well, your case seems to be simple: https://docs.pydantic.dev/latest/concepts/serialization/#modelmodel_dump_json -- just overload this (or model_dump depending on your needs)

@chbndrhnns

chbndrhnns commented Feb 14, 2024

your case seems to be simple:

Ok, let's take this as a simplified example for my use case:

import httpx
from pydantic import BaseModel


def test():
    class Address(BaseModel):
        zip: str
        street: str
        city: str

    payload = {
        "name": "me",
        "address": Address(zip="0000", street="this street", city="my city")
    }
    _ = httpx.post("http://127.0.0.1:8000/", json=payload)

It fails unless I call model_dump_json() on each value that is not a stdlib type:

E       TypeError: Object of type Address is not JSON serializable
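
A possible workaround for mixed payloads like this one, sketched with the stdlib `default=` hook: anything exposing a `model_dump()` method is converted on the fly. The `Address` class below is a plain stand-in so the snippet runs without pydantic, and `to_jsonable` is a hypothetical helper name.

```python
import json


def to_jsonable(obj):
    """Fallback for json.dumps: convert objects that offer model_dump()."""
    model_dump = getattr(obj, "model_dump", None)
    if callable(model_dump):
        return model_dump()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


class Address:
    """Stand-in for the pydantic model in the example above."""

    def __init__(self, zip, street, city):
        self.zip, self.street, self.city = zip, street, city

    def model_dump(self):
        return {"zip": self.zip, "street": self.street, "city": self.city}


payload = {
    "name": "me",
    "address": Address(zip="0000", street="this street", city="my city"),
}
body = json.dumps(payload, default=to_jsonable).encode("utf-8")
# body can then be sent with content=body and a JSON Content-Type header
```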

@illeatmyhat

illeatmyhat commented Feb 15, 2024

It sounds like you don't own this web app, but normally you should define the payload in pydantic as well and call model_dump() on the root.
Easiest way to solve the problem IMO.
Then if you want some mental gymnastics, you can override the httpx function to call model_dump() on pydantic models, but that may be a step too far for some maintainers.

If that sounds tedious to you, consider that strong typing is generally a compromise: tedious in exchange for being correct.
Pydantic 2 also introduced model_serializer and field_serializer, so you don't have to override JSONEncoder.

@DeoLeung

I've not been sufficiently happy with any of the API proposal so far, and I've essentially been veto'ing them.

Let me nudge something here that could be viable(?)...

client = httpx.Client(request_class=..., response_class=...)

I can explain why I (potentially) like that if needed. Perhaps the design sense will speak for itself.

Having the ability to customize the response class would be great. Is there any schedule for its implementation? :)

@seandstewart

One consideration that I haven't seen proposed - it's entirely reasonable for this library to check if the value of json is already encoded to bytes. In that case, you can skip calling json.dumps here:

httpx/httpx/_content.py

Lines 176 to 181 in 7354ed7

def encode_json(json: Any) -> tuple[dict[str, str], ByteStream]:
    body = json_dumps(json).encode("utf-8")
    content_length = str(len(body))
    content_type = "application/json"
    headers = {"Content-Length": content_length, "Content-Type": content_type}
    return headers, ByteStream(body)

so it could look something like:

def encode_json(json: Any) -> tuple[dict[str, str], ByteStream]:
    body = json if isinstance(json, bytes) else json_dumps(json).encode("utf-8")
    content_length = str(len(body))
    content_type = "application/json"
    headers = {"Content-Length": content_length, "Content-Type": content_type}
    return headers, ByteStream(body)

This would allow developers the ability to handle json encoding and decoding external to the library, so we could do something like this:

import httpx
import orjson

mydata = {...}
with httpx.Client() as client:
    encoded = orjson.dumps(mydata)
    response = client.post("https://fake.url/data/", json=encoded)
    result = orjson.loads(response.content)

I know for a fact aiohttp does something similar, so there is precedent here. (It also allows you to pass in a json encoder and decoder, but we've been down that road here.)

If this seems like a reasonable change, I'm happy to make the requisite PR.

@gtors

gtors commented Apr 19, 2024

Any news? Why not just borrow a solution from aiohttp?

aiohttp.ClientSession(json_serialize=..., ...)

@gtors

gtors commented Apr 19, 2024

🌟 Introducing HTTPJ! 🚀 It's like HTTPX, but with built-in support for flexible JSON serialization/deserialization!

pip install httpj orjson

import datetime
import pprint

import httpj
import orjson


resp = httpj.post(
    "https://postman-echo.com/post",
    json={"dt": datetime.datetime.utcnow()},
    json_serialize=lambda j: orjson.dumps(j, option=orjson.OPT_NAIVE_UTC),  # optional
    json_deserialize=orjson.loads,  # optional
)
pprint.pprint(resp.json(), indent=4)

p.s.: I'm tired of waiting for this feature for more than 4 years...

@seandstewart

I added the following snippet to my client module to allow passing in bytes as json. Not a huge fan of monkey-patching, but it gets the job done.

def _patch_httpx():  # type: ignore
    """Monkey-patch httpx so that we can use our own json ser/des.

    https://github.com/encode/httpx/issues/717
    """
    import httpx  # needed for the httpx._content references below
    from httpx._content import Any, ByteStream, json_dumps

    def encode_json(json: Any) -> tuple[dict[str, str], ByteStream]:
        body = json if isinstance(json, bytes) else json_dumps(json).encode("utf-8")
        content_length = str(len(body))
        content_type = "application/json"
        headers = {"Content-Length": content_length, "Content-Type": content_type}
        return headers, ByteStream(body)

    # This makes the above function look and act like the original.
    encode_json.__globals__.update(httpx._content.__dict__)
    encode_json.__module__ = httpx._content.__name__
    httpx._content.encode_json = encode_json


_patch_httpx()

@tomchristie
Member

I'm tired of waiting for this feature for more than 4 years...

So, here's an API proposal.

Yep, I'll happily help someone get a pull request merged against that proposal.

@q0w

q0w commented May 11, 2024

Does it mean that httpx.Client should now be Generic[RequestType, ResponseType]?

@q0w q0w linked a pull request May 11, 2024 that will close this issue