
Support async file types in files = {} and content = ... #1620

Open
tomchristie opened this issue Apr 30, 2021 · 21 comments
Labels
enhancement New feature or request

Comments

@tomchristie
Member

We ought to support the following cases.

Raw upload content from an async file interface:

import httpx
import trio

async def main():
    async with httpx.AsyncClient() as client:
        async with await trio.open_file(...) as f:
            await client.post("https://www.example.com", content=f)

trio.run(main)

Multipart file upload from an async file interface:

import httpx
import trio

async def main():
    async with httpx.AsyncClient() as client:
        async with await trio.open_file(...) as f:
            await client.post("https://www.example.com", files={"upload": f})

trio.run(main)

We probably want to ensure that we're supporting both trio and anyio (which have the same interfaces), and perhaps also `aiofiles`. So e.g., also supporting the following...

# Supporting the same as above but using `asyncio`, with `anyio` for the file operations.
import anyio
import asyncio
import httpx

async def main():
    async with httpx.AsyncClient() as client:
        async with await anyio.open_file(...) as f:
            await client.post("https://www.example.com", content=f)

asyncio.run(main())

The `content=...` case is a little simpler than the `files=...` case, since it really just needs an async variant of `peek_filelike_length`, and a minor update to the `._content.encode_content()` function.
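
For illustration, here is a minimal sketch of what that async length probe could look like. `apeek_filelike_length` and `FakeAsyncFile` are hypothetical names, not httpx API; the fake wrapper stands in for trio/anyio async file objects, which expose async `tell()`/`seek()`:

```python
import asyncio
import io
import os

class FakeAsyncFile:
    """Stand-in for trio/anyio async file wrappers (async tell/seek)."""
    def __init__(self, raw):
        self._raw = raw

    async def tell(self):
        return self._raw.tell()

    async def seek(self, offset, whence=os.SEEK_SET):
        return self._raw.seek(offset, whence)

async def apeek_filelike_length(afile):
    """Probe total length via async seek/tell, restoring the position."""
    try:
        offset = await afile.tell()
        length = await afile.seek(0, os.SEEK_END)
        await afile.seek(offset)
    except (AttributeError, OSError):
        return None  # unknown length; would fall back to chunked transfer
    return length

async def main():
    return await apeek_filelike_length(FakeAsyncFile(io.BytesIO(b"0123456789")))

print(asyncio.run(main()))  # 10
```

Returning `None` when the probe fails mirrors how an encoder can fall back to a chunked transfer when the body length is unknown.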

It's also fiddly to work out what the type annotations ought to look like.

@tomchristie tomchristie added the enhancement New feature or request label Apr 30, 2021
@Mayst0731
Contributor

Hi, I'm interested in this issue. Looking at the `._content.encode_content()` function, the first thing to do is to work out the types that trio and anyio file objects expose. Could I have a try at getting it done? :D

@Mayst0731
Contributor

Ohhh, as you said, it's a multipart issue; it seems aiohttp handles multipart as well ("https://docs.aiohttp.org/en/stable/multipart.html").

@ajayd-san

Hey @meist0731, are you still working on this issue? If so, can we work on it together? Seems like an interesting problem.

@Mayst0731
Contributor

> Hey @meist0731, are you still working on this issue? If so, can we work on it together? Seems like an interesting problem.

Yep! I'm still working on this. It would be my honor to work with you :DDD I'm going to sleep soon, and tomorrow I'll share the materials I've found so far.

@ajayd-san

ajayd-san commented Jun 11, 2021

@meist0731 cheers, you on Discord? It'll be easier to work together.

@Mayst0731
Contributor

> @meist0731 cheers, you on Discord? It'll be easier to work together. My id: Krunchy_Almond#2794

Gotcha! I have Discord; wait a sec, bro.

@Mayst0731
Contributor

> @meist0731 cheers, you on Discord? It'll be easier to work together. My id: Krunchy_Almond#2794

Hey, I've sent the invitation :D

@ajayd-san

@tomchristie, how do you recommend we proceed with this issue? Can you explain where to start?

@Mayst0731
Contributor

Mayst0731 commented Jun 24, 2021

I've tried these APIs, as below:

import asyncio

import aiofiles
import anyio
import httpx
import trio

async def main1():
    async with httpx.AsyncClient() as client:
        async with await anyio.open_file('./content.txt', 'rb') as f:
            await client.post("https://www.example.com", content=f)
anyio.run(main1)

async def main2():
    async with httpx.AsyncClient() as client:
        async with await trio.Path('./content.txt').open('rb') as f:
            await client.post("https://www.example.com", content=f)
trio.run(main2)

async def main3():
    async with httpx.AsyncClient() as client:
        async with aiofiles.open('./content.txt', mode='rb') as f:
            await client.post("https://www.example.com", content=f)
asyncio.run(main3())

The multipart upload:

async def main5():
    async with httpx.AsyncClient() as client:
        async with await anyio.open_file('./content.txt','rb') as f:
            await client.post("https://www.example.com", files={"upload": f})
anyio.run(main5)

The problems here are:

(1) The above functions only work when the file is opened in 'rb' mode rather than 'r'; otherwise a TypeError is raised saying "sequence item 1: expected a bytes-like object, str found". I haven't figured out which part of the code handles this.

(2) Testing with a text file shows that whether the file is read synchronously or asynchronously via trio, anyio, or aiofiles, the `peek_filelike_length` function reports the file's length correctly. With multipart upload, however, the error says that "AsyncIOWrapper"/"AsyncFile"/"AsyncBufferedReader" (async-iterable objects) are not iterable. This seems to be because the iteration functions involved are sync rather than async: they receive async-iterable objects but cannot iterate them, except for the final one, which is an async function.

For example, one function in the chain accepts AsyncIterable objects but, not being an async function, cannot actually iterate them.
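
The failure mode in (2) is easy to reproduce without httpx. The `AsyncChunks` class below is illustrative, mimicking an async-iterable file wrapper like trio's or anyio's:

```python
import asyncio

class AsyncChunks:
    """Async-iterable only, like AsyncIOWrapper/AsyncFile/AsyncBufferedReader."""
    def __init__(self, chunks):
        self._chunks = chunks

    def __aiter__(self):
        return self._agen()

    async def _agen(self):
        for chunk in self._chunks:
            yield chunk

stream = AsyncChunks([b"part1", b"part2"])

# Sync iteration fails, matching the error seen in the multipart path:
try:
    list(stream)
except TypeError as exc:
    print(exc)  # 'AsyncChunks' object is not iterable

# Async iteration works fine:
async def collect(aiterable):
    return [chunk async for chunk in aiterable]

print(asyncio.run(collect(AsyncChunks([b"part1", b"part2"]))))
```

So any code path that calls `iter()` (or a `for` loop) on these wrappers will fail; the fix has to route them through `__aiter__` instead.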

@Mayst0731
Contributor

The first problem has an existing discussion: #1704 (comment)
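
The TypeError from problem (1) boils down to mixing str and bytes when the multipart body is joined; a minimal stdlib reproduction:

```python
# Joining a str chunk into a bytes body reproduces the exact error message
# from problem (1); the chunk text here is made up for illustration.
try:
    b"".join([b"--boundary", "text chunk read in 'r' mode"])
except TypeError as exc:
    print(exc)  # sequence item 1: expected a bytes-like object, str found
```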

@stale

stale bot commented Feb 20, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Feb 20, 2022
@stale stale bot closed this as completed Mar 5, 2022
@pawamoy
Sponsor

pawamoy commented Jun 16, 2022

Argh, stale, wontfix, nooo 😱 !

Just want to make sure: uploading a file and data as multipart is still not supported by the async client, right? I'm getting the Attempted to send an sync request with an AsyncClient instance. error message when trying to do such a thing.

import httpx
from aiofiles import open as aopen
from uuid import uuid4

async def upload(data_to_send):  # data_to_send defined elsewhere
    async with aopen("somefile.zip", "rb") as fp, httpx.AsyncClient() as client:
        files = {"content": ("somefile.zip", fp, "application/octet-stream")}
        return await client.post(
            "http://localhost:8888",
            data=data_to_send,
            files=files,
            follow_redirects=False,
            headers={"Content-Type": f"multipart/form-data; boundary={uuid4().hex}"},
        )

@florimondmanca
Member

@pawamoy No fix was implemented, AFAIK. This seems like an issue stalebot closed due to inactivity, rather than us deciding it shouldn't be acted upon. I guess we can reopen (stalebot would come back in a few months), and any attempt towards supporting the interfaces described in the OP (trio, anyio, aiofiles) would be welcome!

@reclosedev

reclosedev commented Dec 12, 2022

While the issue is not resolved, I'm using the following monkey-patch; maybe it will be helpful:

"""
This is workaround monkey-patch for https://github.com/encode/httpx/issues/1620

If you need to upload async stream as a multipart `files` argument, you need to apply this patch
and wrap stream with `AsyncStreamWrapper`::

    httpx_monkeypatch.apply()
    ...

    known_size = 42
    stream = await get_async_bytes_iterator_somehow_with_known_size(known_size)
    await client.post(
        'https://www.example.com',
        files={'upload': AsyncStreamWrapper(stream, known_size)},
    )
"""
import typing as t
from asyncio import StreamReader

from httpx import _content
from httpx._multipart import FileField
from httpx._multipart import MultipartStream
from httpx._types import RequestFiles


class AsyncStreamWrapper:
    def __init__(self, stream: t.Union[t.AsyncIterator[bytes], StreamReader], size: int):
        self.stream = stream
        self.size = size


class AsyncAwareMultipartStream(MultipartStream):

    def __init__(self, data: dict, files: RequestFiles, boundary: t.Optional[bytes] = None) -> None:
        super().__init__(data, files, boundary)
        for field in self.fields:
            if isinstance(field, FileField) and isinstance(field.file, AsyncStreamWrapper):
                field.get_length = lambda f=field: len(f.render_headers()) + f.file.size  # type: ignore # noqa: E501

    async def __aiter__(self) -> t.AsyncIterator[bytes]:
        for field in self.fields:
            yield b'--%s\r\n' % self.boundary
            if isinstance(field, FileField) and isinstance(field.file, AsyncStreamWrapper):
                yield field.render_headers()
                async for chunk in field.file.stream:
                    yield chunk
            else:
                for chunk in field.render():
                    yield chunk
            yield b'\r\n'
        yield b'--%s--\r\n' % self.boundary


def apply():
    _content.MultipartStream = AsyncAwareMultipartStream
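
To see the core idea of the patch's `__aiter__` without httpx, here is a self-contained sketch (the boundary value and field name are made up) that interleaves sync-rendered headers with async-iterated file chunks:

```python
import asyncio

BOUNDARY = b"boundary123"  # made-up boundary for the sketch

async def file_chunks():
    # Stands in for an async file's chunk stream.
    for chunk in (b"hello ", b"world"):
        yield chunk

async def render_multipart(stream):
    parts = [
        b"--" + BOUNDARY + b"\r\n",
        b'Content-Disposition: form-data; name="upload"\r\n\r\n',
    ]
    async for chunk in stream:  # the async-aware part of the patch
        parts.append(chunk)
    parts.append(b"\r\n--" + BOUNDARY + b"--\r\n")
    return b"".join(parts)

body = asyncio.run(render_multipart(file_chunks()))
print(b"hello world" in body)  # True
```

The real patch yields these pieces lazily instead of joining them, so large files are never buffered whole.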

@and3rson

Has there been any progress on this, by any chance?

@lambdaq

lambdaq commented May 19, 2023

I am also using files={"upload": f} where f is a multipart async file upload from FastAPI.

It says TypeError: object of type 'coroutine' has no len(), which kills me. The file is quite large, so I hope it gets handled in a streaming way.

@tomchristie
Member Author

If anyone is invested in making this happen, I can make the time to guide a pull request through.

@lambdaq

lambdaq commented Jun 7, 2023

> I am also using files={"upload": f} where f is a multipart async file upload from FastAPI.

I solved this problem for FastAPI. When reading an uploaded file from a form, FastAPI wraps a SpooledTemporaryFile in an async-style interface.

To access the file with httpx, the async wrapper doesn't fit, but you can use the old-fashioned way. Just change

httpx.post(..., files={"upload": f})

into

httpx.post(..., files={"upload": f.file})
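
This works because FastAPI's UploadFile.file attribute exposes the underlying sync SpooledTemporaryFile, which httpx can already stream. A stdlib-only illustration (no FastAPI or httpx needed; the payload and filename are made up):

```python
from tempfile import SpooledTemporaryFile

# Simulate the sync file that FastAPI's UploadFile wraps.
spooled = SpooledTemporaryFile()
spooled.write(b"large upload payload")
spooled.seek(0)

# Passing the sync object is what `files={"upload": f.file}` amounts to:
files = {"upload": ("upload.bin", spooled, "application/octet-stream")}
print(files["upload"][1].read())  # b'large upload payload'
```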

@yayahuman

A monkey patch showing a possible solution (it also covers #1706 and #2399):

https://gist.github.com/yayahuman/db06718ffdf8a9b66e133e29d7d7965f

And possible type annotations:

from abc import abstractmethod
from typing import AnyStr, AsyncIterable, Iterable, Protocol, Union  # 3.8+


class Reader(Protocol[AnyStr]):
    __slots__ = ()
    
    @abstractmethod
    def read(self, size: int = -1) -> AnyStr:
        raise NotImplementedError


class AsyncReader(Protocol[AnyStr]):
    __slots__ = ()
    
    @abstractmethod
    async def read(self, size: int = -1) -> AnyStr:
        raise NotImplementedError


FileContent = Union[
    str,
    bytes,
    Iterable[str],
    Iterable[bytes],
    AsyncIterable[str],
    AsyncIterable[bytes],
    Reader[str],
    Reader[bytes],
    AsyncReader[str],
    AsyncReader[bytes],
]

RequestContent = FileContent
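
As a sanity check on these annotations, here is a minimal class satisfying `AsyncReader[bytes]`; `BytesAsyncReader` and `drain` are illustrative, not proposed API:

```python
import asyncio
import io

class BytesAsyncReader:
    """Minimal object satisfying the AsyncReader[bytes] protocol above."""
    def __init__(self, data):
        self._buf = io.BytesIO(data)

    async def read(self, size=-1):
        return self._buf.read(size)

async def drain(reader):
    # Read in small pieces until the reader is exhausted.
    chunks = []
    while chunk := await reader.read(4):
        chunks.append(chunk)
    return b"".join(chunks)

print(asyncio.run(drain(BytesAsyncReader(b"hello world"))))  # b'hello world'
```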

@yayahuman

@tomchristie, would my monkey-patch approach be acceptable?

@tomchristie
Copy link
Member Author

Let me help guide this conversation a bit more clearly.
I would probably suggest starting by just looking at the content=... case.
A good starting point for a pull request would be a test case for that one case, which demonstrates the behaviour we'd like to see.
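
A self-contained sketch of the behaviour such a test would pin down, using a stand-in `encode_async_content` helper (a hypothetical name, not httpx internals):

```python
import asyncio
from typing import AsyncIterable

async def encode_async_content(content: AsyncIterable[bytes]) -> bytes:
    # Drain the async stream the way a request-body encoder would.
    body = bytearray()
    async for chunk in content:
        body.extend(chunk)
    return bytes(body)

async def async_file_chunks():
    # Stands in for chunks read from `await trio.open_file(...)`.
    yield b"Hello, "
    yield b"world!"

assert asyncio.run(encode_async_content(async_file_chunks())) == b"Hello, world!"
```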
