Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow setting an explicit multipart boundary via headers #2278

Merged
merged 21 commits into from Aug 15, 2022

Conversation

adriangb
Copy link
Member

Closes #2235

Comment on lines 75 to 100
@pytest.mark.parametrize(
"header",
[
"multipart/form-data; charset=utf-8",
"multipart/form-data; charset=utf-8; ",
],
)
def test_multipart_header_without_boundary(header: str) -> None:
client = httpx.Client(transport=httpx.MockTransport(echo_request_content))

files = {"file": io.BytesIO(b"<file content>")}
headers = {"content-type": header}
response = client.post("http://127.0.0.1:8000/", files=files, headers=headers)
assert response.status_code == 200

# We're using the cgi module to verify the behavior here, which is a
# bit grungy, but sufficient just for our testing purposes.
boundary = response.request.headers["Content-Type"].split("boundary=")[-1]
content_length = response.request.headers["Content-Length"]
pdict: dict = {
"boundary": boundary.encode("ascii"),
"CONTENT-LENGTH": content_length,
}
multipart = cgi.parse_multipart(io.BytesIO(response.content), pdict)

assert multipart["file"] == [b"<file content>"]
Copy link
Member Author

@adriangb adriangb Jun 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is fails on main/master as well, but I think it would be important to fix in this PR. What needs to happen is we need to create the random boundary and then insert that into the existing header. The issue is that there is currently no machinery / precedent for encode_request() modifying headers, so it would take a bit more refactoring to get working. Alternatively, we can raise an error if a user gives us a content-type: multipart/form-data without an explicit header?

@@ -187,6 +198,7 @@ def encode_request(
files: Optional[RequestFiles] = None,
json: Optional[Any] = None,
boundary: Optional[bytes] = None,
headers: Optional[Mapping[str, str]] = None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we keep things a little tighter here by passing content_type: str = None instead? That restricts the information we're passing around to the one thing we're actually interested in.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup good call

for section in content_type.split(";"):
if section.strip().startswith("boundary="):
return section.strip().split("boundary=")[-1].encode("latin-1")
raise ValueError("Missing boundary in multipart/form-data content-type header")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of raising ValueError we could just return None if the content type doesn't start with "multipart/form-data", or doesn't include a valid boundary.

(Users can currently submit requests with Content-Type headers that don't properly match up to the data they include, so just being lax here seems okay to me.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that since content-type gets set via setdefault. If it already exists, it won't be overwritten. So in the case where the user provided the content-type header but did not provide the multipart boundary we'd be sending out a request with no multipart boundary, which completely violates the spec and no server will be able to parse.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in the case where the user provided the content-type header but did not provide the multipart boundary we'd be sending out a request with no multipart boundary, which completely violates the spec and no server will be able to parse.

True. We could either hard-error in that case, or just ignore it and allow users to do that.

There is also a whole class of cases here where a user can set a Content-Type that doesn't match the encoding type they're using with data=/files=/json=. (Eg. they can set Content-Type: application/json on either form or multipart requests.) But that's probably okay.

httpx/_content.py Outdated Show resolved Hide resolved
Copy link
Member

@tomchristie tomchristie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good!

@adriangb
Copy link
Member Author

adriangb commented Jul 2, 2022

@tomchristie this is ready for another review whenever you get a chance

@adriangb adriangb requested a review from a team August 13, 2022 19:41
httpx/_content.py Outdated Show resolved Hide resolved
httpx/_models.py Outdated Show resolved Hide resolved
httpx/_multipart.py Outdated Show resolved Hide resolved
httpx/_multipart.py Outdated Show resolved Hide resolved
Co-authored-by: Jean Hominal <jhominal@gmail.com>
Copy link
Member

@jhominal jhominal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that is good for me.

Thank you for your patience!

@adriangb
Copy link
Member Author

Thank you for your patience!

Thank you for your thorough and timely reviews!

@adriangb adriangb merged commit 1526048 into encode:master Aug 15, 2022
@adriangb adriangb deleted the explicit-boundry branch August 15, 2022 15:20
@tomchristie tomchristie mentioned this pull request Nov 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants