Change fields.py to use HTML5 non-ASCII File Names by Default instead of RFC2231 for encoding #1492

Merged: 15 commits merged on Mar 23, 2019

Conversation

Robbt
Contributor

@Robbt Robbt commented Dec 3, 2018

This is a re-open of #856 and basically just a merge of the work that Spryttan did with the current HEAD so that it can be tested and hopefully merged after review.

I haven't spent the time fully reading the different RFCs and protocol docs. I arrived at this because a module I was writing was failing on filenames that contained UTF8, so I tested this change and it resolved the problem we were experiencing. I noticed it is failing one test, but I think the test needs updating, as it is trying to confirm that requests.py is converting a field to RFC2231 vs. just UTF8 encoding. I think the tests will pass when that is updated.

I understand that previously there was some concern about merging this without notice because it would mean a fundamental change in the way fields are encoded, but it seems like that never happened. I'm happy to try to steward this because I think it will fix a number of problems that people using the Requests library, which builds upon urllib3, have run into. Let me know what I need to do.

Closes #303

James Elmendorf and others added 5 commits May 3, 2016 16:40
…e HTML5 working draft, and made this the default.

Support for RFC 2231-style encoding remains. Which style encoding to use
can be selected by setting `filename_encoding_style` when creating a
`RequestField` (by `__init__` or by `from_tuples`).
Made sure the `format_header_param_*` methods returned unicode strings
on all Python versions on all code paths.

Added a few more comments, including "Must be unicode." to
`RequestField.__init__`. This was to match the requirement of
`from_tuples` which requires field names and file names to be unicode.
@sethmlarson
Member

I know @sigmavirus24 had some thoughts on the previous PR so just CC-ing in case you'd like to weigh in on this change.

@sethmlarson sethmlarson self-requested a review December 3, 2018 20:05
@Robbt
Contributor Author

Robbt commented Dec 3, 2018

I think that the tests need to be updated to address how this works but I think it would make sense for someone else to review this so I don't just rewrite the test to match the current behaviour. For the first failure I do think that it is checking to see if the field is in the old RFC2231 format that this removes.

The second one I'm less clear on.

@theacodes
Member

Hi folks! This PR has been open for a while. Is this still something we should do? @sigmavirus24, you were identified as the subject-matter expert; do you have time to review, or should I?

@Robbt
Contributor Author

Robbt commented Jan 22, 2019

I'd be happy to refine it further with review. Personally I found that RFC2231 only was resulting in errors in my project and a number of other projects that rely upon requests but I'm also no subject expert. I just had a brief deep dive while trying to figure out an issue with a module I was developing for my project. I can take a look at modifying the tests because I think they just aren't written to test the new expected values but I'd like someone else to chime in for sure.

@theacodes
Member

theacodes commented Jan 22, 2019 via email

@sigmavirus24
Contributor

Yeah, I know a bit too much about 2231 but I'm far more grumpy that predominantly US-developed frameworks have such terrible support for accepting 2231 encoded filenames given the spec is well over a decade old at this point. People just assume everything is ascii and the HTML5 spec has about as much adoption server-side as 2231 did last I checked, so we're trading one poorly supported thing for another. 🤷‍♂️

@sethmlarson
Member

@sigmavirus24 I might not be seeing this correctly but does this only impact requests made with encode_multipart=True? If this is not true disregard below, I haven't given this more than 2 minutes of investigation.

What are your thoughts on making it configurable, similar to how our multipart boundary can be configured as part of RequestMethods.request_encode_body, and defaulting to RFC2231? Something like multipart_field_format='rfc2231' or 'html5'?

@Robbt
Contributor Author

Robbt commented Jan 22, 2019

I think that making it an option that can be configured is probably the best of both worlds. I also understand that we might not want to switch the default behaviour to HTML5 right off the bat and could just add the option to use HTML5. Unfortunately I didn't do the work on this PR in the first place; I just revived it from a previously abandoned PR because I needed this functionality to get requests to work for me. I'd like to do what I can to make this an option for other people who I saw were running into this issue, but I'm not an expert on the requests framework.

@sigmavirus24
Contributor

> I might not be seeing this correctly but does this only impact requests made with encode_multipart=True?

You are correct. Specifically, multipart/form-data uses the same header encoding strategy as email for each part. And the proper way to do this for email clients is to use RFC 2231.

> What are your thoughts on making it configurable, similar to how our multipart boundary can be configured as part of RequestMethods.request_encode_body, and defaulting to RFC2231? Something like multipart_field_format='rfc2231' or 'html5'?

To be clear, my concern isn't backwards compatibility, just the lack of certainty that this makes things any better for people. Exposing this as an option would require extra work in requests to utilize that or to pass the configuration options on to users. [1] As a result, if this is optional and not the default, Requests users won't get it for free.

[1] Unfortunately, our multipart API is already atrocious so adding more to it for this would be a non-starter for me.

@Robbt
Contributor Author

Robbt commented Jan 22, 2019

I haven't studied the RFCs in depth but the discussion in #303 seems valid, and it looks like this was first attempted in #304, when RFC 7578 was still a draft; as of 2015 it has rendered RFC2388 obsolete. If I'm not mistaken the usage of RFC2231 encoding was done in accordance with RFC2388.

If we aren't worried about backwards compatibility then it would seem like going with the new standard method of multipart encoding of filenames would make the most sense. From my reasoning it is unlikely that anyone will be adding support for a 20+ year old standard that has been rendered obsolete, whereas I know from various bug reports that a number of people have had issues with the way UTF8 filenames are encoded.

I could understand if there were counter-examples where a server would only work with RFC2231 and not support the newer standard but moving forward I think it would make sense to support the newest standard. Like I said before I can't totally vouch that this code does everything it needs to do at this point but it did solve the problem I ran into and so I'm willing to help try to hash it out so others can benefit from this change. I have contributed far less and have far less skin in the game than anyone else here at this point so my interest is just trying to figure out the best technical approach.

If this isn't going to be merged I think that I'll have to revisit the curl script that someone wrote to accomplish the task I was trying to do but I'd really rather accomplish the whole task in python3 as the code is already written but I don't want to distribute a module that requires someone to use my non-standard repository for basic functionality.

@theacodes
Member

theacodes commented Mar 11, 2019

Okay, I'm dragging this back up as these unicode issues seem to still be popping up for users.

Broadly speaking, I'm in favor of us matching browser behavior, especially Firefox. If there is a server out there that can't read files uploaded using multipart/form-data from a web browser it is broken beyond our help, as multipart/form-data literally exists for the benefit of browsers.

@sigmavirus24 can you help me understand your reservations beyond just not being sure that the HTML5 encoding scheme will fix all problems? Do we have counter-examples where using it would be problematic? (e.g., if you know already that Flask or Django outright chokes on the HTML5 format, that would be a great datapoint for us)

Otherwise, I'd like to move forward with a slightly modified version of this:

  1. We default to html5 encoding.
  2. We allow changing the encoding scheme in code.
  3. We allow changing the default back to RFC2231 using an environment variable, such as URLLIB3_USE_RFC2231_BY_DEFAULT for the benefit of Requests users that might run into issues with the new default.

Thoughts? @sethmlarson?

@sethmlarson
Member

I'm in favor of doing what browsers find to be best. Wonder what curl does, probably propagates whatever you give it? I'm +1 to what you've mentioned.

@theacodes
Member

Great, let's wait to hear from @sigmavirus24. If he's on board, I can take on bringing this PR into the proposed state.

@theacodes
Member

Okay, here's a comparison of httpie (which uses Requests) and curl.

httpie

Here's the command:

http -v -f POST https://httpbin.org/post αλήθεια@αλήθεια.txt

As expected, it uses RFC2231:

http -v -f POST https://httpbin.org/post αλήθεια@αλήθεια.txt
POST /post HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Content-Length: 263
Content-Type: multipart/form-data; boundary=095b687d6199b2436196354e281e2e8e
Host: httpbin.org
User-Agent: HTTPie/1.0.2

--095b687d6199b2436196354e281e2e8e
Content-Disposition: form-data; name*=utf-8''%CE%B1%CE%BB%CE%AE%CE%B8%CE%B5%CE%B9%CE%B1; filename*=utf-8''%CE%B1%CE%BB%CE%AE%CE%B8%CE%B5%CE%B9%CE%B1.txt
Content-Type: text/plain

Meep

--095b687d6199b2436196354e281e2e8e--

HTTP/1.1 200 OK
Access-Control-Allow-Credentials: true
Access-Control-Allow-Origin: *
Connection: keep-alive
Content-Encoding: gzip
Content-Length: 304
Content-Type: application/json
Date: Mon, 11 Mar 2019 02:39:57 GMT
Server: nginx

{
    "args": {},
    "data": "",
    "files": {
        "αλήθεια": "Meep\n"
    },
    "form": {},
    "headers": {
        "Accept": "*/*",
        "Accept-Encoding": "gzip, deflate",
        "Content-Length": "263",
        "Content-Type": "multipart/form-data; boundary=095b687d6199b2436196354e281e2e8e",
        "Host": "httpbin.org",
        "User-Agent": "HTTPie/1.0.2"
    },
    "json": null,
    "origin": "104.232.115.83, 104.232.115.83",
    "url": "https://httpbin.org/post"
}

Relevant line:

Content-Disposition: form-data; name*=utf-8''%CE%B1%CE%BB%CE%AE%CE%B8%CE%B5%CE%B9%CE%B1; filename*=utf-8''%CE%B1%CE%BB%CE%AE%CE%B8%CE%B5%CE%B9%CE%B1.txt
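
For readers who want to see where that header value comes from, here is a minimal sketch of RFC 2231-style parameter formatting. The helper name `format_param_rfc2231` is hypothetical and is not urllib3's implementation; it only assumes that non-ASCII values are UTF-8 percent-encoded into the extended `name*=utf-8''...` form while ASCII values stay in the plain quoted form.

```python
from urllib.parse import quote

# "αλήθεια", written with escapes so the code points are unambiguous.
NAME = "\u03b1\u03bb\u03ae\u03b8\u03b5\u03b9\u03b1"


def format_param_rfc2231(name: str, value: str) -> str:
    """Hypothetical helper: format one header parameter, RFC 2231 style."""
    if value.isascii():
        # Plain ASCII values can stay in the simple quoted form.
        return '%s="%s"' % (name, value)
    # Non-ASCII: extended form, UTF-8 bytes percent-encoded.
    return "%s*=utf-8''%s" % (name, quote(value, safe=""))


header = "form-data; " + "; ".join(
    format_param_rfc2231(key, val)
    for key, val in [("name", NAME), ("filename", NAME + ".txt")]
)
print(header)
```

Under those assumptions the printed value matches the `Content-Disposition` value in the httpie trace above.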

curl

Curl apparently uses HTML5 encoding.

Here's the command:

curl --trace - -F "αλήθεια=@αλήθεια.txt" https://httpbin.org/post

And the output:

curl --trace - -F "αλήθεια=@αλήθεια.txt" https://httpbin.org/post
== Info:   Trying 3.85.154.144...
== Info: Connected to httpbin.org (3.85.154.144) port 443 (#0)
== Info: found 173 certificates in /etc/ssl/certs/ca-certificates.crt
== Info: found 694 certificates in /etc/ssl/certs
== Info: ALPN, offering http/1.1
== Info: SSL connection using TLS1.2 / ECDHE_RSA_AES_128_GCM_SHA256
== Info:         server certificate verification OK
== Info:         server certificate status verification SKIPPED
== Info:         common name: httpbin.org (matched)
== Info:         server certificate expiration date OK
== Info:         server certificate activation date OK
== Info:         certificate public key: RSA
== Info:         certificate version: #3
== Info:         subject: CN=httpbin.org
== Info:         start date: Sun, 17 Feb 2019 00:00:00 GMT
== Info:         expire date: Tue, 17 Mar 2020 12:00:00 GMT
== Info:         issuer: C=US,O=Amazon,OU=Server CA 1B,CN=Amazon
== Info:         compression: NULL
== Info: ALPN, server did not agree to a protocol
=> Send header, 209 bytes (0xd1)
0000: 50 4f 53 54 20 2f 70 6f 73 74 20 48 54 54 50 2f POST /post HTTP/
0010: 31 2e 31 0d 0a 48 6f 73 74 3a 20 68 74 74 70 62 1.1..Host: httpb
0020: 69 6e 2e 6f 72 67 0d 0a 55 73 65 72 2d 41 67 65 in.org..User-Age
0030: 6e 74 3a 20 63 75 72 6c 2f 37 2e 34 37 2e 30 0d nt: curl/7.47.0.
0040: 0a 41 63 63 65 70 74 3a 20 2a 2f 2a 0d 0a 43 6f .Accept: */*..Co
0050: 6e 74 65 6e 74 2d 4c 65 6e 67 74 68 3a 20 32 31 ntent-Length: 21
0060: 31 0d 0a 45 78 70 65 63 74 3a 20 31 30 30 2d 63 1..Expect: 100-c
0070: 6f 6e 74 69 6e 75 65 0d 0a 43 6f 6e 74 65 6e 74 ontinue..Content
0080: 2d 54 79 70 65 3a 20 6d 75 6c 74 69 70 61 72 74 -Type: multipart
0090: 2f 66 6f 72 6d 2d 64 61 74 61 3b 20 62 6f 75 6e /form-data; boun
00a0: 64 61 72 79 3d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d dary=-----------
00b0: 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 34 62 32 -------------4b2
00c0: 38 37 64 37 30 38 62 66 63 62 34 36 33 0d 0a 0d 87d708bfcb463...
00d0: 0a                                              .
<= Recv header, 23 bytes (0x17)
0000: 48 54 54 50 2f 31 2e 31 20 31 30 30 20 43 6f 6e HTTP/1.1 100 Con
0010: 74 69 6e 75 65 0d 0a                            tinue..
=> Send data, 158 bytes (0x9e)
0000: 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d ----------------
0010: 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 34 62 32 38 37 64 ----------4b287d
0020: 37 30 38 62 66 63 62 34 36 33 0d 0a 43 6f 6e 74 708bfcb463..Cont
0030: 65 6e 74 2d 44 69 73 70 6f 73 69 74 69 6f 6e 3a ent-Disposition:
0040: 20 66 6f 72 6d 2d 64 61 74 61 3b 20 6e 61 6d 65  form-data; name
0050: 3d 22 ce b1 ce bb ce ae ce b8 ce b5 ce b9 ce b1 ="..............
0060: 22 3b 20 66 69 6c 65 6e 61 6d 65 3d 22 ce b1 ce "; filename="...
0070: bb ce ae ce b8 ce b5 ce b9 ce b1 2e 74 78 74 22 ............txt"
0080: 0d 0a 43 6f 6e 74 65 6e 74 2d 54 79 70 65 3a 20 ..Content-Type:
0090: 74 65 78 74 2f 70 6c 61 69 6e 0d 0a 0d 0a       text/plain....
=> Send data, 5 bytes (0x5)
0000: 4d 65 65 70 0a                                  Meep.
=> Send data, 48 bytes (0x30)
0000: 0d 0a 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d ..--------------
0010: 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 34 62 32 38 ------------4b28
0020: 37 64 37 30 38 62 66 63 62 34 36 33 2d 2d 0d 0a 7d708bfcb463--..
<= Recv header, 17 bytes (0x11)
0000: 48 54 54 50 2f 31 2e 31 20 32 30 30 20 4f 4b 0d HTTP/1.1 200 OK.
0010: 0a                                              .
<= Recv header, 40 bytes (0x28)
0000: 41 63 63 65 73 73 2d 43 6f 6e 74 72 6f 6c 2d 41 Access-Control-A
0010: 6c 6c 6f 77 2d 43 72 65 64 65 6e 74 69 61 6c 73 llow-Credentials
0020: 3a 20 74 72 75 65 0d 0a                         : true..
<= Recv header, 32 bytes (0x20)
0000: 41 63 63 65 73 73 2d 43 6f 6e 74 72 6f 6c 2d 41 Access-Control-A
0010: 6c 6c 6f 77 2d 4f 72 69 67 69 6e 3a 20 2a 0d 0a llow-Origin: *..
<= Recv header, 32 bytes (0x20)
0000: 43 6f 6e 74 65 6e 74 2d 54 79 70 65 3a 20 61 70 Content-Type: ap
0010: 70 6c 69 63 61 74 69 6f 6e 2f 6a 73 6f 6e 0d 0a plication/json..
<= Recv header, 37 bytes (0x25)
0000: 44 61 74 65 3a 20 4d 6f 6e 2c 20 31 31 20 4d 61 Date: Mon, 11 Ma
0010: 72 20 32 30 31 39 20 30 32 3a 34 31 3a 35 32 20 r 2019 02:41:52
0020: 47 4d 54 0d 0a                                  GMT..
<= Recv header, 15 bytes (0xf)
0000: 53 65 72 76 65 72 3a 20 6e 67 69 6e 78 0d 0a    Server: nginx..
<= Recv header, 21 bytes (0x15)
0000: 43 6f 6e 74 65 6e 74 2d 4c 65 6e 67 74 68 3a 20 Content-Length:
0010: 34 35 35 0d 0a                                  455..
<= Recv header, 24 bytes (0x18)
0000: 43 6f 6e 6e 65 63 74 69 6f 6e 3a 20 6b 65 65 70 Connection: keep
0010: 2d 61 6c 69 76 65 0d 0a                         -alive..
<= Recv header, 2 bytes (0x2)
0000: 0d 0a                                           ..
<= Recv data, 455 bytes (0x1c7)
0000: 7b 0a 20 20 22 61 72 67 73 22 3a 20 7b 7d 2c 20 {.  "args": {},
0010: 0a 20 20 22 64 61 74 61 22 3a 20 22 22 2c 20 0a .  "data": "", .
0020: 20 20 22 66 69 6c 65 73 22 3a 20 7b 0a 20 20 20   "files": {.
0030: 20 22 5c 75 30 33 62 31 5c 75 30 33 62 62 5c 75  "\u03b1\u03bb\u
0040: 30 33 61 65 5c 75 30 33 62 38 5c 75 30 33 62 35 03ae\u03b8\u03b5
0050: 5c 75 30 33 62 39 5c 75 30 33 62 31 22 3a 20 22 \u03b9\u03b1": "
0060: 4d 65 65 70 5c 6e 22 0a 20 20 7d 2c 20 0a 20 20 Meep\n".  }, .
0070: 22 66 6f 72 6d 22 3a 20 7b 7d 2c 20 0a 20 20 22 "form": {}, .  "
0080: 68 65 61 64 65 72 73 22 3a 20 7b 0a 20 20 20 20 headers": {.
0090: 22 41 63 63 65 70 74 22 3a 20 22 2a 2f 2a 22 2c "Accept": "*/*",
00a0: 20 0a 20 20 20 20 22 43 6f 6e 74 65 6e 74 2d 4c  .    "Content-L
00b0: 65 6e 67 74 68 22 3a 20 22 32 31 31 22 2c 20 0a ength": "211", .
00c0: 20 20 20 20 22 43 6f 6e 74 65 6e 74 2d 54 79 70     "Content-Typ
00d0: 65 22 3a 20 22 6d 75 6c 74 69 70 61 72 74 2f 66 e": "multipart/f
00e0: 6f 72 6d 2d 64 61 74 61 3b 20 62 6f 75 6e 64 61 orm-data; bounda
00f0: 72 79 3d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d ry=-------------
0100: 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 34 62 32 38 37 -----------4b287
0110: 64 37 30 38 62 66 63 62 34 36 33 22 2c 20 0a 20 d708bfcb463", .
0120: 20 20 20 22 48 6f 73 74 22 3a 20 22 68 74 74 70    "Host": "http
0130: 62 69 6e 2e 6f 72 67 22 2c 20 0a 20 20 20 20 22 bin.org", .    "
0140: 55 73 65 72 2d 41 67 65 6e 74 22 3a 20 22 63 75 User-Agent": "cu
0150: 72 6c 2f 37 2e 34 37 2e 30 22 0a 20 20 7d 2c 20 rl/7.47.0".  },
0160: 0a 20 20 22 6a 73 6f 6e 22 3a 20 6e 75 6c 6c 2c .  "json": null,
0170: 20 0a 20 20 22 6f 72 69 67 69 6e 22 3a 20 22 31  .  "origin": "1
0180: 30 34 2e 32 33 32 2e 31 31 35 2e 38 33 2c 20 31 04.232.115.83, 1
0190: 30 34 2e 32 33 32 2e 31 31 35 2e 38 33 22 2c 20 04.232.115.83",
01a0: 0a 20 20 22 75 72 6c 22 3a 20 22 68 74 74 70 73 .  "url": "https
01b0: 3a 2f 2f 68 74 74 70 62 69 6e 2e 6f 72 67 2f 70 ://httpbin.org/p
01c0: 6f 73 74 22 0a 7d 0a                            ost".}.
{
  "args": {},
  "data": "",
  "files": {
    "\u03b1\u03bb\u03ae\u03b8\u03b5\u03b9\u03b1": "Meep\n"
  },
  "form": {},
  "headers": {
    "Accept": "*/*",
    "Content-Length": "211",
    "Content-Type": "multipart/form-data; boundary=------------------------4b287d708bfcb463",
    "Host": "httpbin.org",
    "User-Agent": "curl/7.47.0"
  },
  "json": null,
  "origin": "104.232.115.83, 104.232.115.83",
  "url": "https://httpbin.org/post"
}
== Info: Connection #0 to host httpbin.org left intact

Relevant section:

=> Send data, 158 bytes (0x9e)
0000: 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d ----------------
0010: 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 34 62 32 38 37 64 ----------4b287d
0020: 37 30 38 62 66 63 62 34 36 33 0d 0a 43 6f 6e 74 708bfcb463..Cont
0030: 65 6e 74 2d 44 69 73 70 6f 73 69 74 69 6f 6e 3a ent-Disposition:
0040: 20 66 6f 72 6d 2d 64 61 74 61 3b 20 6e 61 6d 65  form-data; name
0050: 3d 22 ce b1 ce bb ce ae ce b8 ce b5 ce b9 ce b1 ="..............
0060: 22 3b 20 66 69 6c 65 6e 61 6d 65 3d 22 ce b1 ce "; filename="...
0070: bb ce ae ce b8 ce b5 ce b9 ce b1 2e 74 78 74 22 ............txt"
0080: 0d 0a 43 6f 6e 74 65 6e 74 2d 54 79 70 65 3a 20 ..Content-Type:
0090: 74 65 78 74 2f 70 6c 61 69 6e 0d 0a 0d 0a       text/plain....

Where the byte pair ce b1 (at offset 0x52 in the field name and 0x6d in the filename) marks the start of the UTF-8 encoded string αλήθεια
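
As a rough illustration of what curl is doing here, the sketch below builds the same `Content-Disposition` line in the HTML5 style, where the value is left as raw text and simply UTF-8 encoded on the wire. `format_param_html5` is a hypothetical helper written for this example, not curl's or urllib3's code, and it ignores escaping of special characters.

```python
# "αλήθεια", written with escapes so the code points are unambiguous.
NAME = "\u03b1\u03bb\u03ae\u03b8\u03b5\u03b9\u03b1"


def format_param_html5(name: str, value: str) -> str:
    """Hypothetical helper: HTML5-style parameter, value kept as raw text."""
    return '%s="%s"' % (name, value)


line = "Content-Disposition: " + "; ".join(
    [
        "form-data",
        format_param_html5("name", NAME),
        format_param_html5("filename", NAME + ".txt"),
    ]
)
wire = line.encode("utf-8")

# Index of the first non-ASCII byte, i.e. the start of the field name.
first = next(i for i, byte in enumerate(wire) if byte > 0x7F)
print(first, wire[first:first + 2].hex())
```

The first non-ASCII bytes are the UTF-8 encoding of α, which is what shows up as `ce b1` in the hex dump.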

@sigmavirus24
Contributor

> (e.g., if you know already that Flask or Django outright chokes on the HTML5 format, that would be a great datapoint for us)

I don't know that already. I'm just familiar with the ASCII-centric view of most WSGI/Rack/etc based frameworks and the fact that they flat out don't keep up with decades old web standards, let alone those released a handful of years ago.

I'm not certain this will break things that aren't already broken either so :Shrug:

As for using environment variables to control behaviour, that way lies dragons from my experiences in Requests. It can be a great control rod for "power" users but I'm not sure this is a control rod we need right now. Besides, it's easier to say no today and yes tomorrow.

@theacodes
Member

theacodes commented Mar 11, 2019 via email

@sethmlarson
Member

sethmlarson commented Mar 11, 2019

@theacodes I might be wrong but sigma may have wanted that comment to apply to the environment variable controlling behavior instead of the general idea of using HTML5 encoding.

Correct me if I'm wrong, that's how I understood your comment @sigmavirus24

@theacodes
Member

theacodes commented Mar 11, 2019 via email

@shazow
Member

shazow commented Mar 11, 2019

+1 to making the encoding pluggable (maybe we can provide an encoder/decoder to override the default; we can even provide the two encoder options out of the box and consumers can choose whichever downstream), but -0.5 to making it controlled by an environment variable.

I feel env-based internal knobs in libraries produce unexpected precedence bugs, and it's hard to draw a line of what should be in an env var vs not, and also makes it hard to deprecate noticeably. IMO env knobs make more sense in end-user applications, rather than libraries.

FWIW, "breaking things that are already broken" at least puts us on the right side of history in the long run vs perpetuating broken behaviour. :) I generally vote for some acute short term pain in favour of progress. It's also usually the less burdensome path for maintainers.
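
To make the pluggable idea concrete, here is a small sketch of a field class that takes its encoder as a constructor argument and ships two encoders out of the box. Everything here (`FormField`, `encoder`, `encode_html5`, `encode_rfc2231`) is a hypothetical name chosen for illustration; it is not the API that urllib3 exposes.

```python
from typing import Callable
from urllib.parse import quote

# An encoder takes (parameter name, value) and returns one header parameter.
Encoder = Callable[[str, str], str]


def encode_html5(name: str, value: str) -> str:
    """Keep the value as raw text inside quotes (no escaping in this sketch)."""
    return '%s="%s"' % (name, value)


def encode_rfc2231(name: str, value: str) -> str:
    """Use the extended percent-encoded form for non-ASCII values."""
    if value.isascii():
        return '%s="%s"' % (name, value)
    return "%s*=utf-8''%s" % (name, quote(value, safe=""))


class FormField:
    """Hypothetical stand-in for a multipart form field."""

    def __init__(self, name: str, filename: str, encoder: Encoder = encode_html5):
        self.name = name
        self.filename = filename
        self.encoder = encoder  # chosen in code, not via the environment

    def content_disposition(self) -> str:
        parts = [
            "form-data",
            self.encoder("name", self.name),
            self.encoder("filename", self.filename),
        ]
        return "; ".join(parts)


default_field = FormField("file", "\u03b1.txt")
legacy_field = FormField("file", "\u03b1.txt", encoder=encode_rfc2231)
print(default_field.content_disposition())
print(legacy_field.content_disposition())
```

Passing the encoder explicitly keeps the choice visible at the call site, which is the property the environment-variable approach lacks.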

Review thread on tox.ini (resolved)

@theacodes
Member

Confirmed that this works as expected with Flask.

Urllib3 request:

import urllib3
http = urllib3.PoolManager()
http.request("POST", "http://localhost:5000", fields={"αλήθεια": ("αλήθεια.txt", "Meep", "text/plain")})

On the Flask side, the parsed files read:

ImmutableMultiDict([('αλήθεια', <FileStorage: 'αλήθεια.txt' ('text/plain')>)])

@theacodes
Member

Giving up for today: I can't reproduce the CI's failure locally, which makes debugging this hard.

@urllib3 urllib3 deleted a comment from codecov-io Mar 12, 2019
@codecov-io

codecov-io commented Mar 12, 2019

Codecov Report

Merging #1492 into master will decrease coverage by 0.04%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #1492      +/-   ##
=========================================
- Coverage   99.94%   99.9%   -0.05%     
=========================================
  Files          22      22              
  Lines        1826    2075     +249     
=========================================
+ Hits         1825    2073     +248     
- Misses          1       2       +1

| Impacted Files | Coverage Δ |
|---|---|
| src/urllib3/fields.py | 100% <100%> (ø) ⬆️ |
| src/urllib3/connection.py | 99.01% <0%> (-0.99%) ⬇️ |
| src/urllib3/util/timeout.py | 100% <0%> (ø) ⬆️ |
| src/urllib3/response.py | 100% <0%> (ø) ⬆️ |
| src/urllib3/connectionpool.py | 100% <0%> (ø) ⬆️ |
| src/urllib3/util/url.py | 100% <0%> (+1.13%) ⬆️ |

Powered by Codecov. Last update 2a0957e...ea9003e. Read the comment docs.

@theacodes
Member

Alright, @sethmlarson this is ready for review. Most importantly, I want to make sure you're comfortable with the new parameter name header_encoder.

@sigmavirus24
Contributor

Personally +1 on header_encoder. Also @sethmlarson understood my objection to be the environment variable appropriately. This all sounds good to me.

Member

@sethmlarson sethmlarson left a comment

Found some issues and I have some questions as well.

Review threads (all resolved): dummyserver/handlers.py (1, outdated); src/urllib3/fields.py (7, of which 4 outdated)

@sethmlarson
Member

Also thoughts on adding a section to the docs about how to switch to previous or custom behavior? Basically documenting the pluggability

Member

@theacodes theacodes left a comment

@sethmlarson thanks for the thorough review (apparently I'm quite rusty!). It should be much better now.

Review threads (all resolved): src/urllib3/fields.py (3, of which 2 outdated)

Member

@sethmlarson sethmlarson left a comment

I added one more test case for control characters and fixed the docs failure on Travis, you can take a look at my commits @theacodes.

Assuming I didn't break anything with my latest commits this looks good to me! Thanks for picking this up and running with it. :)
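
For context on why control characters need a test at all: a raw quote or line break inside a quoted parameter value would corrupt the header. The sketch below shows one plausible way to handle that, following the escaping rule in the current WHATWG HTML Standard for multipart/form-data names and filenames (LF, CR and the double quote are replaced with `%0A`, `%0D` and `%22`). This is an assumption for illustration and may differ from the test case and code added in this PR; both helper names are hypothetical.

```python
def escape_html5_value(value: str) -> str:
    """Hypothetical helper: escape LF, CR and double quotes (WHATWG rule)."""
    return (
        value.replace("\n", "%0A")
        .replace("\r", "%0D")
        .replace('"', "%22")
    )


def format_param_html5(name: str, value: str) -> str:
    """Hypothetical helper: HTML5-style parameter with escaped value."""
    return '%s="%s"' % (name, escape_html5_value(value))


param = format_param_html5("filename", 'a"b\r\nc.txt')
print(param)
```

With this escaping the formatted parameter never contains a bare line break, so it cannot be used to inject extra header lines into the part.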

@theacodes
Member

Thanks, @sethmlarson! You can merge when you're ready. :)

@sethmlarson sethmlarson merged commit 46331f9 into urllib3:master Mar 23, 2019
@theacodes
Member

theacodes commented Mar 23, 2019 via email

@sethmlarson
Member

Woo! 🎉 Thank you @Spryttan for opening the original PR and @Robbt for rebasing. This is a great change. :)

Successfully merging this pull request may close these issues.

Consider HTML 5 draft for multipart/form-data
6 participants