Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

StandardMultipartHttpServletRequest cannot decode multipart Content-Disposition header encoded by FormHttpMessageConverter [SPR-15205] #19769

Closed
spring-projects-issues opened this issue Jan 30, 2017 · 16 comments
Assignees
Labels
in: web Issues in web modules (web, webmvc, webflux, websocket) type: enhancement A general enhancement
Milestone

Comments

@spring-projects-issues
Copy link
Collaborator

spring-projects-issues commented Jan 30, 2017

Miroslav Holubec opened SPR-15205 and commented

There is wrong behavior when encoding/decoding Content-Disposition header.
When I do setMultipartCharset() on AllEncompassingFormHttpMessageConverter, it emits header like this:

Content-Disposition: form-data; name="uploadedFile"; filename="=?UTF-8?Q?Declara=C3=A7=C3=A3o.pdf?="

StandardMultipartHttpServletRequest cannot decode such a name, as it expects that "filename" is followed by asterisk.

I'm a bit lost in RFCs, as not sure if RFC 6266 should be applied here (means asterisk required) or not.


Affects: 4.3.6

Issue Links:

Referenced from: commits 390bb87, bb684ce, 75117f4, f89511e, 3009e29

@spring-projects-issues
Copy link
Collaborator Author

Juergen Hoeller commented

As far as I see, StandardMultipartHttpServletRequest checks for both filename= and filename*=. Could you please double-check why it isn't parsing correctly in your case?

@spring-projects-issues
Copy link
Collaborator Author

Miroslav Holubec commented

StandardMultipartHttpServletRequest does extractFilenameWithCharset(String) only in case that extractFilename(String) returns null (means no filename= was present). Also I'm not sure if extractFilenameWithCharset(String) can parse this encoding (RFC-6266 seems to use different encoding without question marks).

@spring-projects-issues
Copy link
Collaborator Author

Miroslav Holubec commented

Finally found required RFCs. So it seems to be that FormHttpMessageConverter uses RFC-2047 encoding, which is unsupported by StandardMultipartHttpServletRequest (that supports only RFC-6266).

@spring-projects-issues
Copy link
Collaborator Author

Miroslav Holubec commented

possible workaround using commons-fileupload

@spring-projects-issues
Copy link
Collaborator Author

spring-projects-issues commented Jan 30, 2017

Rossen Stoyanchev commented

In 4.1.1 #16724 added support for writing encoded file names to FormHttpMessageConverter based on RFC-2047 which RFC-6266 says should not be used vs RFC-5987 which defines encoding for header field parameters specifically in HTTP. The second issue is that "filename*" should be used instead of "filename" as per RFC-6266, section 4.3.

In 4.2.1 #17904 added support for "filename*" in StandardMultipartHttpServletRequest based on RFC 2231 which is compatible with RFC-5987 but defining a subset of it for use in HTTP. We should however make an improvement to check for "filename*" first as recommended in RFC-6266 section 4.3.

In 4.3 #19115 added support for RFC-5987 encoding to HttpHeaders. So it looks like FormHttpMessageConverter could now be updated to use the setContentDisposition(String, String, charset) and that should produce a "filename*" attribute with RFC-5987 encoding. Hopefully commons-fileupload supports that, we'll have to check.

In 5.0 #18979 extracted content-disposition from HttpHeaders handling into a separate ContentDisposition type also moving along RFC-5987 related encoding and further adding support for decoding. Currently ContentDisposition refers to RFC-2183 (updated by RFC 2231 and hence compatible with RFC-5987) but should really be based on RFC-6266. Also RFC-5987 related encoding and decoding should probably move back into HttpHeaders since it applies more generally to any header parameters.

In 5.0 at least we can also have StandardMultipartHttpServletRequest delegate to ContentDisposition for the parsing.

/cc Brian Clozel, Sébastien Deleuze

@spring-projects-issues
Copy link
Collaborator Author

spring-projects-issues commented Jan 30, 2017

Rossen Stoyanchev commented

This was also recently discussed under #19072. Specifically this from RFC-7578 section 4.3:

In most multipart types, the MIME header fields in each part are
restricted to US-ASCII; for compatibility with those systems, file
names normally visible to users MAY be encoded using the percent-
encoding method in Section 2, following how a "file:" URI
[URI-SCHEME] might be encoded.

NOTE: The encoding method described in [RFC5987], which would add a
"filename*" parameter to the Content-Disposition header field, MUST
NOT be used.

So the most recent advice seems to be that you're supposed to encode file names like URIs but on the server side at least we can't avoid processing "filename*" as per #10655 it seems to be. And also continue to support it in FormHttpMessageConverter through the multipartCharset property.

Sébastien Deleuze, Brian Clozel what's your take on all this?

@spring-projects-issues
Copy link
Collaborator Author

Miroslav Holubec commented

Hi Rossen, I think you are wrong about saying:

...based on RFC-2047 which RFC-6266 says should not be used. RFC-6266 states clearly:
This document does not apply to Content-Disposition header fields appearing in payload bodies transmitted over HTTP, such as when using the media type "multipart/form-data" ([RFC2388])..
Anyway, receiving side should ideally support all possible variants and by my recent opinion, RFC-2047 on sender side should be fine, as it follows MIME standard.

@spring-projects-issues
Copy link
Collaborator Author

Rossen Stoyanchev commented

Yes you're right that RFC-6266 explicitly excludes the multipart/form-data cases. The RFC-2388 it refers to is obsoleted by RFC-7578 which explains the quote from it, in my last comment.

On the receiving side we should support all options indeed. The question is about the sending side, i.e. in FormHttpMessageConverter and also in ContentDisposition / HttpHeaders. The former is RFC-2047, the latter RFC-5987. Or maybe the key is for ContentDisposition to separate out the case of multipart/form-data from other content-disposition types.

We'll have to experiment a bit with browser support too. Here is one article from Mozilla that seems to favor RFC-5987.

@spring-projects-issues
Copy link
Collaborator Author

Rossen Stoyanchev commented

Looking around at a few Java libraries and servers:

@spring-projects-issues
Copy link
Collaborator Author

Rossen Stoyanchev commented

A quick experiment with browsers. Firefox and Chrome encode the filename based on the charset of the form (i.e. accept-charset form attribute). This is consistent with what the HTML5 spec says. Apache Commons FileUpload however does not decode that correctly.

@spring-projects-issues
Copy link
Collaborator Author

Miroslav Holubec commented

Yes, you are right. RFC-7578 seems to be valid document for implementation here. Unfortunately URI encoding is currently not widely supported on receiving sides. Because of that I would vote to support RFC-2047 encoding on sending site, at least until URI encoding support is implemented in commons-fileupload.

About receiving side, as far as I understood we should support RFC-2047, percent encoding and directly encoded header (in non ASCII charset). Support for filename*= can be dropped as RFC-7578 clearly forbids it.

@spring-projects-issues
Copy link
Collaborator Author

Juergen Hoeller commented

Let's turn this into a 5.0 topic then, possibly picking up a Commons FileUpload update as well.

@spring-projects-issues
Copy link
Collaborator Author

Rossen Stoyanchev commented

I've made one change already in FormHttpMessageConverter for the client side to encode part headers directly to the configured charset (UTF-8 by default), also setting the charset of the "Content-Type" header so the server knows how to decode, which is what Apache Commons FileUpload seems to do. The RFC 2047 filename encoding is still there but used only if you set the separate "multipartCharset" property.

I'm moving the ticket to RC2 to complete the server side work. My plan for the server side is to support direct encoding (based on the charset of the "Content-Type" header). I would also drop the RFC 5987 encoding that we currently support since RFC-7578 forbids it. I hesitate to support RFC 2047 mainly because it requires javax.mail as a dependency, which is less than ideal for this reason alone, and because I'm hoping direct encoding goes far enough to meet the need for encoded filenames. As for URI encoding I'm still not sure how would the server detect that is used, would it check for a scheme prefix for example?

@spring-projects-issues
Copy link
Collaborator Author

Rossen Stoyanchev commented

StandardMultipartHttpServletRequest now supports both RFC 5987 and RFC 2047 filename encoding although the latter works only if javax.mail is present.

@spring-projects-issues
Copy link
Collaborator Author

Juergen Hoeller commented

rstoyanchev, it just occurred to me that Servlet 3.1 has a getSubmittedFileName() method on Part: Since Spring Framework 5.0 is Servlet 3.1+ anyway, shouldn't we simply be delegating to that method? Or at least have an option to delegate to that?

@spring-projects-issues
Copy link
Collaborator Author

Rossen Stoyanchev commented

Good point.

The Javadoc (and spec) are scant on information about what the method should do.

The Tomcat implementation seems to respect RFC-6266 which we already discussed above shouldn't be used. The Jetty implementation doesn't support any decoding, just unquotes.

We would probably have to have an option to delegate to that, if required, to call it as an alternative to what we have now which supports RFC 5987 and 2047.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in: web Issues in web modules (web, webmvc, webflux, websocket) type: enhancement A general enhancement
Projects
None yet
Development

No branches or pull requests

2 participants