Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FormHttpMessageConverter should use configured charset (by default UTF-8) for "text-plain" MIME part conversion as well [SPR-14338] #18910

Closed
spring-projects-issues opened this issue Jun 7, 2016 · 7 comments
Assignees
Labels
in: web Issues in web modules (web, webmvc, webflux, websocket) type: enhancement A general enhancement
Milestone

Comments

@spring-projects-issues
Copy link
Collaborator

spring-projects-issues commented Jun 7, 2016

Rafał Garbowski opened SPR-14338 and commented

Problem:
By default "FormHttpMessageConverter" encodes "text/plain" parts of "MULTIPART_FORM_DATA" message in "latin-1" charset, so eastern european characters are not supported.

Here is example, where I want to send value "łęąć".
How it works now:

    --CJRdVETZo8EZUnHRNhUAkJxwfgiaRlH
    Content-Disposition: form-data; name="param"
    Content-Type: text/plain;charset=ISO-8859-1
    Content-Length: 4

    ????
    --CJRdVETZo8EZUnHRNhUAkJxwfgiaRlH--

And how it should work:

    --QHVM39WsZleGp9jW4RYYKzO7OcqwMpQZRpaQcS7
    Content-Disposition: form-data; name="param"
    Content-Type: text/plain;charset=UTF-8
    Content-Length: 8

    łęąć
    --QHVM39WsZleGp9jW4RYYKzO7OcqwMpQZRpaQcS7--

Now workaround is defining all default converters once again in some configuration bean, setting UTF-8 to "StringHttpMessageConverter" and then putting back as argument of method "setPartConverters".

formConverter.setPartConverters(Arrays.asList(new ByteArrayHttpMessageConverter(), new StringHttpMessageConverter(Charsets.UTF_8), new ResourceHttpMessageConverter()));

Why not use UTF-8 by default to encode "plain/text" MIME parts? Here is link to "pull-request" of this solution on GitHub:
3cdda3a

In multipart/form-data specs I didn't find a word about preferable encoding.
In our case "FormHttpMessageConverter" is forcing "ISO-8859-1" charset. It isn't the same as not defining it at all and allowing recipient to use default - see this answer on stackoverflow. So FIX presented on Github shouldn't crash old systems. Is there any application that accepts only "latin-1" encoded requests? I don't think so.


Reference URL: #1072

Issue Links:

@spring-projects-issues
Copy link
Collaborator Author

Juergen Hoeller commented

Brian Clozel, Rossen Stoyanchev, since we're not likely to change the default encoding there, what about making a custom encoding more easily configurable? We could still aim for that for 4.3 GA.

@spring-projects-issues
Copy link
Collaborator Author

Rossen Stoyanchev commented

Note that on the server-side the MVC Java config has a convenient method to extend the list of message converters without completely replacing the default ones, see WebMvcConfigurer#extendMessageConverters. The MVC namespace also allows you to provide converters with register-defaults=true.

@spring-projects-issues
Copy link
Collaborator Author

Rafał Garbowski commented

The case is client-side config e.g. org.springframework.web.client.RestTemplate.

Maybe there is possibility to make partConverters protected and add some out-of-the-box support by creating new class CharsetAwareFormHttpMessageConverter?

public class CharsetAwareFormHttpMessageConverter extends AllEncompassingFormHttpMessageConverter {

    public CharsetAwareFormHttpMessageConverter(){
        this(Charsets.UTF_8);
    }

    public CharsetAwareFormHttpMessageConverter(Charset charset) {
        super();

        for (HttpMessageConverter<?> converter :
                super.partConverters) {
            if (StringHttpMessageConverter.class.isAssignableFrom(converter.getClass())) {
                StringHttpMessageConverter.class.cast(converter).setDefaultCharset(charset);
            }
        }
    }
}

@spring-projects-issues
Copy link
Collaborator Author

Juergen Hoeller commented

There are a few charset configuration options on FormHttpMessageConverter already. They should apply to multipart content as well, and be more easily configurable through RestTemplate. I'm looking at this for inclusion in 4.3 GA still.

@spring-projects-issues
Copy link
Collaborator Author

spring-projects-issues commented Jun 8, 2016

Juergen Hoeller commented

As far as I can see, this is actually inconsistent at the moment: FormHttpMessageConverter's "charset" property - defaulting to UTF-8 - applies to regular text bodies but not to multipart text bodies. I see no reason why multipart text bodies have to use ISO-8859-1 by default when regular text bodies use UTF-8, and in particular why an explicitly specified charset wouldn't apply to multipart content as well.

So as of 4.3 GA, we're applying a FormHttpMessageConverter-configured charset to all AbstractHttpMessageConverter-compatible part converters as well. This is also in line with the changes behind #18209, also new in 4.3. As a consequence, the default for multipart text bodies is FormHttpMessageConverter's UTF-8 now.

@spring-projects-issues
Copy link
Collaborator Author

Juergen Hoeller commented

Rafał Garbowski, since this is available in the latest 4.3.0.BUILD-SNAPSHOT, it would be great if you could give a try tomorrow... ahead of our GA release on Friday.

@spring-projects-issues
Copy link
Collaborator Author

Rafał Garbowski commented

Unfortunately I cannot make any integration test, because of versions conflict. However I ran unit test from my original solution and it passed, so good job :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in: web Issues in web modules (web, webmvc, webflux, websocket) type: enhancement A general enhancement
Projects
None yet
Development

No branches or pull requests

2 participants