Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reactive multipart request support [SPR-14546] #19114

Closed
spring-projects-issues opened this issue Jul 30, 2016 · 10 comments
Closed

Reactive multipart request support [SPR-14546] #19114

spring-projects-issues opened this issue Jul 30, 2016 · 10 comments
Assignees
Labels
in: web Issues in web modules (web, webmvc, webflux, websocket) type: enhancement A general enhancement
Milestone

Comments

@spring-projects-issues
Copy link
Collaborator

spring-projects-issues commented Jul 30, 2016

Rossen Stoyanchev opened SPR-14546 and commented

This needs to be investigated. I don't know if there is anything in the Servlet API for this that's non-blocking.


Affects: 5.0 M1

Issue Links:

0 votes, 7 watchers

@spring-projects-issues
Copy link
Collaborator Author

Sébastien Deleuze commented

I had a look to the current support for non-blocking multipart processing, there are some implementations available.

Grizzly

  • Non blocking Multipart processing available
  • Dependency on grizzly-framework and grizzly-http-server

NIO Multipart

  • Very recent, 4 GitHub stars
  • Pretty decent documentation
  • Seems based on InputStream and OutputStream so we need to double-check we can build a Reactive implementation on top of that (see [NioMultipartParser](https://github.com/synchronoss/nio-multipart/blob/master/nio-multipart-parser/src/main/java/org/synchronoss/cloud/nio/multipart/NioMultipartParser.java))
  • Dependency on slf4j-api and [nio-stream-storage](https://github.com/synchronoss/nio-stream-storage)

I had also a look to the multipart processing support in the engines we use:

Various remarks:

  • Existing Spring Web multipart support (based on Apache Commons FileUpload or Servlet 3.0 multipart API) is not reusable because based on Servlet API + blocking implementations
  • Low level support like Netty one seems not necessarily usable (because for example rxNetty uses higher level abstraction)
  • Native support like Undertow one may be not pluggable on top of our own Reactive compliant abstraction
  • Like for JSON async processing, this feature is something where we are pretty early in the game
  • I think we would like to avoid as much as possible to maintain our own multipart parser implementation
  • Using a non-blocking library like NIO Multipart could allow us to plug multipart support on top of our current backpressure support

As a consequence, I think as a first step I would like to experiment with a prototype using NIO Multipart that will allow us to discuss and iterate to find the right API for exposing such feature. Maybe something like a Flux<Part> getParts() on a multipart aware request, with Part allowing to deal with header names and values, as well as allowing to get the content as Flux<DataBuffer> or turning it to a file easily in a non blocking way.

@spring-projects-issues
Copy link
Collaborator Author

Sébastien Deleuze commented

I am starting working on a draft PR in order to implement Reactive multipart support using NIO Multipart library underneath.

@spring-projects-issues
Copy link
Collaborator Author

Rossen Stoyanchev commented

Sébastien Deleuze, multipart requests and form processing are closely related, hence some comments.

HTML5 & RFC 7578

The Servlet API has a particular way of representing this so let's consider a protocol-level perspective first.

HTML5 section 4.10.22 defines form submissions. Form with HTTP GET appends "application/x-www-form-urlencoded" form data to the URI query. Form with HTTP POST submits a body with content type "application/x-www-form-urlencoded" or "multiform/data". For the actual encoding of "multipartform/data" the HTML5 spec refers to RFC 2388 which has been obsoleted by RFC 7578.

There are at least 4 basic scenarios to consider.

  1. Search form
GET /search?field1=aa&field2=bb
  1. Edit form
firstName=aa&lastName=bb
  1. Edit form with form fields and "file" input
------kYFrd4jNJEgCervE
Content-Disposition: form-data; name="firstName"

aa
------kYFrd4jNJEgCervE
Content-Disposition: form-data; name="lastName"

bb
------kYFrd4jNJEgCervE--
Content-Disposition: form-data; name="user-photo"; filename="image123.jpg"
Content-Type: image/jpeg

...image content...
------kYFrd4jNJEgCervE--
  1. Multipart request with JSON and an image
------kYFrd4jNJEgCervE
Content-Disposition: form-data; name="user"
Content-Type: application/json

{"firstName":"aa", "lastName":"bb"}
------kYFrd4jNJEgCervE
Content-Disposition: form-data; name="user-photo"; "filename=image123.jpg"
Content-Type: image/jpeg

...image content...
------kYFrd4jNJEgCervE--

Servlet API and @MVC

The Servlet API has "parameters" that are "contained in the query string or posted form data". It also has "parts" which represents the parts of a multipart request but only those with a filename, from Servlet spec 3.1 section 3.2: "For parts with form-data as the Content-Disposition, but without a filename, the string value of the part will also be available through the getParameter". This separates "simple" form fields from file inputs even in multipart requests but it forces you have a filename in case #4 where the JSON is neither a simple form field nor a file upload.

In Spring MVC @RequestParam can refer to any Servlet request parameter which includes query, form data, and any part (Part, MultipartFile, or String/byte[] via PropertyEditor). @ModelAttribute is the same but binding to the fields of an Object. @RequestPart can refer to any part as "Part" or "MultipartFile" and also as any object via message conversion.

ServerHttpRequest

Let's consider how to expose this at the lowest level. Overall I think we should maintain a clear separation between query params, form params (i.e. a "x-www-form-urlencoded request body), and multiparts which is closer to the protocol representation on the wire. Not blur the lines as in the Servlet API.

We already have getQueryParams() in ServerHttpRequest.

Then add Mono<MultiValueMap<String, String>> getFormParams() which would parse lazily if and only if the content-type is "application/x-www-form-urlencoded" (but not "multipart/form-data"). The idea of exposing it at this level is that it allows different parsing by HTTP runtime (e.g. Reactor Netty supports multipart) and also allows for caching of the result, similar to how we cache the "sessionMono" field in DefaultServerWebExchange. Then we can access form params multiples times, e.g. once per @ModelAttribute argument, much like we can access request params from anywhere with the Servlet API.

Then also add Flux<Multipart> getMultiparts() for basic, sequential access to multiparts. This method may not be easily usable for @RequestPart or @ModelAttribute data binding. However it would provide more direct access where needed for streaming large file(s). This would enable injecting Flux<Multipart> into the controller for example.

A Mono<MultiValueMap<String, Multipart>> getAllMultiparts() method by contrast would lazily parse all multiparts when the content-type is "multipart/form-data". Again the result should be cached so the method can be invoked multiple times. This is what we would use for @RequestPart and @ModelAttribute support.

In the end @RequestParam could remain simply as a way to refer to query params but it wouldn't be hard to access form params or multiparts given the above.

@spring-projects-issues
Copy link
Collaborator Author

Sébastien Deleuze commented

A draft PR is available for review here.

@spring-projects-issues
Copy link
Collaborator Author

Sébastien Deleuze commented

While we will use NIO Multipart 1.0 which is not yet fully non-blocking in M3, it is interesting to notice that after discussing with the project lead, a POC based on Reactive Streams should be available in a branch in order to evaluate it as a potential fully non-blocking solution.

@spring-projects-issues
Copy link
Collaborator Author

Sébastien Deleuze commented

After a meeting with Silvano Riz, the developer of NIO Multipart, we agree on collaborating on a POC that will use Reactor 3 as a foundation to build a truly Reactive Multipart library. The idea is to expose an API like Flux<Part> resolveMultipartRequest(Map<String, List<String>> headers, Publisher<ByteBuffer> body) with Part allowing to get the part body as a Flux<ByteBuffer> or as a Mono<String>.

@spring-projects-issues
Copy link
Collaborator Author

Sébastien Deleuze commented

I postpone this issue to RC1 since it will require a significant amount of work and testing.

@spring-projects-issues
Copy link
Collaborator Author

Sébastien Deleuze commented

Please find bellow a summary of current status.

We did not find the time Silvano Riz and me to work on a reactive multipart implementation, so currently we use NIO Multipart API which is event-based BUT use blocking IO (InputStream / OutputStream), see this related discussion. It would be nice to review current bridge between NIO Multipart and Spring Web Reactive to have a better idea of the limitation of the current implementation.

Based on my discussion with Stéphane Maldini and this RxNetty discussion:

  • Current Netty support for multipart can't be plugged easily in RxNetty or Reactor Netty architecture
  • For long term, it would be nice to work with Norman on a way to get Multipart support from Netty that could be plugged in RxNetty and Reactor Netty
  • Current Multipart support on Reactor Netty is mainly client side oriented
  • Even if we add a Flux<ByteBufFlux> MultipartInbound#from(HttpServerRequest inbound) variant, that would only allow us to get the content of the file (no headers, not filename, no encoding support, etc.) so there would non-trivial improvements to support what we need to expose in Spring multipart API.

My current proposal for 5.0 would be:

  • Review NIO Multipart to Spring Web Reactive bridge with the team (including Simon and Stéphane) to identify the impact of the part of the API that is blocking
  • Identify if we should use specific schedulers to run NIO Multipart in order to not block too much Netty threads
  • Decide if we should expose both Mono<MultiValueMap<String, Part>> getFormParts() and Flux<Part> getFormPartsAsFlux(), only the Mono based variant (I tend to think we should provide some kind of multipart support in 5.0 even if it comes with some limitation clearly documented)
  • Keep this issue/PR scope as it is (already quite big) and create 2 distinct issues to provide integration in both annotation and functional API

For 5.1 target, 2 roads could be explored:

  • Try to see if we can move forward on first-class multipart reactive support on RxNetty and Reactor Netty (possibly by refactoring Netty Multipart support, to be discussed with Norman)
  • Collaborate with Silvano Riz on a reactive multipart implementation that could be used independently of the engine used

@spring-projects-issues
Copy link
Collaborator Author

Sébastien Deleuze commented

I have rebased the commit on latest master.

Since it seems we don't have a way to write file in a non blocking way, I think the best we can do is using NIO Mulitpart with small chunks of data in order to avoid blocking too much, and using zero copy where we can (already the case in the code I wrote).

Rossen Stoyanchev Arjen Poutsma What is the best way we have to control the size of the body coming from the request to make sure we have for example 8K chunks? Do we already configure that at engine level?

@spring-projects-issues
Copy link
Collaborator Author

Sébastien Deleuze commented

I have pushed to master the support for Reactive multipart requests available via ServerWebExchange#getMultipartData(), @RequestParam / @RequestPart annotations or BodyExtractors#toMultipartData() functional API. Reader implementation relies on Synchronoss NIO multipart library. Client support for sending multipart request is also provided.

A demo application is available at https://github.com/sdeleuze/webflux-multipart.

Rossen Stoyanchev Feel free to review/polish this support when you have the time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in: web Issues in web modules (web, webmvc, webflux, websocket) type: enhancement A general enhancement
Projects
None yet
Development

No branches or pull requests

2 participants