Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MultipartParser emits DataBufferLimitException about "Part headers exceeded the memory usage limit" unexpectedly #27612

Closed
Reginald-Yoeng-Lee opened this issue Oct 27, 2021 · 5 comments
Assignees
Labels
in: web Issues in web modules (web, webmvc, webflux, websocket) type: bug A general bug
Milestone

Comments

@Reginald-Yoeng-Lee
Copy link

Affects: \Up to 5.3.12


Here's a web application based on Spring Webflux.
When receiving multipart/form-data POST requests from clients, the body parts will be parsed by org.springframework.http.codec.multipart.MultipartParser. As reading the source code, we could find that the main logic of the MultipartParser is a state machine. It changes state between the state of header parsing and the state of body parsing for most time. Now take the following request body as an example:

------WebKitFormBoundarySOIUhndhpIQUSFc1
Content-Disposition: form-data; name="size"

25015919
------WebKitFormBoundarySOIUhndhpIQUSFc1
Content-Disposition: form-data; name="parentId"


------WebKitFormBoundarySOIUhndhpIQUSFc1 Content-Disposition: form-data; name="file"; filename="data.zip" Content-Type: application/x-zip-compressed

(binary data)
------WebKitFormBoundarySOIUhndhpIQUSFc1--

Upstream provides DataBuffers to MultipartParser sequently. MultipartParser handles these DataBuffers and transit among states. When parsing the body above, PREAMBLE -> HEADERS -> BODY -> HEADERS -> BODY -> HEADERS -> BODY -> DISPOSED should be performed. First, we get header Content-Disposition: form-data; name="size", then body 25015919, then header Content-Disposition: form-data; name="parentId", so far so good.
But here comes up a malformed situation. Let's say if the current DataBuffer contains the value of the "parentId" part (which is an empty string), and ends with the boundary plus CRLF. i.e. the byte array under the DataBuffer should be looked like:
[..., 45, 45, 45, 45, 45, 45, 87, 101, 98, 75, 105, 116, 70, 111, 114, 109, 66, 111, 117, 110, 100, 97, 114, 121, 83, 79, 73, 85, 104, 110, 100, 104, 112, 73, 81, 85, 83, 70, 99, 49, 13, 10]

Now let's check out the part of codes of BodyState about handling the body of the part.

int endIdx = this.boundary.match(buffer);
if (endIdx != -1) {
	if (logger.isTraceEnabled()) {
		logger.trace("Boundary found @" + endIdx + " in " + buffer);
	}
	int len = endIdx - buffer.readPosition() - this.boundary.delimiter().length + 1;
	if (len > 0) {
		// buffer contains complete delimiter, let's slice it and flush it
		DataBuffer body = buffer.retainedSlice(buffer.readPosition(), len);
		enqueue(body);
		enqueue(null);
	}
	else if (len < 0) {
		// buffer starts with the end of the delimiter, let's slice the previous buffer and flush it
		DataBuffer previous = this.previous.get();
		int prevLen = previous.readableByteCount() + len;
		if (prevLen > 0) {
                        DataBuffer body = previous.retainedSlice(previous.readPosition(), prevLen);
                        DataBufferUtils.release(previous);
                        this.previous.set(body);
                        enqueue(null);
		}
		else {
                        DataBufferUtils.release(previous);
                        this.previous.set(null);
		}
	}
	else /* if (sliceLength == 0) */ {
		// buffer starts with complete delimiter, flush out the previous buffer
		enqueue(null);
	}

	DataBuffer remainder = MultipartUtils.sliceFrom(buffer, endIdx);
	DataBufferUtils.release(buffer);

	changeState(this, new HeadersState(), remainder);
}
else {
	enqueue(buffer);
	requestBuffer();
}

Since the current buffer contains the complete boundary, the endIdx should be greater than -1. In this example the value of endIdx should be the length of the buffer minus 2 (exclude the tail CRLF). As we could see the remainder is sliced from the original buffer started from position endIdx + 1. That means the remainder contains two bytes of CRLF in this example. Next, the state is changed to HEADERS.
Let's check out the part of codes of HeadersState

long prevCount = this.byteCount.get();
long count = this.byteCount.addAndGet(buf.readableByteCount());
if (prevCount < 2 && count >= 2) {
	if (isLastBoundary(buf)) {
		if (logger.isTraceEnabled()) {
			logger.trace("Last boundary found in " + buf);
		}

		if (changeState(this, DisposedState.INSTANCE, buf)) {
			emitComplete();
		}
		return;
	}
}
else if (count > MultipartParser.this.maxHeadersSize) {
	if (changeState(this, DisposedState.INSTANCE, buf)) {
		emitError(new DataBufferLimitException("Part headers exceeded the memory usage limit of " +
				MultipartParser.this.maxHeadersSize + " bytes"));
	}
	return;
}
int endIdx = this.endHeaders.match(buf);
if (endIdx != -1) {
	if (logger.isTraceEnabled()) {
		logger.trace("End of headers found @" + endIdx + " in " + buf);
	}
	DataBuffer headerBuf = MultipartUtils.sliceTo(buf, endIdx);
	this.buffers.add(headerBuf);
	DataBuffer bodyBuf = MultipartUtils.sliceFrom(buf, endIdx);
	DataBufferUtils.release(buf);

	emitHeaders(parseHeaders());
	changeState(this, new BodyState(), bodyBuf);
}
else {
	this.buffers.add(buf);
	requestBuffer();
}

The buf here is just exactly the same as the remainder sliced from the buffer during the previous state BODY, i.e. contains two bytes of CRLF. For now the value of prevCount is 0, count is 2, and this.byteCount is added to 2. Since the buf doesn't contain the endHeaders, endIdx is -1, and another DataBuffer is requested at line 2 count from backward.
When new DataBuffer is provided, the codes above will be excuted again. This time since the this.byteCount is added to 2, the prevCount is assigned to 2. By default the readableByteCount() of the fresh DataBuffer could be (probably) up to 8192 (8KB), so we assume that here the readable byte count of the buf is 8192, and the count is added to 8194. However, the default value of MultipartParser.this.maxHeadersSize is set to 8192, so the condition of the second if is matched. DataBufferLimitException("Part headers exceeded the memory usage limit of " + MultipartParser.this.maxHeadersSize + " bytes") is emitted. This is apperently not the expected behavior.

This issue could not be 100% reproduced, but occasionally occurs after serveral (plenty?) attempts. The temporary workaround is set maxHeadersSize to 8194. It's an ugly workaround. In my opinion the default setting should be work well for most cases. So probably I'm using it in a wrong way or I've misunderstood something. Correct me if I make a mistake.

Thanks for any help.

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged or decided on label Oct 27, 2021
@poutsma poutsma self-assigned this Oct 27, 2021
@poutsma poutsma added in: web Issues in web modules (web, webmvc, webflux, websocket) type: bug A general bug and removed status: waiting-for-triage An issue we've not yet triaged or decided on labels Oct 27, 2021
@poutsma poutsma added this to the 5.3.13 milestone Oct 27, 2021
@poutsma
Copy link
Contributor

poutsma commented Oct 27, 2021

Hi, thank you for the detailed bug report. With that in hand, I tried to make a couple of changes that should prevent this situation from occurring again (0416168). Would it be possible for you to try a recent snapshot (see https://repo.spring.io/ui/native/snapshot/org/springframework/spring/5.3.13-SNAPSHOT), and see if that resolves the problem?

@Reginald-Yoeng-Lee
Copy link
Author

@poutsma
Great thanks for the help. I'd try this snapshot asap. I believe this commit should work. But as you know, the tricky thing is that this bug can't be 100% reproduced. Anyway, I'll let you know if this problem comes up again (which I'm definitely sure it won't happen).

poutsma added a commit that referenced this issue Nov 2, 2021
This commit introduces the Part::delete method, that deletes its
underlying storage.

Closes gh-27612
@poutsma
Copy link
Contributor

poutsma commented Nov 2, 2021

Note that commit 694db22 had an incorrect commit message and is unrelated to this issue.

@djouvin
Copy link

djouvin commented Nov 28, 2021

Hi, I had the same problem and was preparing to propose a merge request. The bug is easy to reproduce with specially crafted MTOM messages for example, so that the header of a part is spaning two databuffers. The correction seems to work fine (I was not able to reproduce the bug with it).

Thanks.

@poutsma
Copy link
Contributor

poutsma commented Nov 29, 2021

@djouvin Thanks for letting us know! 😊

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in: web Issues in web modules (web, webmvc, webflux, websocket) type: bug A general bug
Projects
None yet
Development

No branches or pull requests

4 participants