feat: implement spec-compliant body mixins #1694

Merged
merged 2 commits into nodejs:main from the fix-body-mixins branch on Oct 13, 2022

Conversation

KhafraDev
Member

@KhafraDev KhafraDev commented Oct 10, 2022

Improves performance of .text(), .arrayBuffer(), .json(), and .blob() by 60%.

The next step is to introduce a synchronous FormData parser - similar to what every other runtime has. It doesn't make sense to asynchronously parse the FormData if all of the bytes are in memory.

If a library needs asynchronous parsing:

const response = new Response(fd, { headers: [['content-type', 'multipart/form-data']] })

for await (const chunk of response.body) {
  // write chunk to busboy, for example
}

As mentioned in the issue implementing .formData, it's very inefficient and should never be used on the server.

Firefox - https://github.com/mozilla/gecko-dev/blob/7f3ff3f4d34e7d234da4f5f5b345d2add7f30e95/dom/base/BodyUtil.cpp#L67
Chromium - https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/fetch/multipart_parser.cc;l=1;bpv=1;bpt=0
Deno - https://github.com/denoland/deno/blob/0cd05d737729b4cfab1d5e22077b3b9ad4ed5e30/ext/fetch/21_formdata.js#L490
Webkit - https://github.com/WebKit/WebKit/blob/f0350d6575c366d884d98f9937e77fe499b93398/Source/WebCore/Modules/fetch/FetchBodyConsumer.cpp#L129

@codecov-commenter

codecov-commenter commented Oct 10, 2022

Codecov Report

Base: 93.98% // Head: 94.05% // Increases project coverage by +0.06% 🎉

Coverage data is based on head (149a6ec) compared to base (eead2b8).
Patch coverage: 98.21% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1694      +/-   ##
==========================================
+ Coverage   93.98%   94.05%   +0.06%     
==========================================
  Files          53       53              
  Lines        4907     4912       +5     
==========================================
+ Hits         4612     4620       +8     
+ Misses        295      292       -3     
Impacted Files Coverage Δ
lib/fetch/body.js 95.79% <98.18%> (-0.40%) ⬇️
lib/fetch/dataURL.js 84.90% <100.00%> (+2.62%) ⬆️
lib/core/connect.js 98.27% <0.00%> (-1.73%) ⬇️
lib/fetch/index.js 83.64% <0.00%> (+0.18%) ⬆️

@KhafraDev
Member Author

I will make a follow up PR for the formData parsing, since it's much harder than everything else.

@jimmywarting
Contributor

this feels like a worse solution, to consume the whole body first.
an async parser feels like it would be friendlier to memory and the GC. You do not always have all the data in memory.

a .formData() does not have to be inefficient if blobs could be created with a file system to back them up.

Member

@mcollina mcollina left a comment

lgtm

@KhafraDev
Member Author

KhafraDev commented Oct 13, 2022

@jimmywarting

this feels like a worse solution, to consume the whole body first.

You do not always have all the data in memory.

According to the spec, you do have to consume the whole body. See: https://fetch.spec.whatwg.org/#ref-for-fully-reading-body-as-promise%E2%91%A0

With this assumption (that the whole body is in memory), the only logical outcome is that, as with every other body mixin, the parsing happens synchronously. Every other platform has arrived at the same solution.
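To illustrate the argument above, here is a minimal sketch of the two-step shape it describes: an asynchronous step that fully reads the body into memory, followed by a purely synchronous conversion step. The names `fullyReadBody`, `convert`, and `consume` are hypothetical and chosen for illustration; this is not undici's actual implementation, and it assumes the stream yields `Uint8Array` chunks.

```javascript
const { ReadableStream } = require('node:stream/web');

// Step 1 (asynchronous): drain the stream and concatenate every chunk.
// Assumes each chunk is a Uint8Array.
async function fullyReadBody(stream) {
  const chunks = [];
  let length = 0;
  for await (const chunk of stream) {
    chunks.push(chunk);
    length += chunk.byteLength;
  }
  const bytes = new Uint8Array(length);
  let offset = 0;
  for (const chunk of chunks) {
    bytes.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return bytes;
}

// Step 2 (synchronous): all bytes are in memory, so no awaiting is needed.
const convert = {
  text: (bytes) => new TextDecoder().decode(bytes),
  json: (bytes) => JSON.parse(new TextDecoder().decode(bytes)),
  arrayBuffer: (bytes) => bytes.buffer
};

// Hypothetical glue: read everything, then convert in one synchronous call.
async function consume(stream, kind) {
  return convert[kind](await fullyReadBody(stream));
}
```

Under this shape, a synchronous FormData parser would simply be one more entry in `convert`, operating on the same in-memory bytes.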

@KhafraDev KhafraDev merged commit 23fbc08 into nodejs:main Oct 13, 2022
@KhafraDev KhafraDev deleted the fix-body-mixins branch October 13, 2022 01:12
@repsac-by
Contributor

According to the spec, you do have to consume the whole body. See: https://fetch.spec.whatwg.org/#ref-for-fully-reading-body-as-promise%E2%91%A0

As far as I understand, the specification specifies what result must be returned, but not how we process it under the hood.

Here is a simple example showing memory consumption when the whole body is read before decoding, instead of being decoded chunk by chunk.

const { randomFillSync } = require('crypto');
const { ReadableStream } = require('node:stream/web');
const { Request } = require('.');

function stream(length) {
	const buffer = Buffer.alloc(32 * 1024);
	const encoder = new TextEncoder();
	let size = 0;
	return new ReadableStream({
		pull(ctr) {
			const data = encoder.encode(randomFillSync(buffer).toString('base64'));
			ctr.enqueue(data);
			if ((size += data.length) > length) ctr.close();
		}
	})
}

void async function() {
	const request = new Request('http://localhost', {
		method: 'POST',
		duplex: 'half',
		body: stream(256 * 1024 * 1024),
	});

	await request.text();
	console.log(`RSS: ${(process.memoryUsage.rss() / 1024 / 1024).toFixed(2)} MB`);
}();

before

RSS: 365.86 MB

after

RSS: 601.42 MB
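For reference, "decoding by chunks" can be sketched as follows. This is only an illustration of the technique, not the code undici used before or after this PR; `textByChunks` is a hypothetical name, and the stream is assumed to yield `Uint8Array` chunks. Each chunk is decoded as it arrives, so the raw bytes of earlier chunks do not need to be retained alongside the decoded string.

```javascript
const { ReadableStream } = require('node:stream/web');

// Decode a byte stream to a string incrementally.
// `{ stream: true }` makes TextDecoder buffer an incomplete multi-byte
// sequence at the end of a chunk until the next chunk supplies the rest.
async function textByChunks(stream) {
  const decoder = new TextDecoder();
  let text = '';
  for await (const chunk of stream) {
    text += decoder.decode(chunk, { stream: true });
  }
  // Final call with no input flushes any bytes still buffered.
  return text + decoder.decode();
}
```

The observable result is the same string that a read-everything-then-decode approach would produce; only the peak memory profile differs.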

metcoder95 pushed a commit to metcoder95/undici that referenced this pull request Dec 26, 2022
* feat: implement spec-compliant body mixins

* fix: skip tests on v16.8
crysmags pushed a commit to crysmags/undici that referenced this pull request Feb 27, 2024
* feat: implement spec-compliant body mixins

* fix: skip tests on v16.8

6 participants