
Max "op" size is 16384 which can be inefficient in user land #10157

Closed
kitsonk opened this issue Apr 13, 2021 · 9 comments
Labels
bug Something isn't working

Comments

@kitsonk
Contributor

kitsonk commented Apr 13, 2021

Currently, because of tokio-rs/tokio#1976, the maximum "chunk" size that can be handled over an op is 16384 bytes, irrespective of whether this is efficient for the consumer.

For example, reading a large file off the disk (50 MiB, say) would take some 3200 reads via .read(), with 3200 promises to resolve.

Then consider serving that file over HTTP in 3200 16K chunks. While the over-the-wire transfer may need to be chunked, that is something that could/should ideally be avoided in the isolate and handled in Rust.
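
For concreteness, a minimal sketch of the read loop being described, assuming a hypothetical 50 MiB file (the path and size are illustrative, not from the issue):

```ts
// Sketch only: each 16 KiB read is one op round trip and one promise.
const file = await Deno.open("./big-file.bin", { read: true }); // hypothetical path
const buf = new Uint8Array(16_384); // matches the op chunk cap in question

let reads = 0;
while (await file.read(buf) !== null) reads++; // one op + one promise per chunk
file.close();

console.log(reads); // ≈ 3200 for a 50 MiB file
```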

@kitsonk kitsonk added bug Something isn't working upstream Changes in upstream are required to solve these issues labels Apr 13, 2021
@bnoordhuis
Contributor

What would you suggest we do if tokio limitations weren't a concern? Pick a bigger chunk size or something else?

@kitsonk
Contributor Author

kitsonk commented Apr 13, 2021

Ideally, we would do some testing and optimise it. It might be platform dependent, I don't know, but I can tell you 16K isn't great for files coming off disk and going out via HTTP from JavaScript. Some sort of bypass might work too. But think about the 3200 read ops that each create a promise to resolve, which then feed at least 3200 more promises into a ReadableStream, plus whatever ops and promises it takes to get the data back out of the Response.
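
A sketch of the kind of pipeline meant here, under the same assumptions (hypothetical file path; a real server would hand the Response to an HTTP handler):

```ts
// Sketch only: every pull() is another op round trip plus promise hops.
const file = await Deno.open("./big-file.bin", { read: true }); // hypothetical path

const body = new ReadableStream<Uint8Array>({
  async pull(controller) {
    const buf = new Uint8Array(16_384);
    const n = await file.read(buf); // one read op + one promise per chunk
    if (n === null) {
      file.close();
      controller.close(); // EOF: end the stream
    } else {
      controller.enqueue(buf.subarray(0, n)); // hand the chunk to the stream
    }
  },
});

const response = new Response(body); // streams ~3200 chunks for a 50 MiB file
```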

@bnoordhuis
Contributor

There are so many confounding factors that an optimal chunk size won't exist. Whatever number comes out of a benchmark will be a product of that particular machine and point in time.

If the goal is to optimize file-to-file or file-to-socket transfers with no intermediate processing (e.g. zipping), then system calls like sendfile(2) and splice(2) are good solutions. Tokio integration is a hurdle but not intractable.

@sswam

sswam commented May 16, 2021

I don't know about the promises, but Linux can do reads pretty quickly. Every language I know uses a fairly small buffer / chunk size when reading... (but I learned something, see below)

I timed a C program doing 16K reads just now; my Linux PC can do approx 500,000 such reads per second, i.e. it can read a 1 GB file in 0.10 seconds, 16K at a time. Using a 1 MB buffer is 20% faster, taking 0.08 seconds; a 4K buffer takes 0.15 seconds, 50% slower. It's very fast whichever way we do it. In my opinion it's unwise to use a large buffer unless you need to store that much data, because it wastes RAM.

Python and Perl use an 8K buffer.

Go seems to use increasingly large reads with no limit when reading a single line that is 1 GB long. But C with a 16K or even a 4K buffer is much faster, because Go is doing weird stuff with multiple threads.
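
A rough Deno-side analogue of that C test, as a sketch (the 1 GB file path is hypothetical, and absolute numbers will vary with the machine and the OS page cache):

```ts
// Sketch only: time a full sequential read at a given buffer size.
async function timeRead(path: string, bufSize: number): Promise<number> {
  const file = await Deno.open(path, { read: true });
  const buf = new Uint8Array(bufSize);
  const start = performance.now();
  while (await file.read(buf) !== null) { /* discard the data */ }
  const ms = performance.now() - start;
  file.close();
  return ms;
}

for (const size of [4_096, 16_384, 1_048_576]) {
  console.log(`${size} B buffer: ${(await timeRead("./1gb.bin", size)).toFixed(0)} ms`);
}
```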

@devalexqt

For me it's a critical problem! Both reading and writing are slow.

@littledivy
Member

The op buffer size is now bumped to 64K (the max TCP packet size) in #14319. We should also eventually move the "read all" ops over to doing a single read.
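
For illustration, the user-land call this affects (the path is hypothetical); the suggestion is about servicing it with one read internally rather than an internal chunked loop:

```ts
// Sketch only: a single "read all" call from user land.
const data = await Deno.readFile("./big-file.bin"); // one promise to resolve
console.log(data.byteLength);
```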

@JetLua

JetLua commented Apr 21, 2022

1.21.0 reads files 2-3x faster than 1.20.6, but it is still slower than Node.js.

@littledivy littledivy removed the upstream Changes in upstream are required to solve these issues label Apr 22, 2022
@JetLua

JetLua commented Apr 29, 2022

This problem seems to have been fixed in v1.21.1.

@kitsonk kitsonk closed this as completed Apr 24, 2024
@Darksonn

Note that Tokio now has File::set_max_buf_size.
