docs: update limits doc #1456
Conversation
Incorporate feedback from #1421
Thanks @achingbrain.
I left a couple of comments here.
I have a couple of new ones in #1421 based on your responses.
I also don't think I saw answers to these questions here:
Is it possible to control the max amount of data buffered across all js-libp2p connections in total? If not, is it at least possible to set a max buffer limit at the connection level? I'm assuming not. It looks like (maxInboundStreams + maxOutboundStreams) * maxStreamBufferSize is the upper bound that a single connection could consume (if all streams approached the maxStreamBufferSize). I wonder if it would be easier for a user to reason about if we had a limit like maxConnectionBufferSize or maxMuxerBufferSize. Anyway, assuming this is a no-go, maybe document this a bit more?
LIMITS.md answers specifics about js-libp2p. I think it would be good to link out to https://docs.libp2p.io/reference/dos-mitigation/, which gives a holistic take on DoS mitigation (potentially providing useful context).
doc/LIMITS.md
Outdated
 *
 * This field is optional, the default value is shown.
 *
 * It is bytes per second.
 */
maxData: 1024 * 1024,
Is this saying that if my sendRate + receiveRate is greater than 1 MB/s then connections might get closed? That seems pretty low for a default.
Not sure where I got that figure from, the default is Infinity. It's a weird limit tbh, surely you'd want the transfer speeds to be as fast as possible at all times?
If there's so much incoming data that performance starts to suffer, the event loop delay detection should kick in.
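For context, the kind of detection being referred to can be illustrated with Node's built-in event loop delay monitor. This is a generic sketch of the technique, not js-libp2p's internal implementation; the 1s threshold and 10s sample interval are arbitrary values chosen for illustration.

```ts
import { monitorEventLoopDelay } from 'node:perf_hooks'

// Sample event loop delay with ~20ms resolution
const histogram = monitorEventLoopDelay({ resolution: 20 })
histogram.enable()

// Periodically check whether the process is struggling to keep up with
// incoming work - if the p99 delay is consistently high, shed load
// (e.g. stop accepting or start closing connections)
setInterval(() => {
  const p99Ms = histogram.percentile(99) / 1e6 // histogram values are nanoseconds

  if (p99Ms > 1000) {
    console.warn(`Event loop p99 delay is ${p99Ms.toFixed(0)}ms - consider shedding load`)
  }

  histogram.reset()
}, 10_000)
```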
@achingbrain any chance we can address comments and close this out?
Yes, but it's the result of a calculation rather than a single config value. Each connection has a multiplexer, each multiplexer has a buffer for raw incoming data, that incoming data is parsed into messages for each stream, each stream has a message buffer. Each multiplexer has a stream limit. The max amount of data is the raw incoming data buffer size, plus the stream message buffers multiplied by the stream limit, all multiplied by the max number of connections.
Yes, but it's controlled by the multiplexers rather than the connection. It's not currently possible to vary the limit between connections, if that's what you mean.
Perhaps, but I think it might be quite hard to start with the total limit and derive useful sublimits. By which I mean, it would be nice to say "each connection can only consume 60MB", but the muxer needs to divide that 60MB and allocate some to the incoming data buffer, some to the stream buffers, etc. I guess you could say "the incoming data buffer is always 4MB" and "stream buffers are always 4MB" so that means of 60MB, 4MB goes to the incoming data buffer, 56MB is left over so I can only have 14 streams so max 7 incoming/7 outgoing? What if you have some streams that are always consumed quickly and others that aren't? You might want to let the slow streams have a bigger buffer than the fast streams. It's all really fiddly - I think it's best if users tune the limits to their particular application needs.
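To make that calculation concrete, here is a rough sketch of the upper bound, using the limit names mentioned elsewhere in this thread (maxUnprocessedMessageQueueSize for the muxer's raw incoming data queue, maxStreamBufferSize per stream, connectionManager.maxConnections for the connection count). The numbers are only the thought-experiment values from the paragraph above, not defaults, and the function itself is not a js-libp2p API.

```ts
// A rough, illustrative upper bound on buffered bytes - not an actual js-libp2p API
interface BufferLimits {
  maxConnections: number
  maxUnprocessedMessageQueueSize: number // raw bytes awaiting parsing, per muxer
  maxInboundStreams: number
  maxOutboundStreams: number
  maxStreamBufferSize: number // bytes buffered per stream
}

function maxBufferedBytes (limits: BufferLimits): number {
  const perConnection = limits.maxUnprocessedMessageQueueSize +
    ((limits.maxInboundStreams + limits.maxOutboundStreams) * limits.maxStreamBufferSize)

  return limits.maxConnections * perConnection
}

// The "60MB per connection" example: 4MB raw queue + 14 x 4MB stream buffers
console.info(maxBufferedBytes({
  maxConnections: 1,
  maxUnprocessedMessageQueueSize: 4 * 1024 * 1024,
  maxInboundStreams: 7,
  maxOutboundStreams: 7,
  maxStreamBufferSize: 4 * 1024 * 1024
})) // 62914560 bytes (60 MiB)
```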
This is linked to in the opening section above the ToC - I've tried to make it a bit more obvious.
Hi @achingbrain. Thanks for updating. I think to close this out we should:
- Add a FAQ entry for "how much memory will be used for buffering". Maybe with something like:
  There is not a single config value to guarantee js-libp2p only uses X memory in buffers. Important details for ascertaining this:
  - Each connection has a multiplexer
  - Each multiplexer has a buffer for raw incoming data (what is the config key for this?)
  - Multiplexer incoming data is parsed into messages for each stream and queued (muxer.maxUnprocessedMessageQueueSize)
  - Each stream has a message buffer in bytes (muxer.maxUnprocessedMessageQueueSize)
  - Each multiplexer has a stream limit for the number of streams (muxer.maxInboundStreams, muxer.maxOutboundStreams)
  As a result, the max amount of memory buffered by libp2p is approximately:
      connectionManager.maxConnections *
        (multiplexer incoming data buffer size // is this configurable?
        + (muxer.maxInboundStreams * muxer.maxUnprocessedMessageQueueSize)
        + ((muxer.maxInboundStreams + muxer.maxOutboundStreams) * muxer.maxUnprocessedMessageQueueSize))
  > It's all really fiddly - I think it's best if users tune the limits to their particular application needs.
  While it's critical that js-libp2p allows configuring low-level specifics, I think in general it's easier for users if we can give them high-level knobs that they can reason about, and then plunge into more complexity if needed. We found this with the go-libp2p resource manager integration into Kubo: it was useful that Kubo could set a maxMemory value by default and pass that down to go-libp2p. Basically, as a library we do need to allow for high configurability, but we should also make the basic cases easy.
  To be clear, I'm not advocating more work here other than being clear about how everything works. Each time I need to engage on this subject, it takes me 15-30 minutes to reboot, and that's after having a lot of historical context on this. I want to make sure all the info/context is present for someone in one place, even if we're not providing simpler knobs now.
- Maybe add a FAQ entry about "Is there a way to limit the number of file descriptors?". We could say there isn't a way to set a hard limit on file descriptors used by js-libp2p, but setting connectionManager.maxConnections is a decent proxy (see the config sketch after this list). (This came to mind when thinking about differences with go-libp2p.)
- Close out on the DoS vector mentioned in #1421 (comment). At the minimum let's create the backlog item.
- Add more clarity about connection limits per #1421 (comment)
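For reference, a minimal sketch of where these knobs live in a node's config. This assumes the current createLibp2p options shape and that @libp2p/mplex accepts the muxer options discussed in this thread; option names and defaults may differ between versions and the values below are illustrative only, not recommendations. As noted above, connectionManager.maxConnections also doubles as a rough proxy for file descriptor usage.

```ts
import { createLibp2p } from 'libp2p'
import { tcp } from '@libp2p/tcp'
import { noise } from '@chainsafe/libp2p-noise'
import { mplex } from '@libp2p/mplex'

const node = await createLibp2p({
  transports: [tcp()],
  connectionEncryption: [noise()],
  streamMuxers: [
    mplex({
      // per-muxer limits discussed above - values are illustrative, not defaults
      maxInboundStreams: 128,
      maxOutboundStreams: 128,
      maxStreamBufferSize: 4 * 1024 * 1024, // bytes buffered per stream
      maxUnprocessedMessageQueueSize: 4 * 1024 * 1024 // raw bytes awaiting parsing
    })
  ],
  connectionManager: {
    // also a rough proxy for the number of file descriptors in use
    maxConnections: 200
  }
})
```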
doc/LIMITS.md
Outdated
 /**
- * How much incoming data to buffer before resetting the stream
+ * How much incoming data to buffer while attempting to parse messages - peers sending many small messages in batches may cause this buffer to grow
Let's be clear about the units here. Are we limiting the number of unprocessed messages in the queue, or limiting the number of bytes of unprocessed messages?
The maxUnprocessedMessageQueueSize is the incoming raw data that's yet to be processed into messages, so it's bytes. I've updated the comment but am also open to better names!
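To illustrate the byte semantics, here is a hypothetical sketch of how a byte-denominated queue limit is typically enforced. This is not the actual @libp2p/mplex source, just the shape of the check.

```ts
// Hypothetical sketch: track raw bytes received but not yet parsed into
// messages, and bail out when the configured byte limit is exceeded
class UnprocessedMessageQueue {
  private bufferedBytes = 0

  constructor (private readonly maxUnprocessedMessageQueueSize: number) {}

  push (chunk: Uint8Array): void {
    this.bufferedBytes += chunk.byteLength

    if (this.bufferedBytes > this.maxUnprocessedMessageQueueSize) {
      // a real muxer would abort the connection here
      throw new Error('Unprocessed message queue limit exceeded')
    }
  }

  // called once a complete message has been parsed out of the queue
  consumed (messageBytes: number): void {
    this.bufferedBytes -= messageBytes
  }
}
```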
Got it. I think a clearer name is maxUnprocessedMessageQueueBytes.
@achingbrain: can we close this out? Let me know if you want to have a quick sync to finish it off to avoid back and forth.
There's no FAQ as such yet, but I've added this as a section to the doc
The backlog item is #1508 and it's referenced from the doc now
I've added the suggested change. I think we should merge this now; any extra tweaks can be applied as further PRs?
Looks good - thanks @achingbrain. I'm good with us merging. It would be great to see us follow up with #1508 and #1509 to close out the low-hanging fruit and make the limits a bit more robust.