docs: update limits doc #1456


Merged
merged 8 commits into master from docs/update-limits-doc on Dec 6, 2022

Conversation

achingbrain
Member

Incorporate feedback from #1421

Incorporate feedback from #1421
@achingbrain achingbrain requested a review from BigLep October 19, 2022 17:43
Contributor

@BigLep BigLep left a comment


Thanks @achingbrain .

I left a couple of comments here.

I have a couple of new ones in #1421 based on your responses.

I also don't think I saw answers to these questions here.

Is it possible to control the max amount of data buffered across all js-libp2p connections in total? If not, is it at least possible to set a max buffer limit at the connection level? I'm assuming not. It looks like (maxInboundStreams + maxOutboundStreams) * maxStreamBufferSize is the upper bound that a single connection could consume (if all streams approached the maxStreamBufferSize). I wonder if it would be easier for a user to reason about if we had a limit like maxConnectionBufferSize or maxMuxerBufferSize. Anyway, assuming this is a no-go, maybe document this a bit more?

LIMITS.md answers specifics about js-libp2p. I think it would be good to link out to https://docs.libp2p.io/reference/dos-mitigation/, which gives a holistic take on DoS mitigation (potentially providing useful context).

doc/LIMITS.md Outdated
*
* This field is optional, the default value is shown.
*
* It is bytes per second.
*/
maxData: 1024 * 1024,
Contributor


Is this saying that if my sendRate + receiveRate is greater than 1MB/s then connections might get closed? That seems pretty low for a default.

Member Author


Not sure where I got that figure from; the default is Infinity. It's a weird limit tbh; surely you'd want transfer speeds to be as fast as possible at all times?

If there's so much incoming data that performance starts to suffer, the event loop delay detection should kick in.

@p-shahi
Member

p-shahi commented Nov 18, 2022

@achingbrain any chance we can address comments and close this out?

@achingbrain
Member Author

Is it possible to control the max amount of data buffered across all js-libp2p connections in total?

Yes, but it's the result of a calculation rather than a single config value. Each connection has a multiplexer; each multiplexer has a buffer for raw incoming data; that incoming data is parsed into messages for each stream; and each stream has a message buffer. Each multiplexer also has a stream limit.

The max amount of data is the raw incoming data buffer size, plus the stream message buffers multiplied by the stream limit, all multiplied by the max number of connections.
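That calculation can be sketched in JavaScript. Note this is an illustration only: the option names (maxUnprocessedMessageQueueSize, maxStreamBufferSize, etc.) are taken from this thread's discussion and are placeholders, not verified js-libp2p config keys.

```javascript
// Hypothetical sketch of the upper bound described above:
// raw incoming data buffer, plus per-stream message buffers multiplied
// by the stream limit, all multiplied by the max number of connections.
function maxBufferedBytes ({
  maxConnections,
  maxUnprocessedMessageQueueSize, // raw incoming data awaiting parsing, in bytes
  maxStreamBufferSize,            // per-stream message buffer, in bytes
  maxInboundStreams,
  maxOutboundStreams
}) {
  const perConnection =
    maxUnprocessedMessageQueueSize +
    (maxInboundStreams + maxOutboundStreams) * maxStreamBufferSize

  return maxConnections * perConnection
}

// e.g. 100 connections, 4MB raw buffer, 4MB per stream, 128 streams each way
console.log(maxBufferedBytes({
  maxConnections: 100,
  maxUnprocessedMessageQueueSize: 4 * 1024 * 1024,
  maxStreamBufferSize: 4 * 1024 * 1024,
  maxInboundStreams: 128,
  maxOutboundStreams: 128
}))
```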

If not, is it at least possible to set a max buffer limit at the connection level? I'm assuming not.

Yes, but it's controlled by the multiplexers rather than the connection. It's not currently possible to vary the limit between connections, if that's what you mean.

I wonder if it would be easier for a user to reason about if we had a limit like maxConnectionBufferSize or maxMuxerBufferSize.

Perhaps, but I think it might be quite hard to start with the total limit and derive useful sublimits. By which I mean, it would be nice to say "each connection can only consume 60MB", but the muxer needs to divide that 60MB and allocate some to the incoming data buffer, some to the stream buffers, etc.

I guess you could say "the incoming data buffer is always 4MB" and "stream buffers are always 4MB", so of the 60MB, 4MB goes to the incoming data buffer, leaving 56MB, which means I can only have 14 streams: max 7 incoming / 7 outgoing? What if you have some streams that are always consumed quickly and others that aren't? You might want to let the slow streams have a bigger buffer than the fast streams.

It's all really fiddly - I think it's best if users tune the limits to their particular application needs.
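For reference, tuning those per-muxer limits looks roughly like the following. This is a sketch based on the js-libp2p API of this era; the exact option names and import paths should be verified against the installed version.

```javascript
import { createLibp2p } from 'libp2p'
import { mplex } from '@libp2p/mplex'

// Sketch only - option names follow the discussion in this thread;
// check them against your js-libp2p version before relying on this.
const node = await createLibp2p({
  streamMuxers: [
    mplex({
      maxInboundStreams: 128,  // streams a remote peer may open per connection
      maxOutboundStreams: 128, // streams we may open per connection
      maxUnprocessedMessageQueueSize: 4 * 1024 * 1024 // bytes of raw data buffered before parsing
    })
  ]
})
```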

LIMITS.md answers specifics about js-libp2p. I think it would be good to link out to docs.libp2p.io/reference/dos-mitigation, which gives a holistic take on DoS mitigation (potentially providing useful context).

This is linked to in the opening section above the ToC - I've tried to make it a bit more obvious.

Contributor

@BigLep BigLep left a comment


Hi @achingbrain. Thanks for updating. I think to close this out we should:

  1. Add a FAQ entry for "how much memory will be used for buffering". Maybe with something like:

There is not a single config value to guarantee js-libp2p only uses X memory in buffers. Important details for ascertaining this:

  • Each connection has a multiplexer
  • Each multiplexer has a buffer for raw incoming data (what is the config key for this?)
  • Multiplexer incoming data is parsed into messages for each stream and queued (muxer.maxUnprocessedMessageQueueSize)
  • Each stream has a message buffer in bytes (muxer.maxUnprocessedMessageQueueSize)
  • Each multiplexer has a stream limit for number of streams (muxer.maxInboundStreams, muxer.maxOutboundStreams).

As a result, the max amount of memory buffered by libp2p is approximately:

connectionManager.maxConnections * 
  (multiplexer incoming data buffer size // is this configurable?
   + (muxer.maxInboundStreams * muxer.maxUnprocessedMessageQueueSize)
   + ((muxer.maxInboundStreams  + muxer.maxOutboundStreams) * muxer.maxUnprocessedMessageQueueSize)
  )

It's all really fiddly - I think it's best if users tune the limits to their particular application needs.

While it's critical that js-libp2p allow low-level specifics, I think in general it's easier for users if we can give them high-level knobs that they can reason about, and then plunge into more complexity if needed. We found this with the go-libp2p resource manager integration into Kubo. It was useful that Kubo could configure go-libp2p with a maxMemory default and pass that through to go-libp2p. Basically, as a library, yes we need to allow for high configurability, but we should also make the basic cases easy.

To be clear, I'm not advocating more work here other than being clear about how everything works. Each time I need to engage on this subject, it takes me 15-30 minutes to reboot, and that's after having a lot of historical context on this. I want to make sure all the info/context is present for someone in one place even if we're not providing simpler knobs now.

  2. Maybe add a FAQ entry about "Is there a way to limit the number of file descriptors?". We could say there isn't a way to set a hard limit on file descriptors used by js-libp2p, but setting connectionManager.maxConnections is a decent proxy. (This came to mind when thinking about differences with go-libp2p.)

  3. Close out on the DoS vector mentioned in #1421 (comment). At the minimum, let's create the backlog item.

  4. Add more clarity about connection limits per #1421 (comment)

doc/LIMITS.md Outdated

/**
* How much incoming data to buffer before resetting the stream
* How much incoming data to buffer while attempting to parse messages - peers sending many small messages in batches may cause this buffer to grow
Contributor


Let's be clear about the units here. Are we limiting the number of unprocessed messages in the queue, or the number of bytes of unprocessed messages?

Member Author


The maxUnprocessedMessageQueueSize is the incoming raw data that's yet to be processed into messages so it's bytes. I've updated the comment but am also open to better names!
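To illustrate the bytes-vs-messages distinction, here is a minimal, hypothetical sketch of the kind of byte-limited queue being described. It is not the actual mplex implementation; the class name and limit-exceeded behaviour are assumptions for illustration.

```javascript
// Illustration only: a queue that tracks unprocessed *bytes*, so the limit
// applies to the amount of raw data, not the number of queued messages.
class UnprocessedQueue {
  constructor (maxBytes) {
    this.maxBytes = maxBytes
    this.bytes = 0
    this.chunks = []
  }

  push (chunk) {
    if (this.bytes + chunk.length > this.maxBytes) {
      // In a real muxer, exceeding the limit would abort the stream/connection
      throw new Error('maxUnprocessedMessageQueueSize exceeded')
    }
    this.chunks.push(chunk)
    this.bytes += chunk.length // measured in bytes, not message count
  }

  shift () {
    const chunk = this.chunks.shift()
    if (chunk != null) {
      this.bytes -= chunk.length
    }
    return chunk
  }
}
```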

Contributor


Got it. I think a clearer name is maxUnprocessedMessageQueueBytes.

@BigLep
Contributor

BigLep commented Dec 2, 2022

@achingbrain : can we close this out? Let me know if you want to have a quick sync to finish it off to avoid back and forth.

@achingbrain
Member Author

Maybe add a FAQ entry about "Is there a way to limit the number of file descriptors?".

There's no FAQ as such yet, but I've added this as a section to the doc.
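As a rough sketch of using maxConnections as that proxy (option names as discussed in this thread; verify against your js-libp2p version):

```javascript
import { createLibp2p } from 'libp2p'

// Sketch: there is no hard file-descriptor limit, but capping connections
// bounds the number of sockets (and therefore descriptors) js-libp2p opens.
const node = await createLibp2p({
  connectionManager: {
    maxConnections: 200 // upper bound on simultaneously open connections
  }
})
```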

Close out on the DoS vector mentioned in #1421 (comment) . At the minimum lets create the backlog item.

The backlog item is #1508 and it's referenced from the doc now

Add more clarity about connection limits per #1421 (comment)

I've added the suggested change.

I think we should merge this now, any extra tweaks can be applied as further PRs?

Contributor

@BigLep BigLep left a comment


Looks good - thanks @achingbrain. I'm good with us merging. It would be great to see us follow up with #1508 and #1509 to close out the low-hanging fruit and make the limits a bit more robust.

@achingbrain achingbrain merged commit 2d984ed into master Dec 6, 2022
@achingbrain achingbrain deleted the docs/update-limits-doc branch December 6, 2022 12:01