Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IncomingConnectionsLimit limits short lived connections #5766

Open
urtho opened this issue Oct 1, 2023 · 7 comments
Open

IncomingConnectionsLimit limits short lived connections #5766

urtho opened this issue Oct 1, 2023 · 7 comments

Comments

@urtho
Copy link

urtho commented Oct 1, 2023

Status

IncomingConnectionsLimit param is designed to limit longed lived connections (websocket) only.
With the current implementation is rejects requests for catchups and blocks on gossip port.
Also makes it impossible to monitor relay health if IncomingConnectionsLimit is reached.

Expected

IncomingConnectionsLimit should limit concurrent websocket connections only.

@ohill
Copy link
Contributor

ohill commented Oct 13, 2023

I believe the IncomingConnectionsLimit is also there to limit and provision resource usage in general, for example to ensure you have enough file descriptors at startup time in daemon/algod/server.go. Allowing unlimited connections on the gossip port could result in exhausting the FD limit, or using too much memory, while serving many concurrent non-websocket HTTP requests.

To support monitoring when IncomingConnectionsLimit is reached, I could imagine having two tiers of limits, instead of just one. However since even a higher "max connections" vs "max websocket connections" limit could still be exhausted, perhaps another approach would be some simple authentication (e.g. using an HTTP header) to let a monitoring check bypass the IncomingConnectionsLimit.

Also, since websockets, block, and catchpoint file requests all use different URLs, these different limits and authorization techniques for each type of request could probably also be implemented inside a reverse HTTP proxy. I don't know if many node operators run behind proxies.

In all I'm not sure it is a good idea to allow unlimited parallel HTTP requests, and preserving a general capability to apply connection limits seems important.

@urtho
Copy link
Author

urtho commented Oct 13, 2023

IncomingConnectionsLimit is set as an Accept socket limit on gossip port - so TCP stack rejects any requests when gossip is already at IncomingConnectionsLimit - including catchup/block/health

There are separate checks later in the code path that make sure IncomingConnectionsLimit is not exceeded.
My request is to set the limit on the gossip socket Accept to wn.config.IncomingConnectionsLimit+wn.config.RestConnectionsSoftLimi

The REST/Websocket limits are enforced in other places.
Here is the fix:

4024245

@ohill
Copy link
Contributor

ohill commented Oct 16, 2023

But the REST API doesn't use the NetAddress ("gossip") port, it has its own port (EndpointAddress, implemented in daemon/algod/api/server and provided with a RejectingLimitListener in daemon/algod/server.go)? AFAIK the websocket port doesn't share any resources with the REST API port?

@ohill
Copy link
Contributor

ohill commented Oct 16, 2023

It is confusing because the NetAddress port is used to serve the gossip HTTP path (/v1/{genesisID}/gossip), as well as HTTP paths for requesting blocks (/v1/{genesisID}/block/X) and catchpoint files (/v1/{genesisID}/ledger/X) during catchup or fast catchup, and uses the configuration IncomingConnectionsLimit. But the REST API (EndpointAddress) is another HTTP service on a different port altogether and uses the RestConnectionsSoftLimit/HardLimit configuration to implement the Swagger API spec.

@ohill
Copy link
Contributor

ohill commented Oct 16, 2023

I think you are maybe thinking along the lines of having separate limits for /v1/genesisID/gossip vs /v1/genesisID/{block,ledger}/X? Even with that new limit, it could still get maxed out ... so you might benefit instead from some kind of auth scheme for your monitoring agent? Just trying to think out loud about what would be an effective solution for your situation

@urtho
Copy link
Author

urtho commented Oct 17, 2023

I do not care about the REST API /v2/ endpoints. Only solving a relay issue here.

IncomingConnectionsLimit is supposed to limit long lived connections and this websocket connections limit (up to IncomingConnectionsLimit) is enforced later in the code path.

The same socket serves short lived connections - if not for catchups and blocks then for stuff like the AF relay monitoring.

If I try to protect my relay against an overload by long lived gossip/websocket connections I also make it unavailable to the AF monitoring and basic catchup/block functions.

This might not be critical now but once relays go P2P users might want to set the limit low but still allow others to bootstrap off off them while the gossip limiter is in place.

So either change the description for IncomingConnectionsLimit to ANY connections on gossip port or add extra room on the socket Accept to allow for vital short lived functions and make IncomingConnectionsLimit indeed limit long lived connections only.

@ohill
Copy link
Contributor

ohill commented Oct 18, 2023

You are right that the config description is wrong, IncomingConnectionsLimit is the limit for all incoming connections to NetAddress — just updated to make the description better match the name in #5789.

You could also consider adding a separate limit like IncomingGossipConnectionsLimit or something to provide extra room like you propose, or the idea I proposed for authenticating monitoring requests (in case you saturate the short-lived connections too).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants