Skip to content

Commit

Permalink
Merge pull request #75 from libp2p/marco/docs
Browse files Browse the repository at this point in the history
Add package docs
  • Loading branch information
MarcoPolo committed Aug 11, 2022
2 parents ba3117d + ab86eb7 commit 79dba76
Show file tree
Hide file tree
Showing 2 changed files with 175 additions and 84 deletions.
250 changes: 166 additions & 84 deletions p2p/host/resource-manager/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,28 +7,6 @@ The implementation is based on the concept of Resource Management
Scopes, whereby resource usage is constrained by a DAG of scopes,
accounting for multiple levels of resource constraints.

## Design Considerations

- The Resource Manager must account for basic resource usage at all
levels of the stack, from the internals to application components
that use the network facilities of libp2p.
- Basic resources include memory, streams, connections, and file
descriptors. These account for both space and time used by
the stack, as each resource has a direct effect on the system
availability and performance.
- The design must support seamless integration for user applications,
which should reap the benefits of resource management without any
changes. That is, existing applications should be oblivious of the
resource manager and transparently obtain limits which protect it
from resource exhaustion and OOM conditions.
- At the same time, the design must support opt-in resource usage
accounting for applications who want to explicitly utilize the
facilities of the system to inform about and constrain their own
resource usage.
- The design must allow the user to set its own limits, which can be
static (fixed) or dynamic.


## Basic Resources

### Memory
Expand All @@ -51,7 +29,7 @@ computational time) at the system level. They are also a scarce
resource, as typically (unless the user explicitly intervenes) they
are constrained by the system. Exhaustion of file descriptors may
render the application incapable of operating (e.g. because it is
unable to open a file), most importantly for libp2p because most
unable to open a file), this is important for libp2p because most
operating systems represent sockets as file descriptors.

### Connections
Expand Down Expand Up @@ -207,7 +185,7 @@ scope.
### User Transaction Scopes

User transaction scopes can be created as a child of any extant
resource scope, and provide the prgrammer with a delimited scope for
resource scope, and provide the programmer with a delimited scope for
easy resource accounting. Transactions may form a tree that is rooted
to some canonical scope in the scope DAG.

Expand All @@ -230,65 +208,6 @@ limits for the system and transient scopes, default and specific
limits for services, protocols, and peers, and limits for connections
and streams.

## Examples

Here we consider some concrete examples that can ellucidate the abstract
design as described so far.

### Stream Lifetime

Let's consider a stream and the limits that apply to it.
When the stream scope is first opened, it is created by calling
`ResourceManager.OpenStream`.

Initially the stream is constrained by:
- the system scope, where global hard limits apply.
- the transient scope, where unnegotiated streams live.
- the peer scope, where the limits for the peer at the other end of the stream
apply.

Once the protocol has been negotiated, the protocol is set by calling
`StreamManagementScope.SetProtocol`. The constraint from the
transient scope is removed and the stream is now constrained by the
protocol instead.

More specifically, the following constraints apply:
- the system scope, where global hard limits apply.
- the peer scope, where the limits for the peer at the other end of the stream
apply.
- the protocol scope, where the limits of the specific protocol used apply.

The existence of the protocol limit allows us to implicitly constrain
streams for services that have not been ported to the resource manager
yet. Once the programmer attaches a stream to a service by calling
`StreamScope.SetService`, the stream resources are aggregated and constrained
by the service scope in addition to its protocol scope.

More specifically the following constraints apply:
- the system scope, where global hard limits apply.
- the peer scope, where the limits for the peer at the other end of the stream
apply.
- the service scope, where the limits of the specific service owning the stream apply.
- the protcol scope, where the limits of the specific protocol for the stream apply.


The resource transfer that happens in the `SetProtocol` and `SetService`
gives the opportunity to the resource manager to gate the streams. If
the transfer results in exceeding the scope limits, then a error
indicating "resource limit exceeded" is returned. The wrapped error
includes the name of the scope rejecting the resource acquisition to
aid understanding of applicable limits. Note that the (wrapped) error
implements `net.Error` and is marked as temporary, so that the
programmer can handle by backoff retry.

## Usage

This package provides a limiter implementation that applies fixed limits:
```go
limiter := NewFixedLimiter(limits)
```
The `limits` allows fine-grained control of resource usage on all scopes.

### Scaling Limits

When building software that is supposed to run on many different kind of machines,
Expand All @@ -312,7 +231,8 @@ memory to libp2p.

For convenience, the `ScalingLimitConfig` also provides an `AutoScale` method,
which determines the amount of memory and file descriptors available on the
system, and dedicates up to 1/8 of the memory and 1/2 of the file descriptors to libp2p.
system, and dedicates up to 1/8 of the memory and 1/2 of the file descriptors to
libp2p.

For example, one might set:
```go
Expand Down Expand Up @@ -354,6 +274,147 @@ this would result in a limit of 1256 file descriptors for the system scope.
Note that we only showed the configuration for the system scope here, equivalent
configuration options apply to all other scopes as well.

### Default limits

By default the resource manager ships with some reasonable scaling limits and
makes a reasonable guess at how much system memory you want to dedicate to the
go-libp2p process. For the default definitions see `DefaultLimits` and
`ScalingLimitConfig.AutoScale()`.

### Tweaking Defaults

If the defaults seem mostly okay, but you want to adjust one facet you can do
simply copy the default struct object and update the field you want to change. You can
apply changes to a `BaseLimit`, `BaseLimitIncrease`, and `LimitConfig` with
`.Apply`.

Example
```
// An example on how to tweak the default limits
tweakedDefaults := DefaultLimits
tweakedDefaults.ProtocolBaseLimit.Apply(BaseLimit{
Streams: 1024,
StreamsInbound: 512,
StreamsOutbound: 512,
})
```

### How to tune your limits

Once you've set your limits and monitoring (see [Monitoring](#monitoring) below) you can now tune your
limits better. The `blocked_resources` metric will tell you what was blocked
and for what scope. If you see a steady stream of these blocked requests it
means your resource limits are too low for your usage. If you see a rare sudden
spike, this is okay and it means the resource manager protected you from some
anamoly.

### How to disable limits

Sometimes disabling all limits is useful when you want to see how much
resources you use during normal operation. You can then use this information to
define your initial limits. Disable the limits by using `InfiniteLimits`.

### Debug "resource limit exceeded" errors

These errors occur whenever a limit is hit. For example you'll get this error if
you are at your limit for the number of streams you can have, and you try to
open one more.

If you're seeing a lot of "resource limit exceeded" errors take a look at the
`blocked_resources` metric for some information on what was blocked. Also take
a look at the resources used per stream, and per protocol (the Grafana
Dashboard is ideal for this) and check if you're routinely hitting limits or if
these are rare (but noisy) spikes.

When debugging in general, in may help to search your logs for errors that match
the string "resource limit exceeded" to see if you're hitting some limits
routinely.

## Monitoring

Once you have limits set, you'll want to monitor to see if you're running into
your limits often. This could be a sign that you need to raise your limits
(your process is more intensive than you originally thought) or that you need
fix something in your application (surely you don't need over 1000 streams?).

There are OpenCensus metrics that can be hooked up to the resource manager. See
`obs/stats_test.go` for an example on how to enable this, and `DefaultViews` in
`stats.go` for recommended views. These metrics can be hooked up to Prometheus
or any other OpenCensus supported platform.

There is also an included Grafana dashboard to help kickstart your
observability into the resource manager. Find more information about it at
`./obs/grafana-dashboards/README.md`.

## Allowlisting multiaddrs to mitigate eclipse attacks

If you have a set of trusted peers and IP addresses, you can use the resource
manager's [Allowlist](./docs/allowlist.md) to protect yourself from eclipse
attacks. The set of peers in the allowlist will have their own limits in case
the normal limits are reached. This means you will always be able to connect to
these trusted peers even if you've already reached your system limits.

Look at `WithAllowlistedMultiaddrs` and its example in the GoDoc to learn more.

## Examples

Here we consider some concrete examples that can ellucidate the abstract
design as described so far.

### Stream Lifetime

Let's consider a stream and the limits that apply to it.
When the stream scope is first opened, it is created by calling
`ResourceManager.OpenStream`.

Initially the stream is constrained by:
- the system scope, where global hard limits apply.
- the transient scope, where unnegotiated streams live.
- the peer scope, where the limits for the peer at the other end of the stream
apply.

Once the protocol has been negotiated, the protocol is set by calling
`StreamManagementScope.SetProtocol`. The constraint from the
transient scope is removed and the stream is now constrained by the
protocol instead.

More specifically, the following constraints apply:
- the system scope, where global hard limits apply.
- the peer scope, where the limits for the peer at the other end of the stream
apply.
- the protocol scope, where the limits of the specific protocol used apply.

The existence of the protocol limit allows us to implicitly constrain
streams for services that have not been ported to the resource manager
yet. Once the programmer attaches a stream to a service by calling
`StreamScope.SetService`, the stream resources are aggregated and constrained
by the service scope in addition to its protocol scope.

More specifically the following constraints apply:
- the system scope, where global hard limits apply.
- the peer scope, where the limits for the peer at the other end of the stream
apply.
- the service scope, where the limits of the specific service owning the stream apply.
- the protcol scope, where the limits of the specific protocol for the stream apply.


The resource transfer that happens in the `SetProtocol` and `SetService`
gives the opportunity to the resource manager to gate the streams. If
the transfer results in exceeding the scope limits, then a error
indicating "resource limit exceeded" is returned. The wrapped error
includes the name of the scope rejecting the resource acquisition to
aid understanding of applicable limits. Note that the (wrapped) error
implements `net.Error` and is marked as temporary, so that the
programmer can handle by backoff retry.

## Usage

This package provides a limiter implementation that applies fixed limits:
```go
limiter := NewFixedLimiter(limits)
```
The `limits` allows fine-grained control of resource usage on all scopes.

## Implementation Notes

- The package only exports a constructor for the resource manager and
Expand All @@ -366,3 +427,24 @@ configuration options apply to all other scopes as well.
pointer to a generic resource scope.
- Peer and Protocol scopes, which may be created in response to
network events, are periodically garbage collected.

## Design Considerations

- The Resource Manager must account for basic resource usage at all
levels of the stack, from the internals to application components
that use the network facilities of libp2p.
- Basic resources include memory, streams, connections, and file
descriptors. These account for both space and time used by
the stack, as each resource has a direct effect on the system
availability and performance.
- The design must support seamless integration for user applications,
which should reap the benefits of resource management without any
changes. That is, existing applications should be oblivious of the
resource manager and transparently obtain limits which protect it
from resource exhaustion and OOM conditions.
- At the same time, the design must support opt-in resource usage
accounting for applications who want to explicitly utilize the
facilities of the system to inform about and constrain their own
resource usage.
- The design must allow the user to set its own limits, which can be
static (fixed) or dynamic.
9 changes: 9 additions & 0 deletions p2p/host/resource-manager/limit.go
Original file line number Diff line number Diff line change
@@ -1,3 +1,12 @@
/*
Package rcmgr is the resource manager for go-libp2p. This allows you to track
resources being used throughout your go-libp2p process. As well as making sure
that the process doesn't use more resources than what you define as your
limits. The resource manager only knows about things it is told about, so it's
the responsibility of the user of this library (either go-libp2p or a go-libp2p
user) to make sure they check with the resource manager before actually
allocating the resource.
*/
package rcmgr

import (
Expand Down

0 comments on commit 79dba76

Please sign in to comment.