This repository has been archived by the owner on May 26, 2022. It is now read-only.

Memory Leak for short lived connections #293

Closed
dennis-tra opened this issue Nov 17, 2021 · 2 comments

dennis-tra commented Nov 17, 2021

Hi everyone,

I just observed a memory leak in my network crawler application. Here are some screenshots of a heap dump:

[Screenshots of the heap dump, taken 2021-11-17 at 15:42:00, 15:42:17, and 16:09:25]

I have attached the heap dump to this issue as well (see below).

I'm not super proficient at reading these dumps, but libp2p-swarm is at the very top of the chain, which is why I'm posting the issue in this repository. I can see that the allocations actually happen a little further down the chain; however, I presume I need to release resources through interfaces provided by libp2p-swarm.

The part of my application that is relevant to the memory leak periodically tries to dial nodes in the network. These connections are very short-lived, and I suspect I'm not releasing resources properly. Unfortunately, I don't know how to do that.

I'm copying over the code path where the dials take place. The original can be found here.

func (d *Dialer) handleDialJob(ctx context.Context, pi peer.AddrInfo) Result {
	logEntry := log.WithFields(...)

	// Initialize dial result
	dr := Result{...}

retryLoop:
	for retry := 0; retry < 3; retry++ {
		// Add peer information to peer store so that DialPeer can pick it up from there
		// Do this in every retry due to the TTL of one minute
		d.host.Peerstore().AddAddrs(pi.ID, pi.Addrs, time.Minute)

		// Actually dial the peer
		if err := d.dial(ctx, pi.ID); err != nil {
			dr.Error = err
			dr.DialError = db.DialError(dr.Error)

			// some error handling logic .... (no return statements here that would skip the `ClosePeer` call below, but some `break retryLoop`s)
			continue retryLoop
		}

		// Dial was successful - reset error
		dr.Error = nil
		dr.DialError = ""

		break retryLoop
	}

	// Close established connection
	if err := d.host.Network().ClosePeer(pi.ID); err != nil {
		logEntry.WithError(err).Warnln("Could not close connection to peer")
	}

	return dr
}

func (d *Dialer) dial(ctx context.Context, peerID peer.ID) error {
	if _, err := d.host.Network().DialPeer(ctx, peerID); err != nil {
		return err
	}
	return nil
}

At the end I'm calling ClosePeer to close the connections associated with that peer and that's where I called it a day. This doesn't seem to suffice unfortunately :/

Is there anything else I'm missing to close/release?


That's the heap dump: heap.zip

@marten-seemann
Contributor

Looks like it could be libp2p/go-tcp-transport#81? That issue would have been fixed by the v0.2.4 release.

@dennis-tra
Author

Ah yup, could be 👍 I'm still on 0.2.3. I'll give that a shot and will reopen this issue in case the problem persists 🤞
