Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

connmgr: Concurrent Map Misuse #1847

Closed
p-shahi opened this issue Oct 31, 2022 · 5 comments · Fixed by #1860
Closed

connmgr: Concurrent Map Misuse #1847

p-shahi opened this issue Oct 31, 2022 · 5 comments · Fixed by #1860
Assignees
Labels
kind/bug A bug in existing code (including security flaws)

Comments

@p-shahi
Copy link
Member

p-shahi commented Oct 31, 2022

Description

Reported in: ipfs/kubo#9374 (comment)

When running Kubo 0.16.0 users report a crash.
There is a provided stacktrace:

WebUI: http://127.0.0.1:5001/webui
Gateway (readonly) server listening on /ip4/127.0.0.1/tcp/8080
Daemon is ready
fatal error: concurrent map iteration and map write

goroutine 222 [running]:
github.com/libp2p/go-libp2p/p2p/net/connmgr.peerInfos.SortByValueAndStreams.func1.1(0xc0179d00f0?)
        github.com/libp2p/go-libp2p@v0.23.2/p2p/net/connmgr/connmgr.go:267 +0x6a
github.com/libp2p/go-libp2p/p2p/net/connmgr.peerInfos.SortByValueAndStreams.func1(0x16b?, 0x21b?)
        github.com/libp2p/go-libp2p@v0.23.2/p2p/net/connmgr/connmgr.go:277 +0x127
sort.partition_func({0xc0119d9370?, 0xc003e967e0?}, 0x0, 0x2d4, 0x32?)
        sort/zsortfunc.go:157 +0x191

https://github.com/libp2p/go-libp2p/blob/master/p2p/net/connmgr/connmgr.go#L255

Versions affected

Kubo 0.16.0 uses go-libp2p v0.23.2
But this issue could have existed for some time.

Root Cause Analysis

<Once determined, fill RCA here>

Open Questions:

If it's valid that concurrent goroutines need to read peerInfos as it's being written to, shouldn't it be locked inside the sync read write mutex?

cm.plk.RUnlock()
// Sort peers according to their value.
candidates.SortByValueAndStreams(true)

cm.plk.RUnlock()
candidates.SortByValueAndStreams(true)

cm.plk.RUnlock()
if ncandidates < cm.cfg.lowWater {
log.Info("open connection count above limit but too many are in the grace period")
// We have too many connections but fewer than lowWater
// connections out of the grace period.
//
// If we trimmed now, we'd kill potentially useful connections.
return nil
}
// Sort peers according to their value.
candidates.SortByValueAndStreams(false)

@p-shahi p-shahi added kind/bug A bug in existing code (including security flaws) need/maintainer-input Needs input from the current maintainer(s) need/triage Needs initial labeling and prioritization labels Oct 31, 2022
@p-shahi
Copy link
Member Author

p-shahi commented Nov 8, 2022

Some more logs shared here: ipfs/kubo#9374 (comment)

@julian88110 @MarcoPolo would either of you like to take a look?
If the Kubo team wants this in 0.17.0 (ipfs/kubo#9319 (comment)) we may need to do another patch release

@julian88110
Copy link
Contributor

taking a look.

@BigLep
Copy link
Contributor

BigLep commented Nov 8, 2022

Thanks @p-shahi being on top of this. We would like to get this in the next (0.17) Kubo release. The RC is on Thursday, 2022-11-10. A patch here could come after the RC but before the final release on 2022-11-17.

@MarcoPolo
Copy link
Contributor

Let's sync up @julian88110 I started this earlier today.

@MarcoPolo
Copy link
Contributor

Synced up. I'll take.

@p-shahi p-shahi removed need/triage Needs initial labeling and prioritization need/maintainer-input Needs input from the current maintainer(s) labels Nov 8, 2022
@p-shahi p-shahi mentioned this issue Dec 5, 2022
34 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug A bug in existing code (including security flaws)
Projects
Archived in project
4 participants