Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: Fix race condition for a read during (*dialScheduler).updateStaticPool() with a previous write on (*dialTask).resolve() #29203

Closed
wants to merge 1 commit into from

Conversation

qu0b
Copy link
Contributor

@qu0b qu0b commented Mar 8, 2024

This PR fixes a race condition found during testing with Antithesis

The full stack trace:

WARNING: DATA RACE
Read at 0x00c00053b540 by goroutine 4233:
  github.com/ethereum/go-ethereum/p2p.(*dialScheduler).updateStaticPool()
      github.com/ethereum/go-ethereum/p2p/dial.go:413 +0xa9
  github.com/ethereum/go-ethereum/p2p.(*dialScheduler).expireHistory.func1()
      github.com/ethereum/go-ethereum/p2p/dial.go:358 +0x71
  github.com/ethereum/go-ethereum/p2p.(*expHeap).expire()
      github.com/ethereum/go-ethereum/p2p/util.go:59 +0xda
  github.com/ethereum/go-ethereum/p2p.(*dialScheduler).expireHistory()
      github.com/ethereum/go-ethereum/p2p/dial.go:355 +0xa4
  github.com/ethereum/go-ethereum/p2p.(*dialScheduler).loop()
      github.com/ethereum/go-ethereum/p2p/dial.go:301 +0x1ed4
  github.com/ethereum/go-ethereum/p2p.newDialScheduler.func2()
      github.com/ethereum/go-ethereum/p2p/dial.go:181 +0x4f
Previous write at 0x00c00053b540 by goroutine 10573:
  github.com/ethereum/go-ethereum/p2p.(*dialTask).resolve()
      github.com/ethereum/go-ethereum/p2p/dial.go:517 +0x23e
  github.com/ethereum/go-ethereum/p2p.(*dialTask).run()
      github.com/ethereum/go-ethereum/p2p/dial.go:478 +0xf8
  github.com/ethereum/go-ethereum/p2p.(*dialScheduler).startDial.func1()
      github.com/ethereum/go-ethereum/p2p/dial.go:445 +0x38
Goroutine 4233 (running) created at:
  github.com/ethereum/go-ethereum/p2p.newDialScheduler()
      github.com/ethereum/go-ethereum/p2p/dial.go:181 +0x9ee
INFO [03-06|16:47:11.738] Looking for peers                        peercount=0 tried=0 static=2
  github.com/ethereum/go-ethereum/p2p.(*Server).setupDialScheduler()
      github.com/ethereum/go-ethereum/p2p/server.go:611 +0x528
  github.com/ethereum/go-ethereum/p2p.(*Server).Start()
      github.com/ethereum/go-ethereum/p2p/server.go:499 +0x8d5
  fmt.(*ss).doScanf()
      fmt/scan.go:1230 +0x40e
  fmt.Fscanf()
      fmt/scan.go:143 +0xdc
  fmt.Sscanf()
      fmt/scan.go:114 +0x184
  github.com/syndtr/goleveldb/leveldb/storage.fsParseName()
      github.com/syndtr/goleveldb@v1.0.1-0.20210819022825-2ae1ddf74ef7/leveldb/storage/file_storage.go:643 +0x1a
  github.com/syndtr/goleveldb/leveldb/storage.(*fileStorage).List()
      github.com/syndtr/goleveldb@v1.0.1-0.20210819022825-2ae1ddf74ef7/leveldb/storage/file_storage.go:458 +0x3fd
  fmt.Fscanf()
      fmt/scan.go:143 +0xdc
  fmt.Sscanf()
      fmt/scan.go:114 +0x184
  github.com/syndtr/goleveldb/leveldb/storage.fsParseName()
      github.com/syndtr/goleveldb@v1.0.1-0.20210819022825-2ae1ddf74ef7/leveldb/storage/file_storage.go:643 +0x1a
  github.com/syndtr/goleveldb/leveldb/storage.(*fileStorage).List()
      github.com/syndtr/goleveldb@v1.0.1-0.20210819022825-2ae1ddf74ef7/leveldb/storage/file_storage.go:458 +0x3fd
  github.com/syndtr/goleveldb/leveldb.(*DB).checkAndCleanFiles()
      github.com/syndtr/goleveldb@v1.0.1-0.20210819022825-2ae1ddf74ef7/leveldb/db_util.go:52 +0x288
  github.com/syndtr/goleveldb/leveldb.openDB()
      github.com/syndtr/goleveldb@v1.0.1-0.20210819022825-2ae1ddf74ef7/leveldb/db.go:136 +0x884
  github.com/syndtr/goleveldb/leveldb/storage.(*fileStorage).List()
      github.com/syndtr/goleveldb@v1.0.1-0.20210819022825-2ae1ddf74ef7/leveldb/storage/file_storage.go:458 +0x3fd
  fmt.Fscanf()
      fmt/scan.go:143 +0xdc
  fmt.Sscanf()
      fmt/scan.go:114 +0x184
  github.com/syndtr/goleveldb/leveldb/storage.fsParseName()
      github.com/syndtr/goleveldb@v1.0.1-0.20210819022825-2ae1ddf74ef7/leveldb/storage/file_storage.go:643 +0x1a
  github.com/syndtr/goleveldb/leveldb/storage.(*fileStorage).List()
      github.com/syndtr/goleveldb@v1.0.1-0.20210819022825-2ae1ddf74ef7/leveldb/storage/file_storage.go:458 +0x3fd
  fmt.Fscanf()
      fmt/scan.go:143 +0xdc
  fmt.Sscanf()
      fmt/scan.go:114 +0x264
  github.com/syndtr/goleveldb/leveldb/storage.fsParseName()
      github.com/syndtr/goleveldb@v1.0.1-0.20210819022825-2ae1ddf74ef7/leveldb/storage/file_storage.go:657 +0x18e
  fmt.(*ss).doScanf()
      fmt/scan.go:1230 +0x40e
  fmt.Fscanf()
      fmt/scan.go:143 +0xdc
  fmt.Sscanf()
      fmt/scan.go:114 +0x184
  github.com/syndtr/goleveldb/leveldb/storage.fsParseName()
      github.com/syndtr/goleveldb@v1.0.1-0.20210819022825-2ae1ddf74ef7/leveldb/storage/file_storage.go:643 +0x1a
  github.com/syndtr/goleveldb/leveldb/storage.(*fileStorage).List()
      github.com/syndtr/goleveldb@v1.0.1-0.20210819022825-2ae1ddf74ef7/leveldb/storage/file_storage.go:458 +0x3fd
  fmt.Fscanf()
      fmt/scan.go:143 +0xdc
  fmt.Sscanf()
      fmt/scan.go:114 +0x184
  github.com/syndtr/goleveldb/leveldb/storage.fsParseName()
      github.com/syndtr/goleveldb@v1.0.1-0.20210819022825-2ae1ddf74ef7/leveldb/storage/file_storage.go:643 +0x1a
  github.com/syndtr/goleveldb/leveldb/storage.(*fileStorage).List()
      github.com/syndtr/goleveldb@v1.0.1-0.20210819022825-2ae1ddf74ef7/leveldb/storage/file_storage.go:458 +0x3fd
  github.com/syndtr/goleveldb/leveldb.(*DB).recoverJournal()
      github.com/syndtr/goleveldb@v1.0.1-0.20210819022825-2ae1ddf74ef7/leveldb/db.go:482 +0xa6
  github.com/syndtr/goleveldb/leveldb.openDB()
      github.com/syndtr/goleveldb@v1.0.1-0.20210819022825-2ae1ddf74ef7/leveldb/db.go:131 +0x86d
  github.com/syndtr/goleveldb/leveldb.Open()
      github.com/syndtr/goleveldb@v1.0.1-0.20210819022825-2ae1ddf74ef7/leveldb/db.go:203 +0x364
  fmt.Fscanf()
      fmt/scan.go:143 +0xdc
  fmt.Sscanf()
      fmt/scan.go:114 +0x184
  github.com/syndtr/goleveldb/leveldb/storage.fsParseName()
      github.com/syndtr/goleveldb@v1.0.1-0.20210819022825-2ae1ddf74ef7/leveldb/storage/file_storage.go:643 +0x1a
  github.com/syndtr/goleveldb/leveldb/storage.(*fileStorage).List()
      github.com/syndtr/goleveldb@v1.0.1-0.20210819022825-2ae1ddf74ef7/leveldb/storage/file_storage.go:458 +0x3fd
  github.com/syndtr/goleveldb/leveldb.(*session).recover.func1()
      github.com/syndtr/goleveldb@v1.0.1-0.20210819022825-2ae1ddf74ef7/leveldb/session.go:134 +0xbd
  runtime.deferreturn()
      runtime/panic.go:477 +0x30
  github.com/syndtr/goleveldb/leveldb.Open()
      github.com/syndtr/goleveldb@v1.0.1-0.20210819022825-2ae1ddf74ef7/leveldb/db.go:189 +0x124
  github.com/syndtr/goleveldb/leveldb.OpenFile()
      github.com/syndtr/goleveldb@v1.0.1-0.20210819022825-2ae1ddf74ef7/leveldb/db.go:225 +0x84
  github.com/ethereum/go-ethereum/p2p/enode.newPersistentDB()
      github.com/ethereum/go-ethereum/p2p/enode/nodedb.go:100 +0xcd
  github.com/ethereum/go-ethereum/p2p/enode.OpenDB()
      github.com/ethereum/go-ethereum/p2p/enode/nodedb.go:84 +0x67
  github.com/ethereum/go-ethereum/p2p.(*Server).setupLocalNode()
      github.com/ethereum/go-ethereum/p2p/server.go:516 +0x60c
  github.com/ethereum/go-ethereum/p2p.(*Server).Start()
      github.com/ethereum/go-ethereum/p2p/server.go:486 +0x864
  github.com/ethereum/go-ethereum/node.(*Node).openEndpoints()
      github.com/ethereum/go-ethereum/node/node.go:269 +0x188
  github.com/ethereum/go-ethereum/node.(*Node).Start()
      github.com/ethereum/go-ethereum/node/node.go:177 +0x139
  github.com/ethereum/go-ethereum/cmd/utils.StartNode()
      github.com/ethereum/go-ethereum/cmd/utils/cmd.go:78 +0x33
  main.startNode()
      github.com/ethereum/go-ethereum/cmd/geth/main.go:355 +0x94
  main.geth()
      github.com/ethereum/go-ethereum/cmd/geth/main.go:343 +0x15a
  github.com/urfave/cli/v2.(*Command).Run()
      github.com/urfave/cli/v2@v2.25.7/command.go:274 +0x13f4
  github.com/urfave/cli/v2.(*App).RunContext()
      github.com/urfave/cli/v2@v2.25.7/app.go:332 +0x1249
  github.com/urfave/cli/v2.(*App).Run()
      github.com/urfave/cli/v2@v2.25.7/app.go:309 +0x71
  main.main()
      github.com/ethereum/go-ethereum/cmd/geth/main.go:270 +0x1c
Goroutine 10573 (running) created at:
  github.com/ethereum/go-ethereum/p2p.(*dialScheduler).startDial()
      github.com/ethereum/go-ethereum/p2p/dial.go:444 +0x64a
  github.com/ethereum/go-ethereum/p2p.(*dialScheduler).startStaticDials()
      github.com/ethereum/go-ethereum/p2p/dial.go:404 +0x164
  github.com/ethereum/go-ethereum/p2p.(*dialScheduler).loop()
      github.com/ethereum/go-ethereum/p2p/dial.go:233 +0x164
  github.com/ethereum/go-ethereum/p2p.newDialScheduler.func2()
      github.com/ethereum/go-ethereum/p2p/dial.go:181 +0x4f
==================

@qu0b qu0b requested review from fjl and zsfelfoldi as code owners March 8, 2024 18:30
@karalabe
Copy link
Member

Thanks for reporting the race. The PR seems a bit handwavy, just adding a random lock is most probably not how this is supposed to be fixed. My guess is that there's a call order that's not supposed to happen and that needs to be fixed.

@karalabe
Copy link
Member

CC @fjl

@fjl
Copy link
Contributor

fjl commented Mar 11, 2024

This race was previously reported here: #23430
The cause of the issue, as described there, is that we need access to the node ID from the dialer mainloop goroutine, but the ID is stored within the dest node object, which can be updated.

@fjl
Copy link
Contributor

fjl commented Mar 11, 2024

Looking again, the issue is more complex, because we actually do want to some other node fields from the scheduler loop. So I now think a better fix would be using atomic.Pointer.

@qu0b
Copy link
Contributor Author

qu0b commented Mar 11, 2024

Should the atomic.Pointer be used only for the static peer mapping as the issue has only been seen there?

@qu0b
Copy link
Contributor Author

qu0b commented Mar 11, 2024

That would require an atomic pointer for every Enode

@fjl
Copy link
Contributor

fjl commented Mar 12, 2024

Recreated as #29235 because I can't push to this PR for some reason.

@fjl fjl closed this Mar 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants