
Data race in the delegating logger #1359

Closed
charith-elastic opened this issue Jan 25, 2021 · 4 comments · Fixed by #1361

@charith-elastic
Contributor

In DelegatingLogger, I think the Fulfill method needs to be protected by a lock as well. After upgrading to 0.8.1, our test suite reliably fails when run under the Go race detector. Once I added locking to the Fulfill method, the race detector was happy again.

I can raise a PR for the change but I am struggling to produce a simple test case that reliably triggers the issue.
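For illustration, this is roughly the shape of the change I had in mind. It is a stripped-down, hypothetical stand-in rather than the real deleg.go types; the only point is that Fulfill's write and the logging methods' reads go through the same RWMutex.

package logsketch

import (
    "sync"

    "github.com/go-logr/logr"
)

// delegatingLogger is a hypothetical stand-in for DelegatingLogger: it
// forwards calls to an inner logr.Logger that can be swapped out later.
type delegatingLogger struct {
    mu    sync.RWMutex
    inner logr.Logger
}

// Fulfill swaps in the real logger under the write lock, so it cannot race
// with in-flight Info calls.
func (d *delegatingLogger) Fulfill(actual logr.Logger) {
    d.mu.Lock()
    defer d.mu.Unlock()
    d.inner = actual
}

// Info takes the read lock before touching the delegate.
func (d *delegatingLogger) Info(msg string, keysAndValues ...interface{}) {
    d.mu.RLock()
    defer d.mu.RUnlock()
    if d.inner != nil {
        d.inner.Info(msg, keysAndValues...)
    }
}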

@alvaroaleman
Member

Do you have a sample stacktrace for this race?

@charith-elastic
Contributor Author

WARNING: DATA RACE
Read at 0x00c0001820c8 by goroutine 9:
  sigs.k8s.io/controller-runtime/pkg/log.(*DelegatingLogger).V()
      <autogenerated>:1 +0x54
  .../pkg/controller/common/watches.(*DynamicEnqueueRequest).AddHandler()
      .../pkg/controller/common/watches/handler.go:72 +0x33d

Previous write at 0x00c0001820c8 by goroutine 8:
  sigs.k8s.io/controller-runtime/pkg/log.(*loggerPromise).Fulfill()
      /home/cell/code/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.8.1/pkg/log/deleg.go:76 +0x11a
  sigs.k8s.io/controller-runtime/pkg/log.(*loggerPromise).Fulfill()
      /home/cell/code/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.8.1/pkg/log/deleg.go:81 +0x22f
  sigs.k8s.io/controller-runtime/pkg/log.(*DelegatingLogger).Fulfill()
      /home/cell/code/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.8.1/pkg/log/deleg.go:133 +0x12e
  sigs.k8s.io/controller-runtime/pkg/log.init.0.func1()
      /home/cell/code/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.8.1/pkg/log/log.go:67 +0xbb

Goroutine 9 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:1168 +0x5bb
  testing.runTests.func1()
      /usr/local/go/src/testing/testing.go:1439 +0xa6
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1123 +0x202
  testing.runTests()
      /usr/local/go/src/testing/testing.go:1437 +0x612
  testing.(*M).Run()
      /usr/local/go/src/testing/testing.go:1345 +0x3b3
  main.main()
      _testmain.go:77 +0x236

Goroutine 8 (finished) created at:
  sigs.k8s.io/controller-runtime/pkg/log.init.0()
      /home/cell/code/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.8.1/pkg/log/log.go:62 +0x44

Adding a lock to Fulfill, as I suggested, causes all controller-runtime tests to fail with a deadlock. Because it's a recursive data structure, I'm struggling to reason about it. It could also be that the way we use the log package in our code is wrong. I'll keep investigating.
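To make the deadlock concern concrete, here is a deliberately tiny toy program (not the real promise structure): sync.Mutex is not reentrant, so if fulfilling a parent ends up re-acquiring the same lock while recursing into a child on the same goroutine, the second Lock() blocks forever and the Go runtime reports a deadlock.

package main

import (
    "fmt"
    "sync"
)

// promise is a toy stand-in for the recursive structure: fulfilling a parent
// also fulfills its children. Here the child (incorrectly) shares the
// parent's mutex, which is enough to reproduce the hang.
type promise struct {
    mu       *sync.Mutex
    children []*promise
}

func (p *promise) fulfill() {
    p.mu.Lock()
    defer p.mu.Unlock()
    for _, c := range p.children {
        c.fulfill() // same mutex, same goroutine: this Lock() never returns
    }
}

func main() {
    mu := &sync.Mutex{}
    parent := &promise{mu: mu, children: []*promise{{mu: mu}}}
    fmt.Println("fulfilling; expect the runtime to report a deadlock")
    parent.fulfill()
}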

@charith-elastic
Contributor Author

I think it is because of the init function that injects a null logger after 30 seconds.

func init() {
    // Init is blocking, so start a new goroutine
    go func() {
        time.Sleep(30 * time.Second)
        loggerWasSetLock.Lock()
        defer loggerWasSetLock.Unlock()
        if !loggerWasSet {
            Log.Fulfill(NullLogger{})
        }
    }()
}

We don't call log.SetLogger in our tests, so after 30 seconds the logger is replaced by the goroutine above. The methods of the logr.Logger embedded into DelegatingLogger (such as Info(...)) are not protected by a lock, so the race detector correctly reports the logger being written and read at the same time by different goroutines.
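For what it's worth, the smallest illustration of that interleaving I can sketch looks like this. It uses a hypothetical stand-in rather than the real DelegatingLogger (so it is not the reliable test case this issue still needs), but go run -race flags it: one goroutine overwrites the embedded logger while another keeps calling it.

package main

import "sync"

// fakeLogger stands in for the embedded logr.Logger.
type fakeLogger struct{ name string }

func (f *fakeLogger) Info(msg string) {}

// delegating mirrors the unguarded embedding: Fulfill overwrites the inner
// logger with no synchronization against concurrent Info calls.
type delegating struct {
    inner *fakeLogger
}

func (d *delegating) Fulfill(l *fakeLogger) { d.inner = l }       // unsynchronized write
func (d *delegating) Info(msg string)       { d.inner.Info(msg) } // unsynchronized read

func main() {
    d := &delegating{inner: &fakeLogger{name: "promise"}}

    var wg sync.WaitGroup
    wg.Add(2)

    // Stand-in for the init() goroutine: the real code fulfills only once,
    // after 30 seconds; looping here just makes the overlap easy to hit.
    go func() {
        defer wg.Done()
        for i := 0; i < 1000; i++ {
            d.Fulfill(&fakeLogger{name: "null"})
        }
    }()

    // Stand-in for test code that logs concurrently.
    go func() {
        defer wg.Done()
        for i := 0; i < 1000; i++ {
            d.Info("reconciling")
        }
    }()

    wg.Wait()
}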

I can't see a good solution for this. It feels like a tradeoff between extra memory usage and thread safety.

@alvaroaleman
Member

Well, I added locking around the logger replacement that is done via Fulfill in #1339; that is why I am confused as to why this still happens.
