Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix raft tls key rotation panic when rotation time in past #15156

Merged
merged 6 commits into from Apr 26, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
3 changes: 3 additions & 0 deletions changelog/15156.txt
@@ -0,0 +1,3 @@
```release-note:bug
rafft: fix Raft TLS key rotation panic that occurs if active key is more than 24 hours old
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: rafft

```
33 changes: 20 additions & 13 deletions vault/raft.go
Expand Up @@ -473,27 +473,33 @@ func (c *Core) raftTLSRotatePhased(ctx context.Context, logger hclog.Logger, raf
return errors.New("no active raft TLS key found")
}

getNextRotationTime := func(next time.Time) time.Time {
now := time.Now()

// active key's CreatedTime + raftTLSRotationPeriod might be in
// the past (meaning it is ready to be rotated) which will cause
// NewTicker to panic when used with time.Until, prevent this by
// pushing out rotation time into very near future
if next.Before(now) {
return now.Add(1 * time.Minute)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine, though I think it's not quite what the behaviour was before, right? Prior to the breaking change, I assume a negative next would be treated as "immediately", not "now+1m"?

Copy link
Contributor Author

@ccapurso ccapurso Apr 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true but with NewTicker I think all you can really do is try to be as close to "immediately" as possible. The reason for adding 1m was just to try to add enough buffer to prevent the negative NewTicker issue when using time.Until. It needs to be some time in the future in other words.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reduced the padding to be 10 seconds. I'll admit that the 1 minute was a bit aggressive.

}

// push out to ensure proposed time does not elapse
return next.Add(10 * time.Second)
}

// Start the process in a go routine
go func() {
nextRotationTime := activeKey.CreatedTime.Add(raftTLSRotationPeriod)
nextRotationTime := getNextRotationTime(activeKey.CreatedTime.Add(raftTLSRotationPeriod))

keyCheckInterval := time.NewTicker(1 * time.Minute)
defer keyCheckInterval.Stop()

var backoff bool
// ticker is used to prevent memory leak of using time.After in
// for - select pattern.
ticker := time.NewTicker(time.Until(nextRotationTime))
defer ticker.Stop()
for {
// If we encountered and error we should try to create the key
// again.
if backoff {
nextRotationTime = time.Now().Add(10 * time.Second)
backoff = false
}

ticker.Reset(time.Until(nextRotationTime))
select {
case <-keyCheckInterval.C:
err := checkCommitted()
Expand All @@ -505,11 +511,12 @@ func (c *Core) raftTLSRotatePhased(ctx context.Context, logger hclog.Logger, raf
next, err := rotateKeyring()
if err != nil {
logger.Error("failed to rotate TLS key", "error", err)
backoff = true
continue
nextRotationTime = time.Now().Add(10 * time.Second)
} else {
nextRotationTime = getNextRotationTime(next)
}
ncabatoff marked this conversation as resolved.
Show resolved Hide resolved

nextRotationTime = next
ticker.Reset(time.Until(nextRotationTime))

case <-stopCh:
return
Expand Down