Cleanup and simplify lock usage in database plugin #15944
Conversation
Following up from discussions in #15923 and #15933, I wanted to split out a separate PR that drastically reduces the complexity of the databaseBackend lock usage. We no longer need it at all for the `credRotationQueue`, and we can move it to be used solely in a few small connections-map management functions.
```diff
-db.Close()
+connections := b.connClear()
+for _, db := range connections {
+	go db.Close()
```
couldn't help but slip this in :)
I have a couple of misgivings but overall I think this is a lot clearer than what it replaces.
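To make the shape of that change concrete, here is a minimal sketch of what a `connClear` helper like the one in the hunk above could look like. The type and field names (`dbPluginInstance`, `connections`, `connLock`) mirror the snippets in this conversation but are assumptions, not the exact Vault code: the point is only that the map swap happens under the lock, while the returned instances can be closed without holding it.

```go
package main

import "sync"

// dbPluginInstance is a stand-in for the plugin connection type.
type dbPluginInstance struct {
	name string
	id   string
}

func (d *dbPluginInstance) Close() error { return nil }

type databaseBackend struct {
	connLock    sync.Mutex
	connections map[string]*dbPluginInstance
}

// connClear swaps out the connections map under the lock and returns the
// old map, so the caller can Close() each connection outside the lock.
func (b *databaseBackend) connClear() map[string]*dbPluginInstance {
	b.connLock.Lock()
	defer b.connLock.Unlock()
	old := b.connections
	b.connections = make(map[string]*dbPluginInstance)
	return old
}

func main() {
	b := &databaseBackend{connections: map[string]*dbPluginInstance{
		"a": {name: "a", id: "1"},
		"b": {name: "b", id: "2"},
	}}
	old := b.connClear()
	if len(old) != 2 {
		panic("expected both connections back")
	}
	if len(b.connections) != 0 {
		panic("expected the live map to be empty")
	}
}
```

Because `connClear` returns the whole old map, the `go db.Close()` loop in the diff never races with concurrent map writers.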
```go
	database: dbw,
	name:     name,
	id:       id,
})
if oldConn != nil {
	oldConn.Close()
}

err = storeConfig(ctx, req.Storage, name, config)
```
Are we breaking any assumptions anywhere by calling storeConfig without a lock now?
As far as I can tell, no. I don't think there is anything unsafe about multiple calls to storeConfig, other than only one of them would win (which is also true even with a lock).
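For context, the snippet above implies a put-style helper that hands back the displaced connection so it can be closed outside the lock. The following is a hypothetical sketch of such a helper; `connPut`, `dbPluginInstance`, and the field names are assumptions based on the diff, not the exact Vault code.

```go
package main

import "sync"

// dbPluginInstance is a stand-in for the plugin connection type.
type dbPluginInstance struct {
	name string
	id   string
}

type databaseBackend struct {
	connLock    sync.Mutex
	connections map[string]*dbPluginInstance
}

// connPut stores dbi under name and returns whatever connection (if any) it
// replaced, so the caller can close the old one without holding the lock.
func (b *databaseBackend) connPut(name string, dbi *dbPluginInstance) *dbPluginInstance {
	b.connLock.Lock()
	defer b.connLock.Unlock()
	old := b.connections[name]
	b.connections[name] = dbi
	return old
}

func main() {
	b := &databaseBackend{connections: make(map[string]*dbPluginInstance)}
	if old := b.connPut("pgx", &dbPluginInstance{name: "pgx", id: "1"}); old != nil {
		panic("expected no previous connection")
	}
	if old := b.connPut("pgx", &dbPluginInstance{name: "pgx", id: "2"}); old == nil || old.id != "1" {
		panic("expected to get the displaced connection back")
	}
}
```

This keeps the lock's critical section down to the map read and write, which is why `storeConfig` can run without it.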
builtin/logical/database/backend.go
Outdated
```diff
-delete(b.connections, db.name)
+mapDB := b.connPop(db.name)
+if mapDB != nil && db.id != mapDB.id {
+	// oops, put it back
```
I'm not confident this is safe. Can you add another pop method that takes an id as well so we don't have this dance?
I can do that. I didn't love this either, but I was trying to preserve the original intent.
```go
credRotationQueue *queue.PriorityQueue
cancelQueue       context.CancelFunc
// queueCtx is the context for the priority queue
queueCtx context.Context
```
nit: If you generate the queue context in the same function where it's consumed, then you only really need to store the `Done()` channel (and the cancel func) on the struct. That will help stop this context getting abused for other uses by accident in future too.
Hmm. I think I prefer to keep the context rather than the `Done()` channel. I don't think there's as big a danger in abusing the context as there was in abusing the lock, since all the context (or `Done()` channel) can really do is be canceled.
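Either way, the cancellation wiring looks roughly the same. Here is a minimal sketch of the approach being discussed, assuming a backend that owns one background queue goroutine; the method name `initQueue` and the `stopped` channel are illustrative, not the actual Vault code.

```go
package main

import "context"

type databaseBackend struct {
	queueCtx    context.Context
	cancelQueue context.CancelFunc
}

// initQueue generates the queue context where it is consumed and keeps the
// cancel func on the struct so cleanup can stop the loop. It returns a
// channel that closes once the background goroutine has exited.
func (b *databaseBackend) initQueue() chan struct{} {
	b.queueCtx, b.cancelQueue = context.WithCancel(context.Background())
	stopped := make(chan struct{})
	go func() {
		// All the goroutine can observe from the context is cancellation,
		// which is the point being made above.
		<-b.queueCtx.Done()
		close(stopped)
	}()
	return stopped
}

func main() {
	b := &databaseBackend{}
	stopped := b.initQueue()
	b.cancelQueue()
	<-stopped // the goroutine exits once the context is canceled
}
```

Storing `queueCtx` versus storing only `queueCtx.Done()` differs mainly in what future code *could* do with the field, which is the crux of the nit.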
Just a few follow-up comments, but otherwise looks good :)
builtin/logical/database/backend.go
Outdated
```go
func (b *databaseBackend) connPopIfEqual(name, id string) *dbPluginInstance {
	b.connLock.Lock()
	defer b.connLock.Unlock()
	dbi := b.connections[name]
```
We might want to do an `ok` check here or `dbi.id` could panic if that doesn't exist.
Good call.
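A sketch of what `connPopIfEqual` could look like with the `ok` check folded in. (In Go, indexing a missing map key yields the zero value, `nil`, so it is the `dbi.id` dereference that would panic without the guard.) The surrounding types are assumptions matching the snippets in this thread, not the exact Vault code.

```go
package main

import "sync"

// dbPluginInstance is a stand-in for the plugin connection type.
type dbPluginInstance struct {
	name string
	id   string
}

type databaseBackend struct {
	connLock    sync.Mutex
	connections map[string]*dbPluginInstance
}

// connPopIfEqual removes and returns the named connection only when its id
// matches, replacing the pop-then-put-back dance from the earlier revision.
func (b *databaseBackend) connPopIfEqual(name, id string) *dbPluginInstance {
	b.connLock.Lock()
	defer b.connLock.Unlock()
	dbi, ok := b.connections[name]
	if ok && dbi.id == id {
		delete(b.connections, name)
		return dbi
	}
	return nil
}

func main() {
	b := &databaseBackend{connections: map[string]*dbPluginInstance{
		"pgx": {name: "pgx", id: "1"},
	}}
	if b.connPopIfEqual("pgx", "2") != nil {
		panic("mismatched id should not pop")
	}
	if b.connPopIfEqual("missing", "1") != nil {
		panic("missing name should return nil, not panic")
	}
	if got := b.connPopIfEqual("pgx", "1"); got == nil || got.id != "1" {
		panic("matching id should pop the connection")
	}
}
```

The comma-ok form makes both the missing-key case and the id-mismatch case return `nil` without touching the map.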
LGTM
Co-authored-by: Calvin Leung Huang <1883212+calvn@users.noreply.github.com>
Thanks!
Add database plugin metrics around connections

This is a replacement for #15923 that takes into account recent lock cleanup. I went ahead and added back in the hanging plugin test, which I meant to add in #15944 but forgot.

I tested this by spinning up a statsd sink in the tests and verifying I got a stream of metrics:

```
$ nc -u -l 8125 | grep backend
test.swenson-Q9Q0L72D39.secrets.database.backend.connections.count.pgx.5.:1.000000|g
test.swenson-Q9Q0L72D39.secrets.database.backend.connections.count.pgx.5.:0.000000|g
test.swenson-Q9Q0L72D39.secrets.database.backend.connections.count.pgx.5.:1.000000|g
test.swenson-Q9Q0L72D39.secrets.database.backend.connections.count.pgx.5.:0.000000|g
```

We have to rework the shared gauge code to work without a full `ClusterMetricSink`, since we don't have access to the core metrics from within a plugin. This only reports metrics every 10 minutes by default, but it solves some problems we would have had with the gauge values becoming stale and needing to be re-sent.

Co-authored-by: Tom Proctor <tomhjp@users.noreply.github.com>