-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Optimize finding newer active deployments for device using MongoDB Aggregation #922
base: master
Are you sure you want to change the base?
Conversation
…oDB aggregation This commit optimizes a database query using the MongoDB aggregation framework to improve performance and reduce resource usage and network data transfer. The changes include a more efficient aggregation pipeline to fetch data, resulting in faster query execution and less data transferred overall. Changelog: All Ticket: None
Merging these commits will result in the following changelog entries: Changelogsdeployments (optimize-find-active-deployments-using-aggregation)New changes in deployments since master: Features
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @BenMeehan! 👋
Thank you for the contribution, and apologies for the long delay.
This is certainly an overlooked performance optimization, so thanks for taking your time to fix it.
There are some minor errors in the way the empty documents results are handled, I added some suggestions for fixing and also simplifying the implementation a little bit.
To make the CI happy, you also need to sign off your commit, simply run git commit --amend --sign-off
and (force-) push the updated commit message.
Once you've amended the commits I will take care of adjusting the tests.
Cheers! 🍻
@@ -42,6 +42,7 @@ import ( | |||
"github.com/mendersoftware/deployments/store" | |||
"github.com/mendersoftware/deployments/store/mongo" | |||
"github.com/mendersoftware/deployments/utils" | |||
mdriver "go.mongodb.org/mongo-driver/mongo" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As an (internal) developer guideline, the application layer should not import the database driver of the database layer (store
).
mdriver "go.mongodb.org/mongo-driver/mongo" |
if len(deployments) == 0 { | ||
deployment, err := d.db.FindOldestActiveDeploymentForDevice(ctx, deviceID, lastDeployment) | ||
if err != nil { | ||
if err == mdriver.ErrNoDocuments { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This error should be checked for inside the mongo
package. Also, mdriver.Collection.Aggregate
will never return this error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See suggestion below.
// use an aggregation pipeline to fetch only the oldest active deployment | ||
pipeline := []bson.M{ | ||
{ | ||
"$match": bson.M{ | ||
StorageKeyDeploymentCreated: bson.M{"$gt": createdAfter}, | ||
StorageKeyDeploymentActive: true, | ||
StorageKeyDeploymentDeviceList: bson.M{"$in": []string{deviceID}}, | ||
}, | ||
}, | ||
{ | ||
"$sort": bson.M{StorageKeyDeploymentCreated: 1}, | ||
}, | ||
{ | ||
"$limit": 1, | ||
}, | ||
} | ||
|
||
var deployment *model.Deployment | ||
|
||
cursor, err := c.Aggregate(ctx, pipeline) | ||
if err != nil { | ||
return nil, errors.Wrap(err, "failed to get deployments") | ||
} | ||
defer cursor.Close(ctx) | ||
|
||
for cursor.Next(ctx) { | ||
if err := cursor.Decode(&deployment); err != nil { | ||
return nil, errors.Wrap(err, "failed to get deployments") | ||
} | ||
} | ||
|
||
if err := cursor.Err(); err != nil { | ||
return nil, errors.Wrap(err, "failed to get deployments") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggested simplification (requires import "fmt"
):
// use an aggregation pipeline to fetch only the oldest active deployment | |
pipeline := []bson.M{ | |
{ | |
"$match": bson.M{ | |
StorageKeyDeploymentCreated: bson.M{"$gt": createdAfter}, | |
StorageKeyDeploymentActive: true, | |
StorageKeyDeploymentDeviceList: bson.M{"$in": []string{deviceID}}, | |
}, | |
}, | |
{ | |
"$sort": bson.M{StorageKeyDeploymentCreated: 1}, | |
}, | |
{ | |
"$limit": 1, | |
}, | |
} | |
var deployment *model.Deployment | |
cursor, err := c.Aggregate(ctx, pipeline) | |
if err != nil { | |
return nil, errors.Wrap(err, "failed to get deployments") | |
} | |
defer cursor.Close(ctx) | |
for cursor.Next(ctx) { | |
if err := cursor.Decode(&deployment); err != nil { | |
return nil, errors.Wrap(err, "failed to get deployments") | |
} | |
} | |
if err := cursor.Err(); err != nil { | |
return nil, errors.Wrap(err, "failed to get deployments") | |
} | |
var deployment *model.Deployment | |
err := c.FindOne(ctx, bson.M{ | |
StorageKeyDeploymentCreated: bson.M{"$gt": createdAfter}, | |
StorageKeyDeploymentActive: true, | |
StorageKeyDeploymentDeviceList: bson.M{"$in": []string{deviceID}}, | |
}, mopts.FindOne().SetSort(bson.M{StorageKeyDeploymentCreated: 1})). | |
Decode(&deployment) | |
if errors.Is(err, mongo.ErrNoDocuments) { | |
return nil, nil | |
} else if err != nil { | |
return nil, fmt.Errorf( | |
"failed to lookup latest deployment for device %q: %w", | |
deviceID, err, | |
) | |
} |
It's important that the mongo.ErrNoDocuments
is caught correctly.
return nil, nil, nil | ||
} | ||
return nil, nil, errors.Wrap(err, | ||
"Failed to search for newer active deployments") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following from suggestion above.
} | |
} else if deployment == nil { | |
return nil, nil, nil | |
} |
Ohh, and one more thing! We use semantic commits to track and group changelogs, so could you please also update the commit message to use the Suggestionperf: Optimize query for finding next deployment for device Cheers! |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Optimize finding newer active deployments for device using MongoDB aggregation
Noticed a lot of data egress when a device checks for an update, especially if the deployment size is large.
This commit optimizes a database query using the MongoDB aggregation framework to improve performance and reduce resource usage and network data transfer. The changes include a more efficient aggregation pipeline to fetch data, resulting in faster query execution and less data transferred overall.
Changelog: All
Ticket: None