Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ApplyBucketWork at per-bucket level #4314

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ThomasBrady
Copy link
Contributor

@ThomasBrady ThomasBrady commented May 14, 2024

Description

Refactors ApplyBucketsWork to apply buckets per-bucket rather than per-level. Deleted a lot of book keeping code referring to curr/snap apply ordering.

Issue: #4290

Checklist

  • Reviewed the contributing document
  • Rebased on top of master (no merge commits)
  • Ran clang-format v8.0.0 (via make format or the Visual Studio extension)
  • Compiles
  • Ran all tests
  • If change impacts performance, include supporting evidence per the performance document

@ThomasBrady ThomasBrady requested a review from SirTyson May 14, 2024 17:34
{
startLevel();
advance(isCurr ? "curr" : "snap", applicator);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume we still want log messages to print snap / curr to show more granular progress? If not I can remove this logic as well

Copy link
Contributor

@SirTyson SirTyson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, this doesn't work because advance is not being called properly. The way the work classes are designed, we do long running jobs in chunks to not hog resources. The expectation is that doWork() will be called periodically, do a little work, then return quickly.

BucketApplicator assumes this interface. If you look at BucketApplicator::advance, you'll see that it only applies LEDGER_ENTRY_BATCH_COMMIT_SIZE entries then exits. This means that, especially for large buckets, BucketApplicator::advance must be called many times before the Bucket finishes applying. The issue is this code only calls BucketApplicator::advance a single time before moving on to the next bucket.

If you look at the old ApplyBucketsWork, we maintained mFirstBucketApplicator and mSecondBucketApplicator pointers to persist Applicators across calls to doWork. We only move on to the next bucket when *mFirstBucketApplicator is false (note that this is pointer dereference, not checking if the pointer itself is null), indicating the applicator has reached EOF. You need to implement similar logic, where you maintain an applicator pointer and only advance to the next bucket when the applicator is false. Currently you check if applicator is true once then call apply, but you always advance to the next bucket unconditional, even if after the first call to advance applicator is still true.

For changes like this, in the future I'd recommend running a watcher node and catchup to the network in addition to the unit tests.

@ThomasBrady ThomasBrady requested a review from SirTyson May 15, 2024 23:21
Copy link
Contributor

@SirTyson SirTyson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me! Just a few nit pics. There's still some complexity involving mLevel and isCurr I'd like to see get factored out, but unfortunately it looks like this may be required given how we check invariants :(

std::unordered_set<LedgerKey> mSeenKeys;
std::vector<std::shared_ptr<Bucket>> mBucketsToIndex;
std::vector<std::shared_ptr<Bucket>> mBucketsToApply;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need both mBucketsToIndex and mBucketsToApply. These both hold the same contents and never change, so we should be able to condense them.

std::unordered_set<LedgerKey> mSeenKeys;
std::vector<std::shared_ptr<Bucket>> mBucketsToIndex;
std::vector<std::shared_ptr<Bucket>> mBucketsToApply;
std::unique_ptr<BucketApplicator> mBucketApplicator;
std::shared_ptr<Bucket> mBucket;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mBucket is not necessary. We maintain pointer references in mBucketsToApply already so there's no risk of dropping memory, and we maintain mBucketsToApplyIndex. We have like 6 different bucket related members floating around here, so reducing those that aren't needed would be helpful.

std::vector<std::shared_ptr<Bucket>> mBucketsToApply;
std::unique_ptr<BucketApplicator> mBucketApplicator;
std::shared_ptr<Bucket> mBucket;
std::size_t mBucketsToApplyIndex{0};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: If it compiles (which I think it should), we generally don't use std:: prefix on primitive types. If it breaks though definitely leave as is.

@@ -133,9 +107,12 @@ ApplyBucketsWork::doReset()
mAppliedSize = 0;
mLastAppliedSizeMb = 0;
mLastPos = 0;
mBucketsToApplyIndex = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Rename to mBucketToApplyIndex since we're only applying a singluar bucket at a time.

@ThomasBrady ThomasBrady force-pushed the 4290-refactor-apply-buckets-work branch from e859bda to ba9f43f Compare May 17, 2024 20:39
@SirTyson
Copy link
Contributor

SirTyson commented Jun 3, 2024

r+ ba9f43f

latobarita added a commit that referenced this pull request Jun 3, 2024
…-work

ApplyBucketWork at per-bucket level

Reviewed-by: SirTyson
@ThomasBrady ThomasBrady force-pushed the 4290-refactor-apply-buckets-work branch from ba9f43f to 545b721 Compare June 3, 2024 18:39
@SirTyson
Copy link
Contributor

SirTyson commented Jun 4, 2024

Trying again

r+ 545b721

@graydon
Copy link
Contributor

graydon commented Jun 4, 2024

@latobarita: retry

@graydon
Copy link
Contributor

graydon commented Jun 4, 2024

r+ 545b721

latobarita added a commit that referenced this pull request Jun 4, 2024
…-work

ApplyBucketWork at per-bucket level

Reviewed-by: graydon
latobarita added a commit that referenced this pull request Jun 4, 2024
…-work

ApplyBucketWork at per-bucket level

Reviewed-by: graydon
@ThomasBrady ThomasBrady force-pushed the 4290-refactor-apply-buckets-work branch from 545b721 to 2b4cb4a Compare June 4, 2024 23:52
@ThomasBrady
Copy link
Contributor Author

@latobarita: retry

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants