storage/disk: bundles issue 4868 #4877

Conversation

@srenatus srenatus commented Jul 11, 2022

🚧 👷 PR for discussion purposes (#4868)

@@ -351,17 +347,20 @@ func (db *Store) Truncate(ctx context.Context, txn storage.Transaction, params s

// update symlink to point to the active db
symlink := filepath.Join(path.Dir(newDB.Opts().Dir), symlinkKey)
@srenatus commented:

I've included #4870 to unbreak my workflow, please disregard.

@srenatus srenatus left a comment


some comments inline

storage/disk/disk.go
return ret
}

return []file{f} // try it anyways
@srenatus commented:

That's the 🤞-it-might-still-work-if-we-try code path. We might still fail, but maybe we don't: the smallSize cutoff isn't based on any calculation, it's a guess.
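
For illustration, here's a minimal Go sketch of that cutoff-and-fallback shape -- the names (file, smallSize, breakUp) are placeholders, not the PR's actual code in bundle/file.go:

package splitsketch

// file stands in for a bundle file with its raw payload.
type file struct {
	path string
	data []byte
}

// smallSize is a guessed cutoff, not a calculated limit.
const smallSize = 1 << 20 // 1 MiB

// split returns f as-is when it's small enough, tries to break a large
// payload into pieces, and otherwise falls back to the single file: the
// write may still fit into one transaction, so we try it anyway.
func split(f file, breakUp func(file) ([]file, error)) []file {
	if len(f.data) <= smallSize {
		return []file{f}
	}
	if parts, err := breakUp(f); err == nil && len(parts) > 1 {
		return parts
	}
	return []file{f} // try it anyway
}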

bundle/file.go
Before, we failed to activate a bundle with a large data.json: we
attempted to load the whole value in a single write.

Now, we recognize when the value is larger than a threshold, and split
it into multiple blobs, which hopefully go in with a single txn.

Activating the bundle in question works, but re-activating it afterwards
still fails in a different place: eraseBundles accumulates too many writes
in a single txn when attempting to reset the data in the store.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
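
To make the splitting idea in that commit message concrete, a hedged Go sketch -- the threshold, the per-key paths, and the write callback are assumptions for illustration, not the real bundle/file.go or storage/disk code:

package blobsketch

import (
	"encoding/json"
	"fmt"
)

// blobThreshold is a guessed cutoff, in the same spirit as smallSize.
const blobThreshold = 1 << 20 // 1 MiB

// writeBlobs stores raw in one write when it's small enough; otherwise it
// splits the JSON object into one blob per top-level key. write stands in
// for the store's per-path write.
func writeBlobs(raw []byte, write func(path string, blob []byte) error) error {
	if len(raw) <= blobThreshold {
		return write("/", raw)
	}
	var obj map[string]json.RawMessage
	if err := json.Unmarshal(raw, &obj); err != nil {
		return write("/", raw) // not a JSON object: try the single write anyway
	}
	for k, v := range obj {
		if err := write("/"+k, v); err != nil {
			return fmt.Errorf("write %q: %w", k, err)
		}
	}
	return nil
}

Splitting on top-level keys keeps individual blobs smaller without changing the logical layout of the data; the real code would have to pick split points that line up with the store's partitioning.
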
@srenatus srenatus force-pushed the sr/storage/disk-bundles-issue-4868 branch 2 times, most recently from 917f5c5 to d6150b4 on July 12, 2022 09:22
Before, we accumulated too-large transactions by retrieving data keys
according to their partitioning, and deleting them one by one.

Now, we do the same thing, but in an iterator passed to Truncate, which
takes care of restarting transactions.

This also means we've got an extra call to Truncate: one Truncate
operation more than before. However, we already had one Truncate per
bundle. To fix this, we could combine multiple storage.Iterators into a
chaining iterator and do only one Truncate operation, but that's a future
optimization.

Signed-off-by: Stephan Renatus <stephan.renatus@gmail.com>
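
The "chaining iterator" mentioned above as a future optimization could look roughly like this -- the Update/Iterator shapes below are assumptions modelled on OPA's storage.Iterator (Next signalling exhaustion with io.EOF), not copied verbatim from the codebase:

package chainsketch

import "io"

// Update mirrors the shape of a single write/delete handed to Truncate.
type Update struct {
	Path   string
	Remove bool
	Value  []byte
}

// Iterator yields updates until it returns io.EOF.
type Iterator interface {
	Next() (*Update, error)
}

// chained drains each underlying iterator in turn, so one Truncate call
// could consume the deletions for every bundle in a single pass.
type chained struct {
	its []Iterator
}

func Chain(its ...Iterator) Iterator { return &chained{its: its} }

func (c *chained) Next() (*Update, error) {
	for len(c.its) > 0 {
		u, err := c.its[0].Next()
		if err == io.EOF {
			c.its = c.its[1:] // current iterator exhausted, move to the next
			continue
		}
		return u, err
	}
	return nil, io.EOF
}
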
@srenatus srenatus force-pushed the sr/storage/disk-bundles-issue-4868 branch from d6150b4 to 4577fad on July 12, 2022 09:38
@srenatus

💡 tests fail because the logic is wrong: calling Truncate where it's now called will lose the transaction previously written to -- i.e., the delta bundles' patches are applied, but never written to newDB. They sit in an open txn for oldDB, but that doesn't matter once truncate switches DBs.

I'll redo those bits. I think I'm at the point where I'm this close to no longer understanding how the current setup works 😅

@srenatus

TBC

@srenatus srenatus closed this Jul 18, 2022