Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: remove legacy orphans background #863

Closed
wants to merge 2 commits into from

Conversation

cool-develope
Copy link
Collaborator

@cool-develope cool-develope commented Dec 22, 2023

Closes: #862

Context

  • refactor removing the legacy orphans in background.
  • add the mutex in BatchWithFlusher to avoid the race condition

@cool-develope cool-develope requested a review from a team as a code owner December 22, 2023 20:32
Copy link

coderabbitai bot commented Dec 22, 2023

Walkthrough

The updates involve refactoring the pruning logic of a Go application's V1. The changes focus on improving the handling of orphan nodes by introducing a new method to delete them and altering the existing pruning function to work asynchronously. This approach aims to enhance performance by preventing the main thread I/O from being blocked, and it also adds the flexibility to manually trigger the pruning process via an exposed API.

Changes

File(s) Summary of Changes
nodedb.go - Added time package import
- Added deleteOrphans method
- Modified deleteLegacyVersions to use a goroutine and invoke deleteOrphans
batch.go - Introduced sync.Mutex field mtx in the BatchWithFlusher struct
- Modified Set and Delete methods to acquire and release the mutex for thread safety

Assessment against linked issues

Objective Addressed Explanation
Refactor the pruning logic to handle orphan nodes more efficiently and allow manual triggering (Issue #862) The addition of deleteOrphans and the modification of deleteLegacyVersions to work asynchronously with a goroutine align with the objective of improving efficiency and allowing manual operation.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on X ?


Tips

Chat with CodeRabbit Bot (@coderabbitai)

  • You can directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
  • You can tag CodeRabbit on specific lines of code or entire files in the PR by tagging @coderabbitai in a comment. Examples:
  • You can tag @coderabbitai in a PR comment and ask questions about the PR and the codebase. Use quoted replies to pass the context for follow-up questions. Examples:
    • @coderabbitai render interesting statistics about this repository as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai generate unit tests for the src/utils.ts file.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • The JSON schema for the configuration file is available here.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

CodeRabbit Discord Community

Join our Discord Community to get help, request features, and share feedback.

Comment on lines +474 to +478
go func() {
if err := ndb.deleteOrphans(); err != nil {
ndb.logger.Error("failed to clean legacy orphans", "err", err)
}
}()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of a goroutine to call deleteOrphans is a good approach to avoid blocking the main thread. However, error handling within the goroutine should be improved. Currently, if an error occurs, it is only logged. Consider adding a mechanism to handle these errors more robustly, such as a channel to communicate errors back to the main thread or a callback.

Comment on lines +491 to +502
count := 0
for ; itr.Valid(); itr.Next() {
if err := ndb.batch.Delete(itr.Key()); err != nil {
return err
}

// Sleep for a while to avoid blocking the main thread i/o.
count++
if count > 1000 {
count = 0
time.Sleep(100 * time.Millisecond)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sleep mechanism in deleteOrphans is implemented to avoid blocking the main thread I/O by sleeping after every 1000 deletions. This is a simple rate-limiting strategy, but it may not be optimal. Consider using a more dynamic approach, such as monitoring the system's load or I/O wait time, to adjust the sleep duration or the number of operations between sleeps.

@cool-develope cool-develope changed the title remove legacy orphans background feat: remove legacy orphans background Dec 22, 2023
@ValarDragon
Copy link
Contributor

ValarDragon commented Dec 22, 2023

Is this concurrency safe with main thread commits? (This can add to deleting while the main thread is committing right?)

Do we need a mutex to ensure the two don't try to add to batch at the same time?

@cool-develope
Copy link
Collaborator Author

Is this concurrency safe with main thread commits? (This can add to deleting while the main thread is committing right?)

Do we need a mutex to ensure the two don't try to add to batch at the same time?

yeah, the test is failed with data racing

Comment on lines 49 to 57
// the batch is flushed to disk, cleared, and a new one is created with buffer pre-allocated to threshold.
// The addition entry is then added to the batch.
func (b *BatchWithFlusher) Set(key, value []byte) error {
b.mtx.Lock()
defer b.mtx.Unlock()

batchSizeAfter, err := b.estimateSizeAfterSetting(key, value)
if err != nil {
return err
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [49-81]

Consider protecting Write, WriteSync, Close, and GetByteSize methods with the mutex to prevent potential race conditions.

@ValarDragon
Copy link
Contributor

This is the wrong spot to put it, we hacked something that worked in this branch https://github.com/cosmos/iavl/tree/dev/v1.0.0-orphan

(the first part of the loop is what was expensive)

I think the way you were doing the code is cleaner, just that was something that I got to work. Plz feel free to takeover/move anything that helps 🙏

@cool-develope
Copy link
Collaborator Author

close in favor of #877

@tac0turtle tac0turtle deleted the fix/orphan_removing branch February 15, 2024 22:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor the pruning logic in V1
2 participants