Stateless witness prefetcher changes #29519

Open · karalabe wants to merge 7 commits into master

Conversation

karalabe (Member):

Supersedes #29035 because the OP didn't permit modifications from maintainers...

	case ch := <-sf.copy:
		// Somebody wants a copy of the current trie, grant them
		ch <- sf.db.CopyTrie(sf.trie)

	case <-sf.stop:
		// Termination is requested, abort and leave remaining tasks
		// Termination is requested, abort
karalabe (Member Author):

Changing the comment is nice, but the code doesn't reflect it :P

The code should check whether sf.tasks is nil and, if it isn't, keep looping until it becomes nil. Otherwise we run the risk of receiving a last task and immediately closing down, with the close being executed first (remember, select branch evaluation is non-deterministic if multiple channels are ready).
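For illustration, a minimal sketch of the drain-before-stop pattern I mean (the wake/tasks fields and the process helper are made up for the example, not the actual prefetcher code):

```go
package prefetcher

import "sync"

// subfetcher is stripped down to just the fields the shutdown pattern needs.
type subfetcher struct {
	lock  sync.Mutex
	tasks [][]byte      // keys scheduled but not yet fetched
	wake  chan struct{} // signals that new tasks were scheduled
	stop  chan struct{} // signals that termination was requested
}

func (sf *subfetcher) loop() {
	for {
		select {
		case <-sf.wake:
			// New tasks arrived, pull them out and preload them.
			sf.lock.Lock()
			tasks := sf.tasks
			sf.tasks = nil
			sf.lock.Unlock()
			process(tasks)

		case <-sf.stop:
			// Termination requested. Select evaluation is non-deterministic,
			// so a final wake may have lost the race against stop: keep
			// draining queued tasks until none remain before returning.
			for {
				sf.lock.Lock()
				tasks := sf.tasks
				sf.tasks = nil
				sf.lock.Unlock()
				if len(tasks) == 0 {
					return
				}
				process(tasks)
			}
		}
	}
}

// process stands in for the actual trie preloading.
func process(tasks [][]byte) {}
```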

func (p *triePrefetcher) used(owner common.Hash, root common.Hash, used [][]byte) {
	if p.closed {
		return
	}
karalabe (Member Author):

These are IMO not good changes. They make things harder to reason about, and close becomes this magic thing that nukes the prefetcher offline. I'm not sure that's the intended case, since close is also the thing that waits for the data to be finished. So we need to figure out what close does: kill it, or wait on it.

return nil
}
return sf.db.CopyTrie(sf.trie)
return nil
karalabe (Member Author):

I don't really see the point of this change: it makes peek useless after close, but close is the thing that waits for all the data to be loaded, so it's kind of ... weird.

// abort interrupts the subfetcher immediately. It is safe to call abort multiple
// times but it is not thread safe.
func (sf *subfetcher) abort() {
// close waits for the subfetcher to finish its tasks. It cannot be called multiple times
karalabe (Member Author):

But it can be called multiple times. Close might also not be the best name, since we're waiting for it to finish but should, AFAIK, not kill the thing.

}

case ch := <-sf.copy:
karalabe (Member Author):

I'm unsure about this code path here with the rewrite. Do we want to allow retrieval from a live prefetcher? If yes, why? Perhaps for tx boundaries? We should really document somewhere why, and whether, it's needed. It's a very specific use case.

Contributor:

When we call updateTrie on a state object, we attempt to source the trie from the prefetcher. So copying from a live prefetcher is used here to preserve that functionality.
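For reference, a self-contained sketch of that flow; the types and method names here are simplified stand-ins, not the real core/state code:

```go
package state

import "errors"

// Trie and triePrefetcher are minimal stand-ins for the sketch.
type Trie interface{}

type triePrefetcher struct{}

// trie returns a copy of whatever the (possibly still running) prefetcher
// has warmed up for this account, or nil if it has nothing.
func (p *triePrefetcher) trie() Trie { return nil }

type stateObject struct {
	prefetcher *triePrefetcher
}

// openFromDisk stands in for the slow path that opens the storage trie
// directly from the database.
func (s *stateObject) openFromDisk() (Trie, error) {
	return nil, errors.New("not implemented in this sketch")
}

// updateTrie prefers a copy from the live prefetcher and only falls back to
// a disk read if none is available; this is the functionality the
// copy-from-a-live-prefetcher path preserves.
func (s *stateObject) updateTrie() (Trie, error) {
	if s.prefetcher != nil {
		if tr := s.prefetcher.trie(); tr != nil {
			return tr, nil
		}
	}
	return s.openFromDisk()
}
```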

karalabe (Member Author) left a comment:

I think one issue that the PR does not address, but must, is what the new lifecycle of the prefetcher is. Previously it was just something we threw data at, and then at some point we aborted it, pulled out every useful piece of data it pre-loaded, and built our stuff on top.

The new logic seems to push it towards a witness model where we wait for all data to be loaded before pulling and operating on it. But the code doesn't seem to reflect that, with many paths instead becoming duds after a close.

Either this PR is only half the code needed, with the part that actually uses the prefetcher like this still missing, or something's kind of borked. Either way, we must define what the intended behavior is, and both document it and make sure the prefetcher adheres to it.

I'm kind of wondering whether close is needed at all; rather, we should have a wait method which just ensures everything is loaded. Whether we're between txs or at block end, waiting for prefetching to finish makes sense. I guess close might be needed to nuke the loop goroutine, but we should still have a wait then before peeking at stuff. Ah, I guess the "implicit" behavioral thing this PR is aiming for is that the prefetcher is not thread safe, so by the time we call peek, any scheduled data is already prefetched. I don't think that's the case; at the very least it's a dangerous data race to assume that events fired on 2 different channels will arrive in the exact order one expects. If this is the intended behavior, I'd rather make it ever so slightly more explicit than hope for a good order of events.

holiman (Contributor) commented Apr 15, 2024:

> I'm kind of wondering whether close is needed at all; rather, we should have a wait method which just ensures everything is loaded

As I see it, the prefetcher needs a couple of phases.

  1. Phase 1: open for scheduling. At this point, it accepts tasks to be fetched. Callers must not (cannot?) retrieve data from it at this point. When an external caller tells it to, it goes into
  2. Phase 2: No longer open for scheduling tasks. At this point, finishes all tasks, and once all tasks are done, it goes into
  3. Phase 3: (again, not open for scheduling tasks) At this point, callers can retrieve data from it.

Perhaps we need something more elaborate than this, but whatever we need, we would be well served by first jotting down the description in human language before doing some lock/mutex/channel-based implementation of "something".
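A rough, self-contained sketch of that phase split as an explicit state machine, just to pin the description down (all names hypothetical, not proposed API):

```go
package prefetcher

import (
	"errors"
	"sync"
)

type phase int

const (
	phaseScheduling phase = iota // Phase 1: accepts tasks, reads forbidden
	phaseResolving               // Phase 2: no new tasks, draining outstanding loads
	phaseReadable                // Phase 3: everything loaded, reads allowed
)

type prefetcher struct {
	mu      sync.Mutex
	phase   phase
	pending sync.WaitGroup
	data    map[string][]byte // whatever was loaded, keyed by trie path
}

func newPrefetcher() *prefetcher {
	return &prefetcher{data: make(map[string][]byte)}
}

// schedule queues keys for background loading. Only legal in phase 1.
func (p *prefetcher) schedule(keys []string) error {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.phase != phaseScheduling {
		return errors.New("prefetcher no longer accepts tasks")
	}
	for _, key := range keys {
		p.pending.Add(1)
		go func(k string) {
			defer p.pending.Done()
			v := loadFromDisk(k) // stand-in for the real trie read
			p.mu.Lock()
			p.data[k] = v
			p.mu.Unlock()
		}(key)
	}
	return nil
}

// wait moves the prefetcher through phase 2 into phase 3, blocking until all
// scheduled loads have completed.
func (p *prefetcher) wait() {
	p.mu.Lock()
	p.phase = phaseResolving
	p.mu.Unlock()

	p.pending.Wait()

	p.mu.Lock()
	p.phase = phaseReadable
	p.mu.Unlock()
}

// peek exposes the loaded data, but only once phase 3 has been reached.
func (p *prefetcher) peek(key string) ([]byte, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.phase != phaseReadable {
		return nil, errors.New("data not fully loaded yet")
	}
	return p.data[key], nil
}

// loadFromDisk stands in for the actual database/trie access.
func loadFromDisk(key string) []byte { return []byte(key) }
```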

karalabe (Member Author):

As I understand it, the difference between the old and new prefetcher is (should be) as follows:

  • Old pre-fetcher:

    • Purpose is to warm up the trie during execution, so that while we're crunching some EVM code, our disk is kept busy pulling in data that the hasher will need at the end.
    • All operations are async, running in the background; the most important thing is to never ever block. If we get more useful data, great; if less, that's life, but we should never hold up execution.
    • When execution reaches a boundary (IntermediateRoot pre-Byzantium; or Finalize after Byzantium), insta-terminate all pre-fetchers to avoid the main committer thread from racing for disk accesses. Whatever we managed to load will be used, the rest pulled on demand.
  • New pre-fetcher:

    • Purpose is to act as a witness constructor (write only for now) during execution, so that while we're crunching some EVM code, our disk is kept busy pulling in data that both the hasher and a cross-validator will need at the end.
    • Almost all operations are async, running in the background, the most important thing is to never ever block during EVM execution. However, on commit boundaries we have to switch to blocking mode, since the witness needs all data, not just whatever we loaded until that point in time.
    • When execution reaches a boundary (IntermediateRoot pre-Byzantium; or Finalize after Byzantium), wait for all pre-fetchers to finish. This will block the main committer thread, but ideally, if we're not loading junk, it should be all the same: the data needs to be loaded anyway to commit. For the witness, the data must be loaded before the tries are mutated (see the sketch after this list).
  • (Threading) Quirks:

    • Pre-Byzantium does an IntermediateRoot call between each transaction. A witness pre-fetcher for that block range must support stopping after a transaction, collecting the witness; then continuing against the next transaction, collecting witnesses from updated tries. This is significantly more complex from both a witness and a threading perspective, to have data across tries. Given that pre-Byzantium is ancient, it doesn't make sense to support it, but we need to very explicitly handle / reject that case, otherwise it's going to be "weird" trying to understand the code.
    • The old pre-fetcher was best-effort, with no guarantees on correctness (as to how much and what data it loaded). The new pre-fetcher needs to be correct to construct a proper witness, so sometimes blocking is necessary. That however means that code paths need to be re-thought, as we still want to maximise the main EVM execution pathways even whilst waiting for data. Particularly, when terminating a pre-fetcher (i.e. waiting), we should start integrating results from finished subfetchers before waiting for all storage tries to finish loading.
  • Questions:

    • Does slot mutation order make the witness different? I.e. if I change 3 slots in a contract (including delete/create), does the order of applying them change what trie nodes we need? Because if so, there might be a hidden step still needed during commit to add prefetch tasks (?)
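The sketch referenced above: how the boundary handling differs between the old and new approaches, assuming a hypothetical wait method (the StateDB shape and method names are stand-ins, not the real code):

```go
package state

// Stand-ins so the sketch compiles on its own; the real StateDB and
// prefetcher in core/state are far richer.
type prefetcherAPI interface {
	abort() // old: interrupt immediately, keep whatever was loaded
	wait()  // new (hypothetical): block until every scheduled load is done
}

type StateDB struct {
	prefetcher prefetcherAPI
}

func (s *StateDB) hashTries()      {} // stand-in for the actual trie hashing
func (s *StateDB) collectWitness() {} // stand-in for gathering witness nodes

// Old boundary behaviour: insta-terminate the prefetcher. Whatever it managed
// to load gets used, anything missing is pulled from disk on demand.
func (s *StateDB) intermediateRootOld() {
	if s.prefetcher != nil {
		s.prefetcher.abort()
		s.prefetcher = nil
	}
	s.hashTries()
}

// New boundary behaviour: block until all scheduled nodes are loaded, since a
// complete witness needs every pre-state node before the tries are mutated.
func (s *StateDB) intermediateRootNew() {
	if s.prefetcher != nil {
		s.prefetcher.wait()
	}
	s.collectWitness()
	s.hashTries()
}
```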

jwasinger (Contributor) commented Apr 15, 2024:

> Purpose is to act as a witness constructor (write only for now) during execution, so that while we're crunching some EVM code, our disk is kept busy pulling in data that both the hasher and a cross-validator will need at the end.

It's actually meant to gather witnesses for read values. In the stateless witness builder PR, I gather write witnesses from committing the tries.

But iirc, earlier on the call today you mentioned not tying the retrieval of write witnesses to the commit operation, which would change the assumptions from my original code.

karalabe force-pushed the stateless-witness-prefetcher-changes branch from 139448f to f5ec2e7 on May 3, 2024, 10:34
karalabe (Member Author) commented May 7, 2024:

(screenshot attached: 2024-05-07 at 09:22:54)

// if a prefetcher is available. This path is used if snapshots are unavailable,
// since that requires reading the trie *during* execution, when the prefetchers
// cannot yet return data.
func (s *stateObject) getTrie(skipPrefetcher bool) (Trie, error) {
karalabe (Member Author):

FWIW, skipPrefetcher is kind of an ugly hack; I just wanted to avoid the lack-of-snapshot path poking into the prefetcher. Open to cleaner suggestions.

karalabe added this to the 1.14.2 milestone on May 7, 2024
@@ -197,7 +216,7 @@ func (s *stateObject) GetCommittedState(key common.Hash) common.Hash {
 	// If the snapshot is unavailable or reading from it fails, load from the database.
 	if s.db.snap == nil || err != nil {
 		start := time.Now()
-		tr, err := s.getTrie()
+		tr, err := s.getTrie(true)
Member:

It's probably problematic?

  • The prefetcher is enabled if the associated snapshot layer is available
  • Retrievals from the snapshot can succeed if the requested item is covered
  • Retrievals from the snapshot can fail if the requested item is not covered yet (still under generation)
  • So the following scenario is possible: a few data retrievals through the snapshot succeed and some prefetching tasks get scheduled; then a retrieval fails and the prefetched trie is abandoned

The witness of the trie could be incomplete.
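To make the scenario concrete, a sketch of the flow being described, with stand-in types and names (not the actual GetCommittedState code):

```go
package state

// Minimal stand-ins so the sketch is self-contained; the real snapshot,
// prefetcher and state object types are much richer.
type snapshot interface {
	Storage(key []byte) ([]byte, error)
}

type stateObject struct {
	snap     snapshot
	prefetch func(key []byte)                 // schedules the key with the witness prefetcher
	readTrie func(key []byte) ([]byte, error) // direct trie read, bypassing the prefetcher
}

// getCommittedState illustrates the flow described above, not the actual code.
func (s *stateObject) getCommittedState(key []byte) ([]byte, error) {
	if s.snap != nil {
		if val, err := s.snap.Storage(key); err == nil {
			// Snapshot hit: the key is also scheduled with the prefetcher,
			// so its trie nodes end up in the witness.
			s.prefetch(key)
			return val, nil
		}
		// Snapshot miss or failure (e.g. the item is still under
		// generation): fall through to a direct trie read...
	}
	// ...which bypasses the prefetcher, so this node is read for execution
	// but never scheduled for the witness. With earlier keys scheduled and
	// this one not, the witness for the trie can end up incomplete.
	return s.readTrie(key)
}
```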
