Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

eth/tracers: avoid unsyncronized mutations on trie database #23632

Merged
merged 2 commits into from Sep 28, 2021

Conversation

holiman
Copy link
Contributor

@holiman holiman commented Sep 23, 2021

This might fix #23559.
The current code has several goroutine groups, for different purposes.

  1. Executors which trace chains. Even if those are disabled (replaced with a random timout), the error(s) still happen. So they're not the problem.
  2. One state-maker loop, which readies the state for the executors to trace on. This one also references/dereferences state.
  3. One result-catcher loop. This one spews data to the caller, and also dereferences nodes.

I think the root cause is the fact that a statedb.Commit, which is done on 2 inside the StateAt call writes to the trie.Database. While doing the actual insert, it does hold the trie mutex, but I suspect that the locking is too granular: if we are in the middle of a large commit when all of a sudden a dereference happens, then things might become br0ken.

So this PR moves the final dereferencing into 2, so all Reference/Dereference/Commit operations are done from one routine only.

With this fix, I haven't been able to repro the issues any more.

Further testing/verification would be appreciated, to find out if it's a correct fix or not.

@rjl493456442
Copy link
Member

rjl493456442 commented Sep 24, 2021

if we are in the middle of a large commit when all of a sudden a dereference happens, then things might become br0ken.

But the dereference operation should be blocked until the lock is released. And the delayed dereference shouldn't lead to missing trie node error.

EDIT: It indeed has data race. The issue is valid.

@holiman
Copy link
Contributor Author

holiman commented Sep 24, 2021

But the dereference operation should be blocked until the lock is released.

The Commit doesn't hold the lock, it's aqcuired only when it needs to do an insert or add a storage root reference.

Take a look at https://github.com/ethereum/go-ethereum/blob/master/trie/database.go#L325 . When we do the commit, and do an insert of a node n.

  • If the node is already there, we do nothing else.
  • Later on during the commit operation, we will add a new parent p (at least the state root)
  • which will increase the parent count of it's children (including the parent count of node n). But before we've done so, the parent count is untouched. Therefore, it's ripe for removal if a previous parent is dereffed
state 1
              R1
            /  \
           B     C
          /       \
         leaf1   leaf2
         
state 2:
         R2
         |
         C2
         |
         leaf2
         
state 3:

             R3
            /  \
           B     C
          /       \
         leaf1   leaf2

While committing `R3`, we do

insert(leaf1)
insert(b)
// Now, deref R1 happens
  - remove leaf1
  - remove B
  - remove R1
insert R3

So IIUC, it's only when we insert a new node that we recurse down the tree and update the refcounts. And until we've added a new node (e.g. the stateroot), the internal state (parent counts) of the db is kind of inconsistent.

@@ -337,10 +349,11 @@ func (api *API) traceChain(ctx context.Context, start, end *types.Block, config
}
// Prepare the statedb for tracing. Don't use the live database for
// tracing to avoid persisting state junks into the database.
statedb, err = api.backend.StateAtBlock(localctx, block, reexec, statedb, false)
if err != nil {
if _statedb, err := api.backend.StateAtBlock(localctx, block, reexec, statedb, false); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps rename the _statedb to s?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, is it necessary? if error is returned we will exit the for loop anyway. The statedb will not be used even it's set to nil.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, it isn't necessary. While investigating, I did use the statedb for outputting some more info, but that code is gone now, so I can revert that change

Copy link
Member

@rjl493456442 rjl493456442 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just some nitpicks.

@holiman holiman merged commit 3531ca2 into ethereum:master Sep 28, 2021
sidhujag pushed a commit to syscoin/go-ethereum that referenced this pull request Sep 28, 2021
…#23632)

This PR fixes an issue in traceChain, where the statedb Commit operation was performed asynchronously with dereference-operations agains the underlying trie.Database instance. Due to how the reference counting works within the trie database (where parent count is recursively updated when new parents are added), doing dereferencing in the middle of Commit can cause the refcount to become wrong, leading to an inconsistent state. 

This was fixed by doing Commit/Deref from the same routine.
@holiman holiman deleted the tracechain_fixup branch November 10, 2021 18:32
maoueh pushed a commit to streamingfast/go-ethereum that referenced this pull request Jan 31, 2022
* eth/tracers: implement debug.intermediateRoots (ethereum#23594)

This PR implements a new debug method, which I've talked briefly about to some other client developers. It allows the caller to obtain the intermediate state roots for a block (which might be either a canon block or a 'bad' block).
Signed-off-by: wenbiao <delweng@gmail.com>

* core, rpc: disable memory output by default in traces (ethereum#23558)

* core: cmd: invert disableMemory

* core: fix missed inversion

* cmd/evm: preserve Flags but change default value

* Apply suggestions from code review

Co-authored-by: Martin Holst Swende <martin@swende.se>

Co-authored-by: Martin Holst Swende <martin@swende.se>
Signed-off-by: wenbiao <delweng@gmail.com>

* eth/tracers: abort evm execution when trace is aborted (ethereum#23580)

Signed-off-by: wenbiao <delweng@gmail.com>

* eth/tracers: avoid unsyncronized mutations on trie database (ethereum#23632)

This PR fixes an issue in traceChain, where the statedb Commit operation was performed asynchronously with dereference-operations agains the underlying trie.Database instance. Due to how the reference counting works within the trie database (where parent count is recursively updated when new parents are added), doing dereferencing in the middle of Commit can cause the refcount to become wrong, leading to an inconsistent state. 

This was fixed by doing Commit/Deref from the same routine.  
Signed-off-by: wenbiao <delweng@gmail.com>

* core,eth: call frame tracing (ethereum#23087)

This change introduces 2 new optional methods; `enter()` and `exit()` for js tracers, and makes `step()` optiona. The two new methods are invoked when entering and exiting a call frame (but not invoked for the outermost scope, which has it's own methods). Currently these are the data fields passed to each of them:

    enter: type (opcode), from, to, input, gas, value
    exit: output, gasUsed, error

The PR also comes with a re-write of the callTracer. As a backup we keep the previous tracing script under the name `callTracerLegacy`. Behaviour of both tracers are equivalent for the most part, although there are some small differences (improvements), where the new tracer is more correct / has more information.

Signed-off-by: wenbiao <delweng@gmail.com>

* eth/tracers: re-write of 4byte tracer using enter/exit (ethereum#23622)

* eth/tracers: add re-write of 4byte tracer using enter/exit

* eth/tracers: fix 4byte indent
Signed-off-by: wenbiao <delweng@gmail.com>

* eth/tracers: tx.BaseFee not implemented

Signed-off-by: wenbiao <delweng@gmail.com>

* eth/tracers: do the JSON serialization via .js to capture C faults

Signed-off-by: wenbiao <delweng@gmail.com>

* eth/tracers: fix callTracer fault handling (ethereum#23667)

* eth/tracers: fix calltracer fault handling

* eth/tracers: fix calltracer indentation
Signed-off-by: wenbiao <delweng@gmail.com>

* eth/tracers: invoke enter/exit on 0-value calls to inex accounts (ethereum#23828)

Signed-off-by: wenbiao <delweng@gmail.com>

* eth: make traceChain avoid OOM on long-running tracing (ethereum#23736)

This PR changes long-running chain tracing, so that it at some points releases the memory trie db, and switch over to a fresh disk-backed trie.
Signed-off-by: wenbiao <delweng@gmail.com>

* eth/tracers: expose contextual infos (block hash, tx hash, tx index)

Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

* eth/tracers: redefine Context

Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

* eth/tracers: support for golang tracers + add golang callTracer (ethereum#23708)

* eth/tracers: add basic native loader

* eth/tracers: add GetResult to tracer interface

* eth/tracers: add native call tracer

* eth/tracers: fix call tracer json result

* eth/tracers: minor fix

* eth/tracers: fix

* eth/tracers: fix benchTracer

* eth/tracers: test native call tracer

* eth/tracers: fix

* eth/tracers: rm extra make

Co-authored-by: Martin Holst Swende <martin@swende.se>

* eth/tracers: rm extra make

* eth/tracers: make callFrame private

* eth/tracers: clean-up and comments

* eth/tracers: add license

* eth/tracers: rework the model a bit

* eth/tracers: move tracecall tests to subpackage

* cmd/geth: load native tracers

* eth/tracers: minor fix

* eth/tracers: impl stop

* eth/tracers: add native noop tracer

* renamings

Co-authored-by: Martin Holst Swende <martin@swende.se>

* eth/tracers: more renamings

* eth/tracers: make jstracer non-exported, avoid cast

* eth/tracers, core/vm: rename vm.Tracer to vm.EVMLogger for clarity

* eth/tracers: minor comment fix

* eth/tracers/testing: lint nitpicks

* core,eth: cancel evm on nativecalltracer stop

* Revert "core,eth: cancel evm on nativecalltracer stop"

This reverts commit 01bb908.

* eth/tracers: linter nits

* eth/tracers: fix output on err

Co-authored-by: Martin Holst Swende <martin@swende.se>
Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

* eth/tracers: make native calltracer default (ethereum#23867)

Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

* eth/tracers: package restructuring (ethereum#23857)

* eth/tracers: restructure tracer package

* core/vm/runtime: load js tracers

* eth/tracers: mv bigint js code to own file

* eth/tracers: add method docs for native tracers

* eth/tracers: minor doc fix

* core,eth: cancel evm on nativecalltracer stop

* core/vm: fix failing test

Co-authored-by: Sina Mahmoodi <itz.s1na@gmail.com>
Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

* eth/tracers: ethapi.TransactionArgs was not merged

Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

* eth/tracers: fix the api_test with ErrInsufficientFunds to ErrInsufficientFundsForTransfer

Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

* eth/tracers: check posa before statedb.Prepare in IntermiateRoots api

Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

* eth/tracers: make js calltracer default, compatible with old version

Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

* eth/tracers: fix the default callTrace name of callTracerJs

Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

* Revert "eth/tracers: fix the default callTrace name of callTracerJs"

This reverts commit 62a3bc215d9f07e422a4c659289bb3ba4f9ed2fa.

Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

* Revert "eth/tracers: make js calltracer default, compatible with old version"

This reverts commit 85ef42c0ea651f0b228d4209b1b2598b24e12f1f.

Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

* eth/tracers: fix the variable race condition

Signed-off-by: wenbiao <wenbiao.zheng@ambergroup.io>

Co-authored-by: Martin Holst Swende <martin@swende.se>
Co-authored-by: Marius van der Wijden <m.vanderwijden@live.de>
Co-authored-by: Sina Mahmoodi <1591639+s1na@users.noreply.github.com>
Co-authored-by: Péter Szilágyi <peterke@gmail.com>
Co-authored-by: Sina Mahmoodi <itz.s1na@gmail.com>
yongjun925 pushed a commit to DODOEX/go-ethereum that referenced this pull request Dec 3, 2022
…#23632)

This PR fixes an issue in traceChain, where the statedb Commit operation was performed asynchronously with dereference-operations agains the underlying trie.Database instance. Due to how the reference counting works within the trie database (where parent count is recursively updated when new parents are added), doing dereferencing in the middle of Commit can cause the refcount to become wrong, leading to an inconsistent state. 

This was fixed by doing Commit/Deref from the same routine.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

inconsistent JS tracer fails
2 participants