inspect: add inspect mode for debugging crashed tendermint node #6785
Codecov Report
@@            Coverage Diff             @@
##           master    #6785      +/-   ##
==========================================
- Coverage   62.88%   62.66%   -0.22%
==========================================
  Files         307      309       +2
  Lines       40464    40564     +100
==========================================
- Hits        25447    25421      -26
- Misses      13231    13343     +112
- Partials     1786     1800      +14
This pull request introduces 1 alert when merging e2a6522 into 9a2a7d4 - view on LGTM.com new alerts:
I like the approach here and the clean up!!
fbaf7d4 to 03ee71e
Great Work!
I think just some minor touch-ups / linting are required before this can be merged.
As a high-level question: what would happen if I had a node running and inspected it at the same time? Would it work as expected, or error? My guess is that this depends on whether the db supports additional read-only connections, right?
Yeah, this depends on how the DB manages its storage files. I'm not yet sure users will want to run this at the same time as a node: the node already provides RPC, so this would be somewhat redundant. I tried running a node with each DB type alongside the corresponding inspect command, one by one.
When trying this right now:

Error reported by goleveldb:
./build/tendermint inspect
ERROR: failed to initialize database: resource temporarily unavailable

Error reported by cleveldb:
./build/tendermint inspect --db-backend cleveldb
ERROR: failed to initialize database: IO error: lock /home/william/.tendermint/data/blockstore.db/LOCK: Resource temporarily unavailable

badgerdb:
./build/tendermint inspect --db-backend badgerdb
ERROR: failed to initialize database: Cannot acquire directory lock on "/home/william/.tendermint/data/blockstore". Another process is using this Badger database.: resource temporarily unavailable

boltdb: hangs trying to initialize the db connection

rocksdb:
./build/tendermint inspect --db-backend rocksdb
ERROR: failed to initialize database: IO error: While lock file: /home/william/.tendermint/data/blockstore.db/LOCK: Resource temporarily unavailable
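The "resource temporarily unavailable" text in most of the errors above is what a refused non-blocking file lock typically looks like on Linux. The following is a minimal sketch of that mechanism only, assuming a Unix-like system and an advisory `flock` on a `LOCK` file; it is not the locking code of any of the listed backends, and `tryExclusiveLock`, `isLockHeld`, and `demoLockConflict` are hypothetical names introduced for illustration. The boltdb hang would be consistent with a blocking rather than non-blocking lock attempt, though that is not confirmed in this thread.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// tryExclusiveLock opens path and attempts a non-blocking exclusive flock.
// The returned file must stay open for as long as the lock should be held.
// Hypothetical helper; Unix-like systems only.
func tryExclusiveLock(path string) (*os.File, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0o644)
	if err != nil {
		return nil, err
	}
	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX|syscall.LOCK_NB); err != nil {
		f.Close()
		return nil, err
	}
	return f, nil
}

// isLockHeld reports whether err means "someone else already holds the lock".
func isLockHeld(err error) bool {
	return errors.Is(err, syscall.EWOULDBLOCK)
}

// demoLockConflict takes the lock once (standing in for the running node),
// then tries again through a second open of the same file (standing in for
// inspect). It returns the error from the second attempt.
func demoLockConflict() (secondErr error, err error) {
	dir, err := os.MkdirTemp("", "lockdemo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	lockPath := filepath.Join(dir, "LOCK")
	node, err := tryExclusiveLock(lockPath)
	if err != nil {
		return nil, err
	}
	defer node.Close()

	_, secondErr = tryExclusiveLock(lockPath)
	return secondErr, nil
}

func main() {
	secondErr, err := demoLockConflict()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println("second attempt refused:", isLockHeld(secondErr))
	fmt.Println("error text:", secondErr)
}
```

Because the second attempt uses `LOCK_NB`, it fails immediately instead of waiting, which matches the fast failures reported for goleveldb, cleveldb, badgerdb, and rocksdb.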
		require.NoError(t, d.Run(ctx))
	}()
	// FIXME: used to induce context switch.
	// Determine more deterministic method for prompting a context switch
FYI I also filed #6858 to track this more generally.
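One possible direction for the FIXME in the snippet above is to have the goroutine signal readiness explicitly instead of nudging the scheduler. This is only a sketch under assumptions: `service`, its `ready` channel, and `startAndStop` are hypothetical stand-ins and do not reflect the actual inspect type or test in this PR.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitLimit bounds how long the caller waits for each phase.
const waitLimit = time.Second

// service is a hypothetical stand-in for the type under test.
type service struct {
	ready chan struct{} // closed once Run has finished its setup
}

func newService() *service {
	return &service{ready: make(chan struct{})}
}

// Run performs setup, announces readiness, then blocks until ctx is cancelled.
func (s *service) Run(ctx context.Context) error {
	// Real setup (listeners, stores) would happen here.
	close(s.ready)
	<-ctx.Done()
	return nil
}

// startAndStop starts Run in a goroutine, waits for the readiness signal
// rather than relying on a context switch, then cancels and waits for exit.
func startAndStop() error {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	s := newService()
	done := make(chan error, 1)
	go func() { done <- s.Run(ctx) }()

	select {
	case <-s.ready:
	case <-time.After(waitLimit):
		return fmt.Errorf("service did not become ready within %s", waitLimit)
	}

	cancel()

	select {
	case err := <-done:
		return err
	case <-time.After(waitLimit):
		return fmt.Errorf("service did not stop within %s", waitLimit)
	}
}

func main() {
	fmt.Println("result:", startAndStop())
}
```

The timeout branches exist only so that a broken implementation fails the test instead of hanging it; the happy path never depends on timing.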
EDIT: Updated, see comment below (#6785 (comment)).

This change adds a sketch of the `Debug` mode. It adds a `Debug` struct to the `node` package. This `Debug` struct is intended to be created and started by a command in the `cmd` directory. The `Debug` struct runs the RPC server on the data directories: both the state store and the block store.

This change required a good deal of refactoring. Namely, a new `rpc.go` file was added to the `node` package. This file encapsulates functions for starting RPC servers used by nodes. A potential additional change is to further factor this code into shared code _in_ the `rpc` package.

Minor API tweaks were also made that seemed appropriate, such as the mechanism for fetching routes from the `rpc/core` package.

Additional work is required to register the `Debug` service as a command in the `cmd` directory, but I am looking for feedback on whether this direction seems appropriate before diving much further.

closes: #5908
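To make the idea in the description concrete, here is a rough sketch of serving read-only routes directly over a store, with no consensus, p2p, or mempool running. Everything in it is an assumption for illustration: `blockReader`, `memStore`, `routes`, `statusOf`, and the `/status` and `/block` paths are hypothetical and are not the `Debug` struct, the `rpc/core` routes, or any Tendermint API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// blockReader is a hypothetical read-only view of a block store.
type blockReader interface {
	Height() int64
	BlockHash(height int64) (string, bool)
}

// memStore is an in-memory stand-in for on-disk data left by a crashed node.
type memStore struct{ hashes map[int64]string }

func (m memStore) Height() int64 {
	var max int64
	for h := range m.hashes {
		if h > max {
			max = h
		}
	}
	return max
}

func (m memStore) BlockHash(h int64) (string, bool) {
	v, ok := m.hashes[h]
	return v, ok
}

// routes builds handlers that only read from the store.
func routes(bs blockReader) http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/status", func(w http.ResponseWriter, r *http.Request) {
		json.NewEncoder(w).Encode(map[string]int64{"latest_height": bs.Height()})
	})
	mux.HandleFunc("/block", func(w http.ResponseWriter, r *http.Request) {
		h, err := strconv.ParseInt(r.URL.Query().Get("height"), 10, 64)
		if err != nil {
			http.Error(w, "invalid height", http.StatusBadRequest)
			return
		}
		hash, ok := bs.BlockHash(h)
		if !ok {
			http.Error(w, "block not found", http.StatusNotFound)
			return
		}
		json.NewEncoder(w).Encode(map[string]string{"hash": hash})
	})
	return mux
}

// statusOf issues an in-process GET and returns the HTTP status code.
func statusOf(h http.Handler, target string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code
}

func main() {
	h := routes(memStore{hashes: map[int64]string{1: "AA", 2: "BB"}})
	fmt.Println(statusOf(h, "/status"), statusOf(h, "/block?height=2"), statusOf(h, "/block?height=9"))
}
```

Keeping the handlers behind a narrow read-only interface is one way to share route code between a full node and an inspect-style command, which is the kind of factoring the description hints at.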
resurrect the inspect command from #6785

Co-authored-by: Sam Kleinman <garen@tychoish.com>
Co-authored-by: Thane Thomson <connect@thanethomson.com>
Co-authored-by: Callum Waters <cmwaters19@gmail.com>