Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failed to build beacon chain: Head block not found in store #2028

Closed
v3rm0n opened this issue Dec 1, 2020 · 7 comments
Closed

Failed to build beacon chain: Head block not found in store #2028

v3rm0n opened this issue Dec 1, 2020 · 7 comments
Assignees

Comments

@v3rm0n
Copy link

v3rm0n commented Dec 1, 2020

Description

Beacon chain was working properly on Pyrmont for a few days and then it stopped and doesn't start up again.

Version

Lighthouse/v1.0.3-65dcdc36+

Present Behaviour

Beacon chain stops right after starting up with error: Failed to build beacon chain: Head block not found in store

Dec 01 08:35:21.582 INFO Lighthouse started                      version: Lighthouse/v1.0.3-65dcdc36+
Dec 01 08:35:21.582 INFO Configured for network                  name: pyrmont
Dec 01 08:35:21.582 INFO Data directory initialised              datadir: /var/lib/lighthouse
Dec 01 08:35:21.582 WARN Running HTTP server on port 5052
Dec 01 08:35:21.592 INFO Deposit contract                        address: 0x8c5fecdc472e27bc447696f431e425d02dd46a8c, deploy_block: 3743587
Dec 01 08:35:21.623 INFO Hot-Cold DB initialized                 split_state: 0xaebea146f4d6c06ad2240530650c14556cbc45f41eec4d245db4a186da111a22, split_slot: 90880, version: 2, service: freezer_db
Dec 01 08:35:21.624 INFO Starting beacon chain                   method: resume, service: beacon
Dec 01 08:35:31.905 INFO Block production enabled                method: json rpc via http, endpoints: ["http://127.0.0.1:8545"]
Dec 01 08:35:32.134 CRIT Failed to start beacon node             reason: Failed to build beacon chain: Head block not found in store
Dec 01 08:35:32.134 INFO Internal shutdown received              reason: Failed to start beacon node
Dec 01 08:35:32.134 INFO Shutting down..

Logs from before restarting it

Dec 01 07:43:47.258 INFO Sync state updated                      new_state: Syncing Finalized Chain, old_state: Searching syncing peers, service: sync
Dec 01 07:43:48.123 WARN BlockProcessingFailure                  outcome: MissingBeaconBlock(0xb866258c3d9956f6901712909729338cc2a1651383e12b4cdf9b202b328b0856), msg: unexpected condition in processing block.
Dec 01 07:43:49.001 INFO Syncing                                 est_time: --, distance: 1343 slots (4 hrs 28 mins), peers: 55, service: slot_notifier
Dec 01 07:44:01.000 INFO Syncing                                 est_time: --, distance: 1344 slots (4 hrs 28 mins), peers: 55, service: slot_notifier
Dec 01 07:44:13.000 INFO Syncing                                 est_time: --, distance: 1345 slots (4 hrs 29 mins), peers: 55, service: slot_notifier
Dec 01 07:44:14.550 WARN Batch failed to download. Dropping chain scoring peers, batch_epoch: 2840, score_adjustment: Low Tolerance Error, chain: 18141939244923768013, service: sync
Dec 01 07:44:15.641 WARN BlockProcessingFailure                  outcome: MissingBeaconBlock(0xb866258c3d9956f6901712909729338cc2a1651383e12b4cdf9b202b328b0856), msg: unexpected condition in processing block.
Dec 01 07:44:25.000 INFO Syncing                                 est_time: --, distance: 1346 slots (4 hrs 29 mins), peers: 55, service: slot_notifier
Dec 01 07:44:31.511 WARN Batch failed to download. Dropping chain scoring peers, batch_epoch: 2840, score_adjustment: Low Tolerance Error, chain: 16675040628877297081, service: sync
Dec 01 07:44:31.514 INFO Sync state updated                      new_state: Searching syncing peers, old_state: Syncing Finalized Chain, service: sync
Dec 01 07:44:31.565 INFO Sync state updated                      new_state: Syncing Finalized Chain, old_state: Searching syncing peers, service: sync
Dec 01 07:44:33.732 WARN BlockProcessingFailure                  outcome: MissingBeaconBlock(0xb866258c3d9956f6901712909729338cc2a1651383e12b4cdf9b202b328b0856), msg: unexpected condition in processing block.
Dec 01 07:44:37.000 INFO Syncing                                 est_time: --, distance: 1347 slots (4 hrs 29 mins), peers: 55, service: slot_notifier
Dec 01 07:44:42.479 WARN Batch failed to download. Dropping chain scoring peers, batch_epoch: 2840, score_adjustment: Low Tolerance Error, chain: 16675040628877297081, service: sync
Dec 01 07:44:42.482 INFO Sync state updated                      new_state: Searching syncing peers, old_state: Syncing Finalized Chain, service: sync
Dec 01 07:44:42.511 INFO Sync state updated                      new_state: Syncing Finalized Chain, old_state: Searching syncing peers, service: sync
Dec 01 07:44:43.134 WARN BlockProcessingFailure                  outcome: MissingBeaconBlock(0xb866258c3d9956f6901712909729338cc2a1651383e12b4cdf9b202b328b0856), msg: unexpected condition in processing block.
Dec 01 07:44:44.110 INFO Shutting down..
Dec 01 07:44:44.112 INFO Saved DHT state                         service: network
Dec 01 07:44:44.112 INFO Network service shutdown                service: network
Dec 01 07:44:44.459 INFO Saved beacon chain to disk              service: beacon

Expected Behaviour

It should start up

@michaelsproul
Copy link
Member

Did you see any other database-related warnings before the error started occurring? Did your disk fill up/did you experience a power outage/anything else unusual?

Are you running on a VPS? Bare metal? Under a hypervisor/VM?

Did you see any segfaults?

Sorry for the barrage of questions, but this is really the best way for us to work out under what conditions these errors occur 😅 In the meantime, you'll have to purge your database and resync to get back online (rm -r /var/lib/lighthouse/beacon).

Potentially related issues:

@v3rm0n
Copy link
Author

v3rm0n commented Dec 1, 2020

I am often on the other end of the issues so I totally understand :)

Running currently on AWS EC2 using Docker(ECS), single c5.xlarge instance .
Three containers run on the same machine:

  • ethereum/client-go:latest: ["--goerli","--http","--http.addr","0.0.0.0","--datadir","/var/lib/goethereum"]
  • sigp/lighthouse: lighthouse ["beacon_node","--datadir","/var/lib/lighthouse","--network","pyrmont","--staking","--eth1","--eth1-endpoint","http://127.0.0.1:8545","--metrics"]
  • sigp/lighthouse: lighthouse ["--network","pyrmont","vc","--datadir","/var/lib/lighthouse","--http"]

Lighthouse containers have a bind mount from /data/lighthouse to /var/lib/lighthouse
Geth has a bind mount from /data to /var/lib/goethereum.

/data is a 120GB general purpose EBS volume with 105GB free space.

Went through the logs to see where this MissingBeaconBlock started:

Dec 01 03:15:17.656 WARN Banning peer that is currently disconnecting, peer_id: 16Uiu2HAmN92gRashiypmBqvL1uX4KftjT7xG1ZY8MLNju1YYNK8a, service: libp2p
Dec 01 03:15:24.795 INFO New block received                      hash: 0xb866…0856, slot: 90976
Dec 01 03:15:25.000 INFO Synced                                  slot: 90976, block:    …  empty, epoch: 2843, finalized_epoch: 2840, finalized_root: 0x8251…f9a1, peers: 52, service: slot_notifier
Dec 01 03:15:25.162 CRIT Beacon block processing error           error: DBError(DBError { message: "Error { message: \"IO error: /var/lib/lighthouse/beacon/chain_db/106850.log: Cannot allocate memory\" }" }), service: beacon
Dec 01 03:15:35.402 CRIT Beacon block processing error           error: MissingBeaconBlock(0xb866258c3d9956f6901712909729338cc2a1651383e12b4cdf9b202b328b0856), service: beacon
Dec 01 03:15:35.403 WARN Unexpected block processing error       error: MissingBeaconBlock(0xb866258c3d9956f6901712909729338cc2a1651383e12b4cdf9b202b328b0856), service: sync
Dec 01 03:15:37.000 INFO Synced                                  slot: 90977, block:    …  empty, epoch: 2843, finalized_epoch: 2840, finalized_root: 0x8251…f9a1, peers: 52, service: slot_notifier
Dec 01 03:15:44.620 CRIT Beacon block processing error           error: MissingBeaconBlock(0xb866258c3d9956f6901712909729338cc2a1651383e12b4cdf9b202b328b0856), service: beacon
Dec 01 03:15:44.620 WARN Invalid parent chain                    last_peer: 16Uiu2HAmQ4UZHPNL3FzufmQp6L6XEGT9jgyxifAbMv4tDygJ6mo1, outcome: BeaconChainError(MissingBeaconBlock(0xb866258c3d9956f6901712909729338cc2a1651383e12b4cdf9b202b328b0856)), score_adjustment: Mid Tolerance Error, service: sync

@michaelsproul
Copy link
Member

Very interesting. Thank you so much for the log. I haven't seen that Cannot allocate memory error before from the database. But it seems we should fail the block import, not partially import it (which is what the MissingBeaconBlock error suggests).

@v3rm0n
Copy link
Author

v3rm0n commented Dec 1, 2020

FYI I have added quite arbitrary memory limits to the Docker processes in ECS since it doesn't let you start a task without a limit. For beacon process I had 2500 (which I now changed to 3000 since the geth node was not really using its share).
Not sure if related...

@michaelsproul
Copy link
Member

That probably caused the allocation error, but I've found the reason we weren't recovering (a non-atomic update to the fork choice datastructure). Looking into possible fixes now.

Thank you

@newearthmartin
Copy link

Hi I'm also experiencing "Head block not found in store" on version v2.1.5-9bc9527
Should I open a new ticket or post details in here?

@newearthmartin
Copy link

(I'm just going to post it here to save time)

My beacon node has been working for almost a year correctly on mainnet but now it won't start. Here is my log:

Mar 21 23:38:12.484 INFO Logging to file                         path: ".../.lighthouse/mainnet/beacon/logs/beacon.log"
Mar 21 23:38:12.485 INFO Lighthouse started                      version: Lighthouse/v2.1.5-9bc9527
Mar 21 23:38:12.485 INFO Configured for network                  name: mainnet
Mar 21 23:38:12.486 INFO Data directory initialised              datadir: .../.lighthouse/mainnet
Mar 21 23:38:12.486 WARN Running HTTP server on port 5052
Mar 21 23:38:12.487 INFO Deposit contract                        address: ..., deploy_block: ...
Mar 21 23:38:12.788 INFO Hot-Cold DB initialized                 split_state: 0x1c9ca63eade37553e088d1b5c88530950a012dbe551d7aef17bb3c3fcc42e17d, split_slot: 3417696, version: 8, service: freezer_db
Mar 21 23:38:12.794 INFO Starting beacon chain                   method: resume, service: beacon
Mar 21 23:38:46.731 INFO Block production enabled                method: json rpc via http, endpoints: ["http://127.0.0.1:8545/", "https://mainnet.infura.io/"]
Mar 21 23:38:47.998 CRIT Failed to start beacon node             reason: Failed to build beacon chain: Head block not found in store
Mar 21 23:38:48.000 INFO Internal shutdown received              reason: Failed to start beacon node
Mar 21 23:38:48.000 INFO Shutting down..                         reason: Failure("Failed to start beacon node")
Failed to start beacon node

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants