I'm running the pre-built indexer (v2.15.3) on an AWS EC2 instance together with an archival node, and in the last 48h I've had the same issue twice: the indexer stops pushing data to the Postgres database (Timescale Cloud, so fully managed) with the following error:
{"error":"Process() handler err: AddBlock() err: TxWithRetry() err: attemptTx() err: AddBlock() adding block round 27670983 but next round to account is 27670982","level":"error","msg":"block 27670983 import failed","time":"2023-03-16T08:02:45Z"}
The command I use to run the indexer is: algorand-indexer daemon --data-dir /home/ubuntu/indexerdata -d /var/lib/algorand --postgres "$TIMESCALE_PROD"
The node itself is still running properly at that point; the output of goal node status -w 1000 is:
Last committed block: 27681419
Time since last block: 0.6s
Sync Time: 0.0s
Last consensus protocol: https://github.com/algorandfoundation/specs/tree/44fa607d6051730f5264526bf3c108d51f0eadb6
Next consensus protocol: https://github.com/algorandfoundation/specs/tree/44fa607d6051730f5264526bf3c108d51f0eadb6
Round for next consensus protocol: 27681420
Next consensus protocol supported: true
Last Catchpoint: 27680000#NA63SDQJD63NR3QPNC2NXYV6FPUJWHNJY6DDAMGURQ76CT2MYUUQ
Genesis ID: mainnet-v1.0
Genesis hash: wGHE2Pwdvd7S12BL5FaOP20EGYesN73ktiC1qzkkit8=
When I restart the indexer, all I get is the prompt to re-initialise the ledger:
{"error":"MakeProcessorWithLedgerInit() err: InitializeLedger() simple catchup err: RunMigration() err: MakeProcessor() err: the ledger cache is ahead of the required round and must be re-initialized","level":"error","msg":"blockprocessor.MakeProcessor() err MakeProcessorWithLedgerInit() err: InitializeLedger() simple catchup err: RunMigration() err: MakeProcessor() err: the ledger cache is ahead of the required round and must be re-initialized","time":"2023-03-16T08:06:41Z"}
Currently the only way I know to get it up and running again is by clearing out the indexer's data directory and starting sync again from the nearest catchpoint: algorand-indexer daemon --data-dir /home/ubuntu/indexerdata -d /var/lib/algorand --postgres "$TIMESCALE_PROD" --catchpoint "27670000#74HTMMCL63E74B43FLS3LHHQRMDO54HTF6FKC2JZK3K3PXNY6ZYQ"
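For reference, the catchpoint round in those labels can be derived from the failed round: the labels in this report (27670000, 27680000) suggest mainnet catchpoints are published every 10,000 rounds. A small helper sketch under that assumption; the hash part of the label still has to be looked up separately (e.g. from goal node status):

```python
# Sketch: derive the nearest preceding catchpoint *round* for a failed round.
# Assumes catchpoint labels appear every 10,000 rounds, which matches the
# labels in this report (27670000, 27680000); the label's hash suffix must
# still be obtained separately.
CATCHPOINT_INTERVAL = 10_000

def nearest_catchpoint_round(failed_round: int) -> int:
    """Round down to the latest catchpoint round at or before failed_round."""
    return (failed_round // CATCHPOINT_INTERVAL) * CATCHPOINT_INTERVAL

print(nearest_catchpoint_round(27670983))  # -> 27670000
```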
Is this a known issue? Can I somehow make the indexer more robust to catch these kinds of issues?
As this issue seems to be entirely indexer-related (unless I'm missing something here), I thought it would be good to discuss it here. We specifically use the provided indexer so we don't have to write our own code and can rely on the stability it provides out of the box, so I'm looking forward to solving this!
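Until the root cause is found, one workaround (not a fix) is a watchdog that scans the indexer's JSON log for the two error signatures shown above and flags when the reset-and-catchup recovery is needed. A minimal sketch; the error substrings are taken from the logs in this report, and wiring it up to actually trigger recovery is left as an assumption:

```python
import json

# Error substrings taken from the logs in this report.
ROUND_MISMATCH = "but next round to account is"
LEDGER_AHEAD = "the ledger cache is ahead of the required round"

def needs_reset(log_line: str) -> bool:
    """Return True if a JSON log line shows one of the two failure modes
    above, i.e. the data directory likely has to be cleared and sync
    restarted from a catchpoint."""
    try:
        entry = json.loads(log_line)
    except json.JSONDecodeError:
        return False
    if entry.get("level") != "error":
        return False
    err = entry.get("error", "")
    return ROUND_MISMATCH in err or LEDGER_AHEAD in err

# Example with the first error line from this report:
line = ('{"error":"Process() handler err: AddBlock() err: TxWithRetry() err: '
        'attemptTx() err: AddBlock() adding block round 27670983 but next '
        'round to account is 27670982","level":"error",'
        '"msg":"block 27670983 import failed","time":"2023-03-16T08:02:45Z"}')
print(needs_reset(line))  # -> True
```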
I believe this sort of thing may happen if you have multiple Indexer writers running at the same time. Resetting the data directory is the right way to recover.
Normally I have just a single writer in a single process, so I'm not really sure this is the case here. If it is, then the problem seems to exist within this version of the indexer.
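If the multiple-writers hypothesis is worth ruling out, the simplest check is to count running indexer processes on the host. A sketch that parses ps output; the "algorand-indexer daemon" match is an assumption and should be adapted to the actual command line in use:

```python
import subprocess
from typing import Optional

def count_indexer_writers(ps_output: Optional[str] = None) -> int:
    """Count processes whose command line invokes `algorand-indexer daemon`.
    More than one suggests concurrent writers against the same database."""
    if ps_output is None:
        # Capture the full argument list of every process on the host.
        ps_output = subprocess.run(
            ["ps", "-eo", "args"], capture_output=True, text=True, check=True
        ).stdout
    return sum(
        1 for proc in ps_output.splitlines()
        if "algorand-indexer daemon" in proc
    )

# Example against a captured listing (two writers -> misconfiguration):
sample = """COMMAND
algorand-indexer daemon --data-dir /home/ubuntu/indexerdata -d /var/lib/algorand
algorand-indexer daemon --data-dir /tmp/other -d /var/lib/algorand
/usr/bin/bash
"""
print(count_indexer_writers(sample))  # -> 2
```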
Our environment
3.14.2.stable
Steps to reproduce
Unknown, but it seems to have been happening a lot over the past week.