SPIKE: Make indexer 2.14 with local ledger more resilient in case of restarts. #1200

Open
urtho opened this issue Aug 23, 2022 · 3 comments
Labels
new-feature-request Feature request that needs triage Team Lamprey

Comments

@urtho
Contributor

urtho commented Aug 23, 2022

Problem

We've deployed 2.14.0rc3 on 25+ testnet nodes.
After two days of testing, including random restarts, we've observed the ledger cache getting ahead of the Postgres ledger, which requires manual intervention. This has happened on 3 different nodes.

2.14 is our first indexer with a local ledger. We skipped 2.12 and 2.13.

{"error":"MakeProcessorWithLedgerInit() err: InitializeLedger() simple catchup err: RunMigration() err: MakeProcessor() err: the ledger cache is ahead of the required round and must be re-initialized","level":"error","msg":"blockprocessor.MakeProcessor() err MakeProcessorWithLedgerInit() err: InitializeLedger() simple catchup err: RunMigration() err: MakeProcessor() err: the ledger cache is ahead of the required round and must be re-initialized","time":"2022-08-23T08:26:37Z"}

This is probably not fixable in general with the current approach, but a "one block off" situation could perhaps be addressed.

Urgency

Not very urgent, but all of these shutdowns were "clean" ones, so statistically this is going to hurt.

Acceptance Criteria

  1. Use the MaxAccountLookback in the ledger to fetch recent StateDelta objects.
  2. If the local ledger is ahead of postgres, use the historic StateDelta instead of computing a new one.
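The two criteria above can be sketched roughly as follows. This is only an illustration of the idea, not the indexer's actual code: `StateDelta`, `LocalLedger`, `DeltaForRound`, `ErrTooFarAhead` and the `compute` callback are all hypothetical names, and `Lookback` merely stands in for the `MaxAccountLookback` window mentioned in criterion 1.

```go
package main

import (
	"errors"
	"fmt"
)

// StateDelta is a hypothetical stand-in for the ledger's per-round delta.
type StateDelta struct {
	Round uint64
}

// LocalLedger is a hypothetical view of the local ledger cache.
type LocalLedger struct {
	Latest   uint64                // last round committed to the local ledger
	Lookback uint64                // number of recent deltas retained (assumed to mirror MaxAccountLookback)
	Recent   map[uint64]StateDelta // retained deltas, keyed by round
}

// ErrTooFarAhead signals that the requested round is older than the retained window.
var ErrTooFarAhead = errors.New("ledger cache is ahead beyond the retained lookback")

// DeltaForRound returns the delta for the next round Postgres needs.
// If the local ledger has already processed that round, the retained
// (historic) delta is reused; otherwise compute is called to evaluate it.
func DeltaForRound(l *LocalLedger, round uint64, compute func(uint64) (StateDelta, error)) (StateDelta, string, error) {
	if round > l.Latest {
		// Normal case: the ledger is not ahead, so evaluate the block.
		d, err := compute(round)
		return d, "computed", err
	}
	// The ledger is ahead of Postgres: only rounds inside the lookback window are recoverable.
	if l.Latest-round >= l.Lookback {
		return StateDelta{}, "", ErrTooFarAhead
	}
	d, ok := l.Recent[round]
	if !ok {
		return StateDelta{}, "", fmt.Errorf("no retained delta for round %d", round)
	}
	return d, "historic", nil
}

func main() {
	// "One block off": the ledger is at round 100, Postgres still needs round 100.
	ledger := &LocalLedger{
		Latest:   100,
		Lookback: 2,
		Recent:   map[uint64]StateDelta{99: {Round: 99}, 100: {Round: 100}},
	}
	compute := func(r uint64) (StateDelta, error) { return StateDelta{Round: r}, nil }

	d, source, err := DeltaForRound(ledger, 100, compute)
	fmt.Println(d.Round, source, err)
}
```

In this sketch the "ledger cache is ahead" error is only returned when the gap exceeds the retained window, so a restart that leaves the ledger one round ahead would no longer need manual intervention.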
@urtho urtho added the new-feature-request Feature request that needs triage label Aug 23, 2022
@chaihoang chaihoang assigned shiqizng and unassigned shiqizng Sep 8, 2022
@winder winder changed the title Make indexer 2.14 with local ledger more resilient in case of restarts. SPIKE: Make indexer 2.14 with local ledger more resilient in case of restarts. Sep 22, 2022
@shiqizng shiqizng self-assigned this Oct 3, 2022
@fabrice102
Contributor

@urtho How did you manually fix the issue without a full reset of the indexer?

@urtho
Contributor Author

urtho commented May 31, 2023

I just do a fast catchup from a matching catchpoint from this list: https://algorand-catchpoints.s3.us-east-2.amazonaws.com/consolidated/mainnet_catchpoints.txt

So downtime is only 40 minutes.
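For anyone scripting this, here is a rough sketch of picking a catchpoint from such a list. It rests on two assumptions: that each line of the list is a label of the form `round#hash`, and that "matching" means the most recent catchpoint at or below the round Postgres has reached. `PickCatchpoint` is a hypothetical helper, and the labels in `main` are made-up placeholders, not real catchpoints.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// PickCatchpoint scans a newline-separated list of "round#hash" labels and
// returns the label with the highest round that does not exceed maxRound.
// The second return value is false if no label qualifies.
func PickCatchpoint(list string, maxRound uint64) (string, bool) {
	best, bestRound, found := "", uint64(0), false
	for _, line := range strings.Split(list, "\n") {
		label := strings.TrimSpace(line)
		roundStr, _, ok := strings.Cut(label, "#")
		if !ok {
			continue // skip blank or malformed lines
		}
		r, err := strconv.ParseUint(roundStr, 10, 64)
		if err != nil || r > maxRound {
			continue // not a round number, or newer than what Postgres has
		}
		if !found || r > bestRound {
			best, bestRound, found = label, r, true
		}
	}
	return best, found
}

func main() {
	// Placeholder labels for illustration only.
	list := "10000#PLACEHOLDERHASHA\n20000#PLACEHOLDERHASHB\n30000#PLACEHOLDERHASHC\n"
	label, ok := PickCatchpoint(list, 25000)
	fmt.Println(label, ok)
}
```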

@fabrice102
Contributor

Thanks! Unfortunately, this did not work in my case: the indexer started indexing from the beginning again... I'm not completely sure why.


4 participants