Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

InvalidSignature errors caused by race between duties & attestation data #5687

Open
michaelsproul opened this issue May 2, 2024 · 2 comments
Labels
bug Something isn't working UX-and-logs val-client Relates to the validator client binary

Comments

@michaelsproul
Copy link
Member

Background

The algorithm the VC uses to attest is:

  1. Regularly fetch attestation duties from the BN(s).
  2. When a validator is assigned to attest, fetch attestation data from the BN, sign it and publish it.

It handles re-orgs by flushing the duties whenever the BN indicates a different dependent_root in its attestation duties response.

The Bug

The problem is, the BN can re-org in the time between the VC requesting duties & requesting the attestation data. This leads to the VC signing attestations at the wrong slots with the wrong validator indices, which when they are validated by the BN correctly trigger InvalidSignature errors.

In detail, the sequence of events leading to InvalidSignature errors is:

  1. The VC requests duties while the BN is still a bit behind. The BN advances the state and serves the duties.
  2. The BN catches up to the head (which changes the duties).
  3. Before the VC has a chance to learn about the new duties, it requests attestation data based on the old duties.
  4. The VC signs the attestation data under the assumption that the old duties are correct. This means it is essentially signing with the wrong shuffling: if validator i was in committee j at slot n under the old duties, then it is unlikely to be at the same position after the duties change.
  5. When the BN receives the signed attestations, it (correctly) uses the beacon_block_root to calculate the committees, and finds that the signatures are wrong.

Steps to resolve

It's not immediately obvious how to fix this.

Unfortunately the beacon API spec does not define the dependent_root as part of the AttestationData response. If it did, the VC could detect the change in duties at the point where it requests the AttestationData, and either abort the request (and rely on it being re-triggered at the next slot), or wait for the new duties to be loaded.

Other clients use the SSE endpoint to be informed of dependent_root changes as soon as they happen, but even if we take this approach, there will still be a race as processing the SSE event and requesting the attestation data can happen concurrently.

Workaround

Once the BN is in sync the errors will disappear.

Additional Info

This issue is possibly more obvious in v5.2.0 and later, as we've made the BN more likely to return out of date duties:

This is both a simplification and a liveness benefit, so I don't think we should revert this change. Particularly on small or struggling networks, being able to produce blocks and attestations when the chain appears stalled is very useful.

@michaelsproul michaelsproul added bug Something isn't working val-client Relates to the validator client binary UX-and-logs labels May 2, 2024
@dapplion
Copy link
Collaborator

dapplion commented May 2, 2024

Other clients use the SSE endpoint to be informed of dependent_root changes as soon as they happen, but even if we take this approach, there will still be a race as processing the SSE event and requesting the attestation data can happen concurrently.

True but it will greatly reduce the window of time the race condition can happen

@michaelsproul
Copy link
Member Author

Not easy to implement quickly, unfortunately. I think Sean had a prototype a while back

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working UX-and-logs val-client Relates to the validator client binary
Projects
None yet
Development

No branches or pull requests

2 participants