Simplify check for truncated CARs #113

Closed
olizilla opened this issue Feb 21, 2023 · 2 comments

Comments

@olizilla
Contributor

olizilla commented Feb 21, 2023

Since #100 landed, a second worker process reads the entire CAR from S3 and uses linkdex to determine whether the CAR is complete, as a signal that it was probably not truncated during upload.

We could simplify this as we have a local kubo sidecar for every pickup worker with the entire DAG in it.

  • add a through stream to decode the CAR as we send it to S3.
  • count each block in the CAR as we send it.
  • if the upload completes successfully, run ipfs dag stat on the root CID to find the expected block count for the DAG, and compare it to our count of blocks in the CAR.
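
A minimal sketch of the counting step described above, in Python for illustration only (it is not pickup's code). It assumes CARv1 framing: a varint-length-prefixed header followed by varint-length-prefixed sections, one per block. The names CarBlockCounter, count_through and ended_cleanly are hypothetical, and treating a zero-length section as an error is an assumption of this sketch.

```python
from typing import Iterable, Iterator


class CarBlockCounter:
    """Counts complete CARv1 sections in a byte stream without buffering them."""

    def __init__(self) -> None:
        self.blocks = 0            # complete block sections seen so far
        self._header_done = False  # the first length-prefixed unit is the header
        self._skip = 0             # bytes left in the current header or section
        self._varint = 0           # partially decoded length prefix
        self._shift = 0
        self._in_varint = False    # True while a length prefix is half read

    def feed(self, chunk: bytes) -> None:
        i = 0
        while i < len(chunk):
            if self._skip:
                # Inside a header or section body: skip it without inspecting it.
                step = min(self._skip, len(chunk) - i)
                self._skip -= step
                i += step
                if self._skip == 0:
                    self._finish_unit()
                continue
            # Otherwise we are reading an unsigned LEB128 varint length prefix.
            byte = chunk[i]
            i += 1
            self._in_varint = True
            self._varint |= (byte & 0x7F) << self._shift
            self._shift += 7
            if byte & 0x80 == 0:
                if self._varint == 0:
                    # Assumption of this sketch: zero-length units are invalid.
                    raise ValueError("zero-length section")
                self._skip = self._varint
                self._varint = self._shift = 0
                self._in_varint = False

    def _finish_unit(self) -> None:
        if self._header_done:
            self.blocks += 1
        else:
            self._header_done = True

    @property
    def ended_cleanly(self) -> bool:
        """True if the stream stopped exactly on a section boundary."""
        return self._header_done and self._skip == 0 and not self._in_varint


def count_through(chunks: Iterable[bytes], counter: CarBlockCounter) -> Iterator[bytes]:
    """Pass-through: feed each chunk to the counter, then yield it unchanged."""
    for chunk in chunks:
        counter.feed(chunk)
        yield chunk
```

After the upload finishes, counter.blocks would be compared with the block count reported for the root CID, and ended_cleanly being false would indicate that the stream stopped partway through a section.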

Both ipfs dag stat and ipfs dag export dedupe repeated blocks, so the count of blocks in an untruncated CAR and the count of blocks in the DAG should always match. We should also attempt to verify the hash of the last block in the CAR, to check that the last block is complete.
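
The last-block check could look roughly like the following Python sketch. It assumes the section payload (the bytes after the length prefix) is the binary CID followed by the block data, and it only handles sha2-256 multihashes (code 0x12). The function names read_varint and block_hash_matches are hypothetical.

```python
import hashlib
from typing import Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode an unsigned LEB128 varint at pos; return (value, next position)."""
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, pos
        shift += 7


def block_hash_matches(section: bytes) -> bool:
    """Check that the data in a CAR section hashes to the digest in its CID."""
    if section[:2] == b"\x12\x20":
        # CIDv0 is a bare sha2-256 multihash with no version or codec prefix.
        pos = 0
    else:
        version, pos = read_varint(section, 0)
        if version != 1:
            raise ValueError("unsupported CID version")
        _codec, pos = read_varint(section, pos)
    hash_code, pos = read_varint(section, pos)
    digest_len, pos = read_varint(section, pos)
    if hash_code != 0x12:
        raise ValueError("this sketch only handles sha2-256")
    digest = section[pos:pos + digest_len]
    data = section[pos + digest_len:]
    return len(digest) == digest_len and hashlib.sha256(data).digest() == digest
```

A section whose data was cut short hashes to a different digest, so the function returns False for it.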

@olizilla
Contributor Author

Work is being done over in https://www.notion.so/pl-strflt/Streaming-CARs-bfa54196ed0740eea3c23dd70efbe07b and ipfs/specs#332 to find an officially supported mechanism to verify a streaming CAR transfer.

@olizilla
Contributor Author

We can't run ipfs dag stat after writing the CAR, as we don't know when repo GC will occur. We have simplified this sufficiently by streaming the CAR back through linkdex as part of the pickup container's job.

see: #128
