v2/*Selective* CAR creators without duplicate puts may not be traversing complex DAGs #325

Open
rvagg opened this issue Aug 10, 2022 · 0 comments
rvagg commented Aug 10, 2022

We have this in the LinkSystem config for the selective CAR creation utilities in v2:

LinkVisitOnlyOnce:              !opts.BlockstoreAllowDuplicatePuts

These two things should probably not be directly related. Duplicate puts should be stopped at the CAR write output, but traversal should be able to roam through the DAG without skips when we're doing a non-exhaustive selector on complex DAGs.

That is, if we have a non-exhaustive selector and a DAG that contains multiple interlinked blocks (the definition of "complex DAGs" here is a bit squishy, because de-duping of UnixFS data will often mean a UnixFS DAG has this property too), the traversal may need to re-visit blocks under different selector conditions. LinkVisitOnlyOnce, however, is only intended for the case where we're absolutely sure that re-visits are pointless.

v0's SelectiveCar handles this itself by keeping track of the CIDs it has written and not rewriting ones it has already seen, while still letting the traversal go over all blocks. It also allows the user to pass in a TraverseLinksOnlyOnce option where they know it's OK (e.g. when they're doing an ExploreAll). Perhaps we should follow that pattern here.

I think that we should probably be solving this in the traversal engine instead, making LinkVisitOnlyOnce redundant and deprecating it, because our selectors would be smart enough to know not to revisit where they don't need to.

@rvagg rvagg self-assigned this Aug 10, 2022
Status: 🗄️ Backlog