Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#763 Switches SHA1 for SHA2 #764

Merged
merged 1 commit into from
Jan 7, 2023
Merged

#763 Switches SHA1 for SHA2 #764

merged 1 commit into from
Jan 7, 2023

Conversation

jeramyRR
Copy link
Contributor

@jeramyRR jeramyRR commented Jan 6, 2023

I submitted an issue (#763) about the problems depending on SHA1 is causing with some build systems and security. This PR removes the SHA1 library and replaces it with the SHA2 library.

I'm not sure what side effects this will have on the build though. The change is in meta/build.rs, and it looks like SHA1 is only being used to name a grammar file that gets downloaded and compared before deciding to create a new grammar file. This change would definitely cause that step to happen with the first new build after merge.

This removes the SHA1 library and replaces it with the SHA2 library.
@jeramyRR jeramyRR requested a review from a team as a code owner January 6, 2023 18:22
@jeramyRR jeramyRR requested review from tomtau and removed request for a team January 6, 2023 18:22
@tomtau tomtau requested a review from CAD97 January 7, 2023 06:57
Copy link
Contributor

@tomtau tomtau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks all right. And yes, even though sha1 is only used in build dependencies, it could be an unnecessary red flag if it shows up in a dependency graph.
Originally the sha1 dependency was brought in with the bootstrap process: #290 I think it could be replaced by sha2 (or other reasonable hash functions), but not sure whether it breaks anything (e.g. if there was a need for git hash-object compatibility or something like that). Will merge it, but CC @CAD97 before releasing it.

@CAD97
Copy link
Contributor

CAD97 commented Jan 7, 2023

TL;DR change is fine to publish.

The use of sha-1 was just an arbitrary choice for a file hash with no compatibility concerns. The use of sha-2 will of course change the file hash and trigger a re-bootstrap, but this only impacts developers compiling pest from source.

This only ever happens in a source build where grammar.pest is present; in published builds we just use the processed version; doing bootstrap is cyclic otherwise. (This change has no functional impact to downstream other than the change in the build dependency graph.)

Ideally we'd even strip the buildscript and build dependency from the published package entirely; the buildscript is just for convenience when iterating.

I've never been super enthused about the pest bootstrap process. It's amazing that it mostly just works, but its integration into cargo is sketchy at best (e.g. the unused build dependency on the sha hash for downstream).

The check for the need to re-bootstrap being in the buildscript is a concession to git and path dependencies; if the bootstrap is up to date, you don't need to have the bootstrap binary available. The alternative would be always dynamically calling into the bootstrap binary for unpackaged builds — the modification check can still be done there, but now you need to run a manual bootstrap after cargo clean even if you haven't modified the meta grammar.

When I first wrote the bootstrap process, recursively invoking cargo from a buildscript would essentially always deadlock and there wasn't a good workaround available (at least not one that I could find); this is why the first manual bootstrap is required, and the buildscript finds and executes the bootstrap binary directly rather than going through cargo. In the intermediate time, things have evolved; it should be able to do roughly $CARGO run --manifest-path "$CARGO_WORKSPACE_DIR/bootstrap/Cargo.toml" --target-dir $OUT_DIR and eliminate the need for the initial manual bootstrap and fragile "dynamic linking". But doing it this way makes a completely separate target dir (to avoid the two cargos using the same target dir concurrently) meaning the dependencies get separately recompiled rather than shared with the primary workspace compilation, so it's got its own issues as well... bootstrap is a hard problem with only solutions which are bad in different ways.

Actually, even without the --target-dir $OUT_DIR (which AIUI is still more technically correct to use), the last time I tried recursively calling cargo it didn't deadlock, despite the fact I know that doing so did when I first wrote pest's bootstrap path. I don't know if this is because of changes in my system (e.g. I use a globally set $CARGO_TARGET_DIR to share intermediate compilation artifacts between workspaces) or in cargo. Maybe there's a better bootstrap path available nowadays, but if there is, I don't know it.

Any further effort I put into the bootstrap problem is going into the "precompiled proc macro" path rather than the "preexecuted code generation" solution, though; doing it by the former method also benefits downstream and the latter completely removes macro hygiene.

@CAD97
Copy link
Contributor

CAD97 commented Jan 7, 2023

For what little it's worth, the usage of sha-1 here doesn't care about any of the collision attacks, as it's just used as a fingerprint for detecting changes to a trusted file. But for the same reason, there's no reason not to use a less broken (fast and stable) content hash without such attacks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants