Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite revendoring script #594

Merged
merged 3 commits into from
Mar 31, 2023

Conversation

apoelstra
Copy link
Member

For Nix purposes I need the revendoring script to work without network access and without user interaction. I also realized it would be convenient if the script could figure out what the right version prefix is supposed to be. Then I noticed some shellcheck issues.

Anyway I just rewrote the whole thing. I'm now able to run this script within nix and vet that the current contents of the depend/ directory are consistent with the secp256k1-HEAD-revision.txt, for all commits.

@apoelstra
Copy link
Member Author

cc @sanket1729 @tcharding who I think are the primary users of this script beside me

This rewrite:
   * Fixes some shellcheck issues (bad quoting, use of | instead of ||
     near the beginning of the file)
   * Automatically computes the version prefix, depend directory, etc.,
     and provides instructions to override this with env vars if the
     user really wants to do this.
   * Detects when it would be destructive and refuses to run unless
     passed the -f flag, rather than prompting the user for a yes/no
   * Adds the capability to use cp rather than git clone, which I need
     to run this from within Nix.
   * Whitelists CHANGELOG.md which shouldn't get substituted.
tcharding
tcharding previously approved these changes Mar 30, 2023
Copy link
Member

@tcharding tcharding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet, props to your bash-foo.

cd ..
SOURCE_REV=$(git rev-parse HEAD || echo "[unknown revision from $SECP_VENDOR_SECP_REPO]")
rm -rf .git/ || true
popd
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason we don't redirect this to /dev/null but we do later down?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I just forgot :)

-type f \
-print0 | xargs -0 sed -i -r "s/rustsecp256k1_v[0-9]+_[0-9]+_[0-9]+_(.*)([\"\(])/rustsecp256k1_v${SECP_VENDOR_VERSION_CODE}_\1\2/g"

popd > /dev/null

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bash question: Any reason not to exit 0 at the end of the script?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We implicitly return the exit value of the last executed command. With set -e this means "nonzero if anything failed, 0 otherwise".

I can add an exit 0 but it'd be purely redundant.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair points, thanks. I didn't think of the implications of set -e for the return value.

set -e

# Set default variables
: "${SECP_VENDOR_GIT_ROOT:=$(git rev-parse --show-toplevel)}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL, nice syntax.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH I don't know what it means. I copied it from somebody smarter than me, and add the quotes around it because shellcheck claims there's some sort of globbing-related DoS otherwise.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After spending a bunch of time reading about this:

  • ${variable} is familiar syntax which evaluates to the value of variable. But you can alternately do ${variable=xyz} which, if variable is unset, will assign xyz to it and return that. ${variable:=xyz} is the same but applies if variable is unset or "null",whatever that means
  • So you can get this "assign if null" behavior but only in variable expansions. So if you just did ${variable:=xyz} on a line by itself, bash would try to evaluate xyz (or whatever) and give you an error message.
  • To avoid this, you pass to : which is a shell bulit-in which is synonymous with true. That is, it's a no-op command that eats its arguments. So you can get the side-effect of evaluating the arguments without having to deal with their pesky values.

Anyway somehow this has become a bash idiom. TIL.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cheers, I had worked out the first colon but not the second,cheers.

Copy link
Member

@sanket1729 sanket1729 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK verified that the default workflow is not affected. Did not check whether all environment variables are updated correctly for the non-default workflows.

We should need to update the README.md in secp-sys directory with updated instructions on running this script. Namely, requiring this to run root of parent crate(not of sys-crate) and not requiring version parameter.

@apoelstra
Copy link
Member Author

@sanket1729 it doesn't matter where you run this script as long as you're in the repo. But yes, I'll update the instructions for the command-line arguments.

Also a couple minor tweaks to the vendoring script itself.
@apoelstra
Copy link
Member Author

Pushed to update README (plus I tweaked the script in a couple small ways, sorry)

Copy link
Member

@tcharding tcharding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 2ae7ca9

@dpc
Copy link
Contributor

dpc commented Mar 31, 2023

For Nix purposes

Did someone mention Nix? I don't see any nix files in this repo.

@apoelstra
Copy link
Member Author

@dpc the relevant nix file is here https://github.com/apoelstra/local-nix-ci/blob/main/rust-bitcoin.rust-secp256k1.check-pr.nix

I could try to check it into this repo, but I'm still iterating on it extremely quickly, and this whole "exhaustively check huge config matrices" strategy is probably not reasonable for most people to do (or to try to do in CI) because it's so CPU intensive.

Copy link
Collaborator

@Kixunil Kixunil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have the time for review now but wanted to see how big it actually is. One thing caught my attention.

# not specified a revision, just copy the directory rather than using 'git clone'.
# This lets us use non-git repos or dirty source trees as secp sources.
if [ "$SECP_VENDOR_CP_NOT_CLONE" == "yes" ]; then
cp -r "$SECP_VENDOR_SECP_REPO" "$DIR"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth excluding .git somehow?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I could just add a rm -rf "$DIR/.git" || true after the copy. Though I think in practice this won't help much, because anybody using this flag on a git repo is probably going to bring a bunch of noise (build artifacts, etc) over anyway, and those are much harder to exclude because I have no command-line .gitignore parser.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was mainly concerned about speed of copying - large repos can make a significant difference. If you're not going to avoid copying in the first place I guess don't bother.

@dpc
Copy link
Contributor

dpc commented Mar 31, 2023

@dpc the relevant nix file is here https://github.com/apoelstra/local-nix-ci/blob/main/rust-bitcoin.rust-secp256k1.check-pr.nix

Oh wow, I how did I not know that the state of Nixification is so advanced here already.

I can't really tell what's going on in depth in this script and that nix file, but looks like building project with a matrix of features and script is for some vendoring.

I'd highly recommend crane as a base for any Rust code building in Nix. It will take care of cached vendoring of deps, and splitting the building into deps-only (for caching) + complete project, many other things, and can be very easily extended.

I was going to suggest Nixification of rust-bitcoin and related projects, but I wasn't sure how appealing that would be. If you're interested check out Fedimint repo and in particular Setting up dev environment to see the dev experience. Everything in Fedimint is nix-flake based, including dev shell, CI, CI cross-compilation, OCI containers, etc.

I thin if you're willing to accept the gospel of Nix :D, then IMO it leads to unparalleled development experience. I would be happy to help, as I became quite proficient in Nix during the last year of building bigger setups and things with it.

@apoelstra
Copy link
Member Author

apoelstra commented Mar 31, 2023

Thanks for the tip! We could definitely open a discussion in rust-bitcoin about a general NIx strategy for this ecosystem. Though right now the strategy is "Andrew, who personally has a 64-thread machine, is saturating that with build commands that are too involved to replicate on CI or on other developers' machines". I am also vacillating on buying a much more powerful machine for this.

I took a quick look at crane. It took me a while to figure out what it actually did, and it looks like it's strictly less powerful than crate2nix because it just does some hack to split up the build process between "the current crate" and "all its dependencies". But crate2nix actually makes a separate derivation for every dependency, making it easy to reuse deps across entirely different projects as well as to efficiently test with multiple lockfiles.

It would also require I learn Flakes to be able to understand how to use it, and I still haven't understood the benefit there.

Edit It also has this weird language about lockfiles which suggests that crates.io crates always have lockfiles, which AFAICT is just not true, or at least, they're not in my ~/.cargo/src directory. I'm not sure what those lockfiles would even do. What if two dependencies have inconsistent lockfiles? How does crane deal with this? It seems like crane is about building binary projects and it's also centered around producing a nix derivation for the project itself.

@dpc
Copy link
Contributor

dpc commented Mar 31, 2023

@apoelstra

Though right now the strategy is "Andrew, who personally has a 64-thread machine, is saturating that with build commands that are too involved to replicate on CI or on other developers' machines"

For Fedimint we're using https://buildjet.com and https://www.cachix.org . Both cost money, thought not a lot and they are well worth it. Cachix has a free tier for open source project which would be enough for a smaller project like a single library with just a handful of dependencies.

I took a quick look at crane. It took me a while to figure out what it actually did, and it looks like it's strictly less powerful than crate2nix because it just does some hack to split up the build process between "the current crate" and "all its dependencies". But crate2nix actually makes a separate derivation for every dependency, making it easy to reuse deps across entirely different projects as well as to efficiently test with multiple lockfiles.

Yes. crane2nix can cache derivations per-dependency, but that comes at the cost of extra complexity and some Cargo features not being supported (AFAIK), because it basically has to replicate bunch of internal cargo-logic.

crane does less work by minimizing the project first to make it "dependency only", doing cargo build and then cargo build on a full source can re-use the cached ./target that rarely changes. It will cache source code downloads per-crate though. For an application-like project with not a lot of features like Fedimint that's perfect - 90% of caching with, 10% of effort.

For library with a large matrix of features, crane2nix might give an advantage, but then the whole thing matter less because the whole build is much, much smaller. I mean for Fedimint we compile (unfortunately) lots of dependencies with ./target hitting ~8G (compressed to 1.5GB) or so and the CI spents time mostly running tests, and not building the code anyway: https://github.com/fedimint/fedimint/actions/runs/4576237045/jobs/8080097222 , mostly because of caching with cachix.

It would also require I learn Flakes to be able to understand how to use it, and I still haven't understood the benefit there.

Flakes works basically like Cargo.toml + Cargo.lock for Nix (lock everything), with far better support for cross-project composability, input locking, development shells, etc. Personally, I couldn't really be productive with Nix until I went all in Flakes.

Edit It also has this weird language about lockfiles which suggests that crates.io crates always have lockfiles, which AFAICT is just not true,

I don't know if that's true, but that's irrelevant. That's FAQ for someone that wants to build someone's elses project in a Nix derivation. For your own project, you're in control of Cargo.lock, and crane can just use the default one, or you can even overwrite them with a custom one, or whatever else.

What if two dependencies have inconsistent lockfiles?

cargo ignores lock files of dependencies, no matter if they exist or not.

@dpc
Copy link
Contributor

dpc commented Mar 31, 2023

How does crane deal with this? It seems like crane is about building binary projects and it's also centered around producing a nix derivation for the project itself.

I would say that the good way to think about crane is that's a small Nix library for running cargo and re-using stuff it generated in ./target between multiple (possibly all crane-based) derivations. That works for cargo test, cargo audit, cargo build and other commands.

So in Fedimint we have a derivation to build dependencies only of the whole workspace, then re-use the ./target it created in a cargo build --workspace, then re-use ./target it created to run all the cargo test and e2e/integration tests, and then run cargo audit and also cargo doc

@apoelstra
Copy link
Member Author

Ok, thanks! I'll keep it in mind whenever I move over to using flakes ... and if I decide I want to lean into using Cargo's logic more.

So far I haven't had any trouble with crate2nix that I wasn't able to quickly patch (and in one case they accepted a PR so I was able to upstream some of my patches). It is much simpler than cargo and less hostile to dependency reuse and determinsim.

@dpc
Copy link
Contributor

dpc commented Mar 31, 2023

So far I haven't had any trouble with crate2nix that I wasn't able to quickly patch (and in one case they accepted a PR so I was able to upstream some of my patches). It is much simpler than cargo and less hostile to dependency reuse and determinsim.

Yeah, if it is working well, you might stick with it. I wasn't aware that you're already using it. And given that rust-bitcoin ecosystem targets old cargo/Rust, you might never really run into anything that wouldn't be well supported already anyway.

If you ever have it included in the repos here, I'd be eager to try it out and get some better understanding on how it works in practice.

Copy link
Member

@sanket1729 sanket1729 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reACK 2ae7ca9

@apoelstra apoelstra merged commit 493eaf7 into rust-bitcoin:master Mar 31, 2023
26 checks passed
@apoelstra apoelstra deleted the 2023-03--vendoring branch March 31, 2023 21:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants