Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cargo package --workspace is not very useful #10948

Open
conradludgate opened this issue Aug 8, 2022 · 12 comments · May be fixed by #13947
Open

cargo package --workspace is not very useful #10948

conradludgate opened this issue Aug 8, 2022 · 12 comments · May be fixed by #13947
Labels
A-workspaces Area: workspaces C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` Command-package S-accepted Status: Issue or feature is accepted, and has a team member available to help mentor or review

Comments

@conradludgate
Copy link

Problem

Let's say you have a workspace

workspace/
  package_a/
    Cargo.toml (name="a", version="0.1.0")
  package_b/
    Cargo.toml (name="a", version="0.1.0", [dependencies] a="0.1.0")
  Cargo.toml (members = ["package_a","package_b"])

If you have already published a to crates.io, then cargo package --workspace will complete successfully.

Now, you update a to 0.1.1, and update b to use that new minimum version. If you try cargo package --workspace again, it will no longer work. You will get an error that 0.1.1 was not found.

This happens because cargo package makes a dummy package for your verification and it strips out all workspace and path information. This means it tries to retrieve the a 0.1.1 from the registry, which it fails to do.

Proposed Solution

If you have a workspace where some package b depends on another package a, where a is both specified by a version AND a path, AND that path is within the workspace members, then the following will happen:

cargo package --workspace

will make the new dummy project to verify the crates, but it will leave in the path information for a within b.

Notes

In my experience, I've never had a workspace where all crates are independent. There's always at least one that depends on another crate. When managing private registries, it's not uncommon to cargo package and upload the .crate file manually.

Currently, it's necessary to package each workspace member individually and upload them in the order they depend.

Ideally, we can run a single

cargo package --workspace

after bumping all versions, then get all of the .crate files and upload them in a batch.

@conradludgate conradludgate added the C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` label Aug 8, 2022
@goodSyntax808
Copy link

This sounds very useful!

@goodSyntax808
Copy link

goodSyntax808 commented Aug 11, 2022

I took a look. First time contributor to cargo so just trying to find the relevant parts of the codebase. It looks to me like build_lock() in src/cargo/ops/cargo_package.rs, then TomlManifest::prepare_for_publish() in src/cargo/util/toml/mod.rs might be where the logic of @conradludgate described may be:

This happens because cargo package makes a dummy package for your verification and it strips out all workspace and path information. This means it tries to retrieve the a 0.1.1 from the registry, which it fails to do.

So I believe that here we is where we would need to implement @conradludgate 's proposed solution, and to do so in the order the crates depend on another.

This is after a quick look at the code for the very first time ever, so I may be off base. I'm going to keep looking and try to confirm what I think. Also I'm on Zulip if people want to chat/mentor there

EDIT: After this Zulip discussion, due to the limited bandwidth of cargo team members new features should be accepted first before opening a PR. I am going to work on some issues raised in that thread that already have momentum to try my first contribution

@weihanglo
Copy link
Member

One background knowledge: cargo publish is roughly cargo package + call crates.io API. Therefore, some shared code path between cargo publish and cargo package might affect each other. The requirement of a dependency existence on a registry might come from what cargo publish needs: a .crate file is supposed to be distributable.

One thing a bit weird is that packaging a crate with all verifications but not publishing the path dependencies. If cargo permit that, it make me feel like we try to convince cargo that our build is verified but only locally. As aforementioned a .crate file is supposed to be distributable. This rule may not be true anymore if doing so. Maybe we need to think more on whether enabling packaging with unpublished path dependencies is on the right direction, but I'd say the issue you have does exist and people hit it.

At this time being, a good tool for releasing crates is cargo-release, created by sunny87 and epage. I haven't tried it but I believe it works. Perhaps @epage can give you more experiences on this topic :)

@weihanglo
Copy link
Member

BTW, this is somehow related to #9260 (comment). There are more discussions in the wild regarding a more sequential and smooth packaging/publishing process, though I cannot find them now.

@conradludgate
Copy link
Author

Cargo release is a good tool for workspaces. Unfortunately, we can't use it as-is since we don't have the ability to publish using cargo. However, its something we could fork and fix to work with our own registry

@epage epage added the S-accepted Status: Issue or feature is accepted, and has a team member available to help mentor or review label May 6, 2024
@torhovland
Copy link
Contributor

torhovland commented May 9, 2024

@jneem and I are looking into this. Some of the aspects we are considering are:

  • It's not so much that cargo package makes a dummy package with paths stripped, but the actual package that will get published has its paths stripped. In order to do verification when some of the (workspace) dependencies are not yet published, we will either have to do verification before we strip the paths, or we will have to create a (dummy) package just for verification, with the paths reinstated.
  • Whether or not to package one crate at a time (like how it works today), or to first prepare a workspace with all paths intact, and then verify each crate. The latter case would be simpler when there are dependency chains across the workspace.
  • Can we assume that all intra-workspace dependencies would be for the latest version, i.e. the path dependency? Or do we need to allow that some crates can depend on a specific version of another workspace crate? It would simplify things if cargo package could just throw an error if all workspace dependencies and current crate versions don't match up. In the case where you do want to depend on older versions, maybe you wouldn't try to publish the workspace as a whole anyway, but would rather publish individual crates?
  • Currently, Cargo does not allow packaging and publishing a crate that depends on another crate unless that dependency is already published. With the change discussed in this issue, it's possible to package both crates and then only publish one of them. Would we need to add safeguards to cargo publish where it confirms that all dependencies exist (basically another round of verification)? If so, we also need to make sure it publishes the crates in the correct order. But this is really a separate concern (multi-package publishing).

If anybody has opinions on this, feel free to chime in.

@epage
Copy link
Contributor

epage commented May 10, 2024

@torhovland taking on a task like this, you might want to coordinate more with the Cargo team. For example, we have Office Hours.

One way this could possibly be split up is

  • Implement this for --no-verify
  • Implement verification

It's not so much that cargo package makes a dummy package with paths stripped, but the actual package that will get published has its paths stripped. In order to do verification when some of the (workspace) dependencies are not yet published, we will either have to do verification before we strip the paths, or we will have to create a (dummy) package just for verification, with the paths reinstated.

For verification, we should be verifying the generated .crate, rather than an intermediate to ensure maximum verification.

I wonder if we can inject patches rather than changing anything about the code. This can all be done in-memory, rather than writing it out to disk.

Whether or not to package one crate at a time (like how it works today), or to first prepare a workspace with all paths intact, and then verify each crate. The latter case would be simpler when there are dependency chains across the workspace.

I have no preference between whether we package+verify one at a time or package all then verify all. We should likely do separate compilation per verify though. #5931 would speed up the compile times for this.

Can we assume that all intra-workspace dependencies would be for the latest version, i.e. the path dependency? Or do we need to allow that some crates can depend on a specific version of another workspace crate? It would simplify things if cargo package could just throw an error if all workspace dependencies and current crate versions don't match up. In the case where you do want to depend on older versions, maybe you wouldn't try to publish the workspace as a whole anyway, but would rather publish individual crates?

There are times to depend on old versions of packages.

  • Dev-dependencies for comparing across versions (usually for major versions)
  • Semver hack (for major versions)
  • You can have access to a registry variant of a package through transitive dependencies, either semver compatible or not.

Currently, Cargo does not allow packaging and publishing a crate that depends on another crate unless that dependency is already published. With the change discussed in this issue, it's possible to package both crates and then only publish one of them. Would we need to add safeguards to cargo publish where it confirms that all dependencies exist (basically another round of verification)? If so, we also need to make sure it publishes the crates in the correct order. But this is really a separate concern (#1169 (comment)).

This isn't relevant to this issue but the follow up one. Let's keep the conversations in each Issue focused and move this over to there.

@torhovland
Copy link
Contributor

Thanks for your input, it's been noted.

@torhovland taking on a task like this, you might want to coordinate more with the Cargo team. For example, we have Office Hours.

Sure, we will show up there.

The rough idea we have so far is:

  1. Determine all intra-workspace path dependencies.
  2. Package each crate using the existing code (without trying to compile/build it, or otherwise resolve online dependencies).
  3. Unpack all packaged crates in a temporary directory, giving each one a random dir name.
  4. Go through all workspace dependencies from step 1 and fix them up in each crate. Can we do this in-memory?
  5. See if this builds.

@torhovland
Copy link
Contributor

Running into a chicken-and-egg problem with the lock files:

When packaging, Cargo strips away all path dependencies and generates lock files by expecting to find dependent packages online. So that elements like this can be put into the lock file:

[[package]]
name = "my-dep"
version = "0.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "780f1cebed1629e4753a1a38a3c72d30b97ec044f0aef68cb26650a3c5cf363c"

But when generating a lock file using just a local path dependency, we do not get any source or checksum. While we could add on a source and a checksum after the fact, we can't really assume that the dependency will end up on crates.io. But we also cannot publish a crate that doesn't specify where to find its dependencies.

So it seems that packaging and publishing are strongly coupled, and there's no way around packaging+publishing one crate at a time.

Unless we introduced something like source = same-as-this-one.

Any ideas?

@torhovland
Copy link
Contributor

A possible solution is to let cargo package have the same --index option that cargo publish has for indicating which registry to use. So you'll use that to indicate which registry you intend to publish to.

@torhovland
Copy link
Contributor

Assuming we could use the --index option suggested above, we still need to figure out how to modify the generated lock files to include a source and a checksum. Ideally, we would like to modify the new_resolve in build_lock() before it gets serialised. But a Resolve seems quite immutable.

We could of course try to modify the lock file after serialisation, but that doesn't seem very nice. The alternative is to modify the resolver code so it can treat a crate as if it was pulled from a registry.

@epage
Copy link
Contributor

epage commented May 15, 2024

We have two problem steps in cargo package --workspace

  • Generating the lockfile for each .crate: path dependencies won't be in the registry yet, so this will fail.
    • Above they talked about forcing path dependencies to be used during the lockfile generation and then patching up the lockfile afterwards with the correct source and checksum
    • In theory, we could do some kind of decoration of crates.io to allow this in-memory...
    • Regardless of the method we use to workaround that, to generate the source, it seems like we need to know where the .crate files are intended for so we'd need --index / --registry flags
  • Verify step for each .crate: the decompressed .crate files will need to depend on each other, rather than their registry, to build
    • We could do in-memory patches
    • Maybe there are alternatives that build on what we did for the lockfile?

@Eh2406 or @arlosi any thoughts or tips on this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-workspaces Area: workspaces C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` Command-package S-accepted Status: Issue or feature is accepted, and has a team member available to help mentor or review
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants