Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add local registry overlays #13926

Open
wants to merge 14 commits into
base: master
Choose a base branch
from
Open

Add local registry overlays #13926

wants to merge 14 commits into from

Conversation

jneem
Copy link

@jneem jneem commented May 17, 2024

This PR adds (private to cargo internals) support for local registry overlays, in which you can locally pretend to add packages to remote registries; the local packages will have the same source ids as the remote registry that you're overlaying.

There are two ways to set up these overlays: programmatically using GlobalContext::local_overlays and through the __CARGO_TEST_PACKAGE_CONFUSION_VULNERABILITY_DO_NOT_USE_THIS environment variable. You can't set up these overlays with .cargo/config.

The motivation for this is packaging workspaces. When we're packing a workspace, we'd like to be able to pretend (for lockfile generation and verification) that some workspace packages are already published even though they aren't.

@rustbot
Copy link
Collaborator

rustbot commented May 17, 2024

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @ehuss (or someone else) some time within the next two weeks.

Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (S-waiting-on-review and S-waiting-on-author) stays updated, invoking these commands when appropriate:

  • @rustbot author: the review is finished, PR author should check the comments and take action accordingly
  • @rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

@rustbot rustbot added A-configuration Area: cargo config files and env vars A-testing-cargo-itself Area: cargo's tests S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 17, 2024
if let Some(ref mut p) = self.process_builder {
let env_value = format!("{}={}", url, path);
p.env(
"__CARGO_TEST_PACKAGE_CONFUSION_VULNERABILITY_DO_NOT_USE_THIS",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: The term of art for this kind of vulnerability is "Dependency Confusion". As with so many names in computer science, it's better to have consistency then trying to find a better name. This term was coined in the blog post https://medium.com/@alex.birsan/dependency-confusion-4a5d60fec610 which might be worth linking in comments about how users could easily miss use this.

@Eh2406
Copy link
Contributor

Eh2406 commented May 20, 2024

Catching Eric up on some of the context, Arlo and I discussed the questions from #10948 (comment) at office hours on Thursday. Adding this API, exclusively for internal use, seemed like the least hacky solution.

Cargo.toml Outdated Show resolved Hide resolved
src/cargo/sources/overlay.rs Outdated Show resolved Hide resolved
src/cargo/sources/overlay.rs Show resolved Hide resolved
src/cargo/util/context/mod.rs Show resolved Hide resolved
/// expose this publicly. However, it is useful for some very specific private
/// things, like locally verifying a bunch of packages at a time before any of
/// them have been published.
pub struct OverlayConfusionAttack<'gctx> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Frankly, I think OverlaySource is a better fitting name 🙂

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That name definitely makes more sense, yes. But in office hours someone suggested giving it a scary-looking name to make people less likely to misuse it...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like OverlaySource being the suffix. Maybe DependencyConfusionThreatOverlaySource?

jneem and others added 2 commits May 21, 2024 13:16
Co-authored-by: Tor Hovland <55164+torhovland@users.noreply.github.com>
@jneem jneem mentioned this pull request May 22, 2024
4 tasks
@jneem jneem requested a review from Eh2406 May 23, 2024 15:54
@jneem
Copy link
Author

jneem commented May 23, 2024

Thanks for the review! I think this is ready for another look -- we managed to successfully use this for #10948 as discussed.

Comment on lines 64 to 68
// An older version in the local registry doesn't confuse it.
Package::new("baz", "0.0.1")
.alternative(true)
.local(true)
.publish();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is really testing anything because baz is too low of a version for the version requirement

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The motivation here was to test that the overlay is correctly the union of the versions in the overlay and the overlaid registries. (If it were only returning the overlay versions, it would only return the too-old version and fail to resolve)

Copy link
Contributor

@epage epage May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except this is relying on the user to look at the error and know the fine distinction of whether it says baz@0.0.1 exists

We also should probably have a test the other direction, that the overlay source properly shadows the regular source.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. I'm now testing both directions, and checking the appropriate versions in the stderr.

Comment on lines 111 to 115
Package::new("baz", "0.1.1").dep("baz", "=0.0.1").publish();
Package::new("baz", "0.0.1")
.alternative(true)
.local(true)
.publish();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have the local / selected version depend on the registry version? Thats more like what we'll be dealing with for cargo package

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, but also happy to discuss it further. My understanding was that the behavior of cargo package w.r.t. intra-workspace deps should be exactly the same as if the deps were published before packaging. Resolution should have the option to select the versions that are being packaged simultaneously, but it's also allowed not to.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that should be the behavior but we aren't testing that situation to confirm it. cargo package will involve a local package depending on an overlayed package which may depend on a registry package.

I guess there are also situations where you depend on a registry package which will depend on the overlaid package (if clap_complete used criterion it would both depend on clap directly and through criterion). So maybe its best to cover both cases.

Copy link
Author

@jneem jneem May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, makes sense, I just misunderstood your original comment. I'm now testing both directions.

I guess there are also situations where you depend on a registry package which will depend on the overlaid package

I think the registry package will never require an overlaid package, right? Because the registry package can satisfy all its dependencies from the registry. I added a test where the registry package will pick up an overlaid dependency because it's newer (but compatible with) the registry version of that dependency.

/// affect any newly created `PackageRegistry`s.
///
/// See [`crate::sources::overlay::OverlayConfusionAttack`] for why you shouldn't use this.
pub fn local_overlays(&self) -> RefMut<'_, HashMap<SourceId, PathBuf>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd hope we can avoid putting this on GlobalContext but its hard to say without reviewing the follow up at the same time. Should we defer this to the follow up PR?

My motivation is that it gets messy adding and removing from the Global Context and if we can directly instantiate things, the easier it will be to maintain.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can give a little more context before we decide whether or not to defer. cargo package needs the overlay to take effect during a call to ops::compile. AFAICT there are three possibilities:

  • put the overlay configuration in GlobalContext
  • put the overlay configuration in the ephemeral workspace that cargo package creates
  • have cargo package do something lower-level than ops::compile

The workspace option has the advantage that the overlay configuration doesn't need to be removed (because the workspace is discarded). It seems like it would just require a little work in resolve_ws and resolve_ws_with_opts to get the overlay configuration out of the workspace and insert it into the PackageRegistry

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The workspace seems the most beneficial. Whichever way we go, still seems like it could be split out into a follow up PR

Comment on lines 173 to 174
let src = id.load(self.gctx, yanked_whitelist)?;
return self.overlay_source(id, src, yanked_whitelist);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only call overlay_source on the result of a load

Should we name it fn load and perform the load and apply the overlay? I worry that the current structure could make it easy to overlook applying the overlay while this pattern would make it the default to go through Self::load and we'd be less likely to introduce bugs to this

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately we're already in a fn load. I renamed overlay_source to load_and_overlay, and modified it to do the load and the overlay. I think this is an improvement but not sure it it really addresses the concern.

Comment on lines +88 to +101
for (key, path) in gctx.local_overlays().iter() {
let overlay_id = SourceId::for_local_registry(path)?;
base.overlays.insert(*key, overlay_id);
}

if let Ok(overlay) =
gctx.get_env("__CARGO_TEST_DEPENDENCY_CONFUSION_VULNERABILITY_DO_NOT_USE_THIS")
{
let (url, path) = overlay.split_once('=').ok_or(anyhow::anyhow!(
"invalid overlay format. I won't tell you why; you shouldn't be using it anyway"
))?;
let overlay_id = SourceId::for_local_registry(path.as_ref())?;
base.overlays.insert(SourceId::from_url(url)?, overlay_id);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we move this hack up into the caller? PackageRegistry is the immediate caller but maybe a layer above that? That would help with the GlobalContext. The main downside is it would make it more annoying to support __CARGO_TEST_DEPENDENCY_CONFUSION_VULNERABILITY_DO_NOT_USE_THIS

Thoughts?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if we moved the overlay configuration into Workspace as discussed below then this could be pushed up into resolve_ws and resolve_ws_with_opts

)
}

fn simple_project() -> Project {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's avoid using this. This reduces "duplication" but that is false code duplication. They have the appearance of being the same but they do not share requirements. Instead it affects the code locality (the ability to read tightly coupled code). I'm trying to double check what these tests are doing and jumping back and forth is making this more difficult.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-configuration Area: cargo config files and env vars A-testing-cargo-itself Area: cargo's tests S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants