"Exclusion" option for some packages during the repository synchronisation #3469

Open
yunustatli opened this issue Mar 18, 2024 · 7 comments
@yunustatli

yunustatli commented Mar 18, 2024

Is your feature request related to a problem? Please describe.
Our security team does not allow us to synchronize some packages, such as "aircrack, pnscan, masscan", from an upstream source. We also have packet filtering for these packages at the firewall level. As a result, repository synchronization is not possible in this case and fails with an error.

Describe the solution you'd like
A feature to exclude one or more packages would help us, as well as others with the same problem (I have heard about it from at least two colleagues who work at other companies).

At the moment there is the option "Ignore SRPMs" for the rpm-based repositories. As I understand it, the mechanism already exists and should be extended for some specific packages (rpm and deb).

So a new parameter "exclude_packages" can solve the problem.

Describe alternatives you've considered

Additional context

@dkliban dkliban transferred this issue from pulp/pulpcore Mar 19, 2024
@dralley
Contributor

dralley commented Mar 19, 2024

The main issue with adding this feature is that it's suuuper easy to accidentally misuse - you would basically need to ensure that only "leaf" packages that nothing else depends on are blacklisted, but you'd need to do it manually since it would be too expensive to calculate on every sync. And there's no way to do that generically, it'd be a per-plugin feature.

SRPMs don't have that particular issue.

With that said I do hear your problem. Perhaps one workaround would be to use on-demand syncs to avoid downloading the packages during the sync itself?
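The "leaf package" constraint described above can be illustrated with a minimal sketch. It assumes a hypothetical, already-resolved map of package name to required package names; real RPM and deb dependency data (provides, virtual packages, rich dependencies) is far more involved, which is part of why this check would be expensive to run on every sync. The function name and the sample data are made up for illustration and are not part of Pulp.

```python
from typing import Dict, Iterable, Set


def non_leaf_exclusions(
    requires: Dict[str, Set[str]], excluded: Iterable[str]
) -> Dict[str, Set[str]]:
    """Map each excluded package to the kept packages that still require it.

    An empty result means every excluded package is a "leaf" with respect
    to the packages that remain in the repository.
    """
    excluded_set = set(excluded)
    broken: Dict[str, Set[str]] = {}
    for package, deps in requires.items():
        if package in excluded_set:
            continue  # dependents that are themselves excluded do not matter
        for dep in deps & excluded_set:
            broken.setdefault(dep, set()).add(package)
    return broken


# Made-up dependency data, for illustration only.
requires = {
    "masscan": {"libpcap"},
    "pnscan": set(),
    "libpcap": set(),
    "tcpdump": {"libpcap"},
}

# Both packages are leaves here, so nothing is reported.
print(non_leaf_exclusions(requires, ["masscan", "pnscan"]))
# libpcap is still required by packages that would remain in the repository.
print(non_leaf_exclusions(requires, ["libpcap"]))
```

Excluding a package that shows up as a key in the result would leave its dependents uninstallable from the repository.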

@ggainey
Contributor

ggainey commented Mar 20, 2024

If/when we decide to implement, this is prob the place.

Other thoughts:

  • exclude must be incompatible with any of the "mirror" options
  • exclude by Name? By full NEVRA? Both?
  • sync-info will/must contain the exclude params, to help when something refuses to install due to missing RPMs
  • dependency-hell is a likely result - Pulp will not take ownership of dependency problems caused in repos by such an exclusion process. Docs must be VERY CLEAR that use of such a facility is at the user's risk.
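To make the "by Name? By full NEVRA? Both?" question concrete, here is a minimal sketch of a filter that accepts either form and also returns the skipped packages, so they could be reported in the sync info. The `Package` class, the `split_excluded` function, and the version numbers are all hypothetical; this is not how Pulp models packages, only one way the matching rule could behave.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Package:
    """Hypothetical stand-in for an RPM package record."""

    name: str
    epoch: str
    version: str
    release: str
    arch: str

    @property
    def nevra(self) -> str:
        # name-epoch:version-release.arch
        return f"{self.name}-{self.epoch}:{self.version}-{self.release}.{self.arch}"


def split_excluded(
    packages: Iterable[Package], exclude: Iterable[str]
) -> Tuple[List[Package], List[Package]]:
    """Split packages into (kept, skipped).

    A rule matches if it equals the package name (all builds are skipped)
    or the full NEVRA (only that exact build is skipped).
    """
    rules = set(exclude)
    kept: List[Package] = []
    skipped: List[Package] = []
    for pkg in packages:
        if pkg.name in rules or pkg.nevra in rules:
            skipped.append(pkg)
        else:
            kept.append(pkg)
    return kept, skipped


# Made-up package versions, for illustration only.
packages = [
    Package("masscan", "0", "1.3.2", "1.el9", "x86_64"),
    Package("pnscan", "0", "1.14.1", "1.el9", "x86_64"),
    Package("pnscan", "0", "1.14.1", "2.el9", "x86_64"),
    Package("tcpdump", "14", "4.99.0", "9.el9", "x86_64"),
]
exclude = ["masscan", "pnscan-0:1.14.1-1.el9.x86_64"]

kept, skipped = split_excluded(packages, exclude)
print("kept:", [p.nevra for p in kept])
print("skipped:", [p.nevra for p in skipped])
```

Name-based rules are simpler to maintain but keep matching future builds; NEVRA rules are precise but have to be updated for every new release.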

@yunustatli
Author

Thank you very much for the quick responses, I really appreciate it.

I have an idea, even if it may seem a little naive to you. Please forgive me if so, as I am neither a Pulp developer nor a developer at all.

Background:

Currently, after synchronizing repositories, we have the option to remove all desired packages and publish the content views without these packages. As a result, these packages are no longer present in the repository and are also excluded from the repository metadata. Of course, these packages will be synchronized again with the next synchronization because they are missing in the repository.

Perhaps we could use this existing functionality of Pulp, even though I am not familiar with its inner workings, to remove the packages. If Pulp were to read the exclude package list from the repository settings (when the parameter is not null) and remove those packages before initiating the repository synchronization, it might be a viable solution. However, I am unsure whether this approach would be feasible or effective.

I understand there may be concerns regarding dependencies, but as @ggainey mentioned, users can be informed about potential dependency issues in the documentation.

Thank you once again for your time and consideration.

@ggainey
Contributor

ggainey commented Mar 20, 2024

The typical workflow in Pulp is exactly this - you sync the upstream to provide the set of content that's available to the Pulp Admin, and then curate that content by deciding what is/isn't Allowed for your users, only Distributing (making public) the curated version(s).

You can accomplish this in a couple of ways. One is to sync a repo, then copy the content-you-want to a second repo and always Distribute the most-recent Publication in that second repo. Another is to sync a repo, then remove content from the resulting RepositoryVersion to make a new RepositoryVersion, and Distribute that specific version.

The problem you described initially, tho, causes that initial sync-from-remote to fail - which stops this cold.
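The second approach above (removing content from a synced RepositoryVersion to produce a new one) could look roughly like the sketch below. It only builds a request body and does not talk to a server. The `remove_content_units` field name reflects my understanding of the Pulp 3 repository `modify` endpoint and should be checked against the API documentation for your version; the function name, the shape of the `content` records, and the hrefs are assumptions made for illustration.

```python
from typing import Dict, Iterable, List


def build_removal_payload(
    content: Iterable[Dict[str, str]], excluded_names: Iterable[str]
) -> Dict[str, List[str]]:
    """Build a body for a repository "modify" request.

    `content` is assumed to be a list of package records, each with a
    "name" and a "pulp_href", as one might collect from a content listing.
    """
    names = set(excluded_names)
    hrefs = sorted(unit["pulp_href"] for unit in content if unit["name"] in names)
    # Assumed field name for the modify endpoint; verify before use.
    return {"remove_content_units": hrefs}


# Made-up records and hrefs, for illustration only.
content = [
    {"name": "masscan", "pulp_href": "/pulp/api/v3/content/rpm/packages/aaa/"},
    {"name": "tcpdump", "pulp_href": "/pulp/api/v3/content/rpm/packages/bbb/"},
    {"name": "pnscan", "pulp_href": "/pulp/api/v3/content/rpm/packages/ccc/"},
]

payload = build_removal_payload(content, ["masscan", "pnscan"])
print(payload)
```

The resulting new RepositoryVersion would then be published and distributed, while the synced version stays untouched for the next sync.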

@dralley
Contributor

dralley commented Mar 22, 2024

It would not however stop cold if you did an on-demand sync first, because that skips the package download process
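As a rough illustration of the on-demand suggestion, the sketch below builds the settings for a remote whose download policy defers fetching package files. The `policy` field and its values (`immediate`, `on_demand`, `streamed`) are what I understand Pulp 3 remotes to accept; the function name, the remote name, and the URL are placeholders. Note that with an on-demand policy a blocked package would, as I understand it, still appear in the synced metadata, so it would still need to be removed before distribution.

```python
from typing import Dict

# Download policies as I understand them for Pulp 3 remotes; verify for your version.
VALID_POLICIES = {"immediate", "on_demand", "streamed"}


def remote_payload(name: str, url: str, policy: str = "on_demand") -> Dict[str, str]:
    """Build the settings for a remote, defaulting to deferred downloads."""
    if policy not in VALID_POLICIES:
        raise ValueError(f"unknown policy: {policy!r}")
    return {"name": name, "url": url, "policy": policy}


# Placeholder name and URL, for illustration only.
settings = remote_payload("el9-upstream", "https://mirror.example.com/el9/")
print(settings)
```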

@ggainey
Contributor

ggainey commented Mar 22, 2024

> It would not however stop cold if you did an on-demand sync first, because that skips the package download process

Ah! That's an outstanding observation, good point Daniel.

@quba42
Contributor

quba42 commented Apr 4, 2024

Also seems related: #2713
