Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Native obs-signd support for signing PULP hosted content #2986

Open
praiskup opened this issue Mar 1, 2023 · 24 comments · May be fixed by #3425
Open

Native obs-signd support for signing PULP hosted content #2986

praiskup opened this issue Mar 1, 2023 · 24 comments · May be fixed by #3425
Assignees
Labels
COPR Features desired by the COPR Team Feature

Comments

@praiskup
Copy link

praiskup commented Mar 1, 2023

It would be nice to have a native support for obs-signd in PULP. The obs-sign
client/server signing method is useful specifically because it is very networking economical.
Calculating the crypto checksums is separated from the signing logic. While the
checksum (I/O intensive task for large RPMs) is done client-side the signatures are
calculated server-side.

It would be nice if this was well integrated into PULP signature mechanisms,
i.e. when RPM is uploaded to PULP (but not yet e.g. to s3 storage) - the RPM was
in-place signed via obs-signd.

HINT: Not entirely sure if strictly related to pulp_rpm, or rather pulpcore.

@praiskup praiskup changed the title a native obs-signd support for signing PULP hosted content Native obs-signd support for signing PULP hosted content Mar 1, 2023
@dralley
Copy link
Contributor

dralley commented Mar 2, 2023

We should probably try to make this functionality generic enough that it could potentially be used with https://github.com/TommyLike/signatrust and not specifically obs-signd

@ipanova ipanova added the COPR Features desired by the COPR Team label Mar 9, 2023
@TommyLike
Copy link

We should probably try to make this functionality generic enough that it could potentially be used with https://github.com/TommyLike/signatrust and not specifically obs-signd

Thanks, the project has been moved here: https://gitee.com/openeuler/signatrust

@TommyLike
Copy link

@praiskup With a quick code search and document reading, I think pulp rpm only signs the repo metadata itself, within this background it would be quite easy to support obs-sign or signatrust in pulp rpm, we can use custom
script file which connect the pulp framework to obs sign or signatrust client.

@dralley
Copy link
Contributor

dralley commented Oct 5, 2023

With a quick code search and document reading, I think pulp rpm only signs the repo metadata itself

Correct, that's the current state of things

@dralley
Copy link
Contributor

dralley commented Jan 2, 2024

So, a quick summary of some of the more tricky requirements / problems here:

  • The simplest approach would be to sign packages at upload time, and not support signing them afterwards. Before we commit to an implementation we should be absolutely sure that there isn't a sufficient need for an after-upload signing use case that would justify the problems it would case.
  • Supporting after-upload signing would require potentially re-downloading the whole artifact in order to produce a new signed artifact. It would likely use more disk space as you would end up with both a signed and unsigned package in the artifact store.
  • The signing service ought to be generic enough to be usable with obs-signd or signatrust. Our existing signing service approach has the right idea, since it uses a user-provided script. It was designed around the notion of detached signatures though, so it may be awkward to use in its current state with attached signatures as with an RPM package. The name of the output, for instance, is SIGNATURE_PATH
  • This is one more feature (alongside domains) that would be incompatible with the notion of having a package not uploaded directly to a repository, and so we ought to take another look at phasing out that API (if possible).

@TommyLike
Copy link

Let's moving forward

@pedro-psb pedro-psb self-assigned this Feb 5, 2024
@pedro-psb
Copy link
Member

pedro-psb commented Feb 5, 2024

Hello @praiskup, I started working on this and I need some more info.
First, it would be helpful to get @dralley considerations reviewed, especially about the on-upload-only workflow.
Also, please do share any other considerations about your upload workflow, the APIs you'll use, or anything else you consider relevant.

I plan to work on a more general rpm signing solution based on rpm --addsign / --checksig. When that works and is properly tested, it should be ready to be extended by any service-specific script (as it is with metadata signing).

Edit: does that approach seem reasonable to you?

@pedro-psb pedro-psb linked a pull request Feb 7, 2024 that will close this issue
12 tasks
@praiskup
Copy link
Author

praiskup commented Feb 8, 2024

Hello folks, thank you for the update here.

So, a quick summary of some of the more tricky requirements / problems here:

  • The simplest approach would be to sign packages at upload time, and not
    support signing them afterwards. Before we commit to an implementation we
    should be absolutely sure that there isn't a sufficient need for an
    after-upload signing use case that would justify the problems it would case.

I'm afraid there will be a need for after-upload (re)signing, at least
from time to time. We did a mass resign before, and resigning is quite common for
Koji. Also, for example there's a Copr related problem with rpm &&
"prolonged" signing keys

(we have to solve this somehow, and we are not quite sure what to do right
now, re-signing is one of the options)

We also have a script for re-signing

  • Supporting after-upload signing would require potentially
    re-downloading the whole artifact in order to produce a new signed
    artifact. It would likely use more disk space as you would end up
    with both a signed and unsigned package in the artifact store.

/me nods. If we had to keep signing done on Copr (early stages), download
and re-upload would be needed anyway.

  • The signing service ought to be generic enough to be usable with
    obs-signd or signatrust. Our existing signing service approach has the
    right idea, since it uses a user-provided script. It was designed
    around the notion of detached signatures though, so it may be awkward to
    use in its current state with attached signatures as with an RPM
    package. The name of the output, for instance, is SIGNATURE_PATH

    • This is one more feature (alongside domains) that would be
      incompatible with the notion of having a package not uploaded
      directly to a repository, and so we ought to take another look at
      phasing out that API (if possible).

Detached RPM signatures would be nice RFE.. But I agree that
upload-but-sign-first feature would be very useful.

Hello @praiskup, I started working on this and I need some more info.
First, it would be helpful to get @dralley considerations reviewed,
especially about the on-upload-only workflow. Also, please do share any
other considerations about your upload workflow, the APIs you'll use, or
anything else you consider relevant.

Well, on-upload-only seems like a good start, considering that re-upload
re-signs the package. I don't think I have some API requirements in mind; we
typically sign the bulk of RPMs by the script (one build produces multiple
RPMs, but obs-sign handles this one-by-one) but I don't think it makes
sense to work on a bulk-upload API...

I plan to work on a more general rpm signing solution based on rpm --addsign / --checksig. When that works and is properly tested, it
should be ready to be extended by any service-specific script (as it is
with metadata signing).

Edit: does that approach seem reasonable to you?

Seems fine. Note that maintaining the keyring is not a trivial thing.
We have a big database of signatures, one per project namespace. And
since projects last long, we keep prolonging those... We will have to
somehow pair the repo with the appropriate signature.

@pedro-psb
Copy link
Member

I've discussed that feature with the RPM team today and for now, I'll proceed with the on-upload-only support, as there is much to discuss about properly handling the signing of existing content.

Here's a sample workflow that the implementation will enable:

  1. Register a signing-service with the new RpmPackageSigningService class for each key that you want to be available for signing. The same obj-sign script will be used.
  2. Add each registered signing-service to the appropriate Repository through a new field (w/ create/update API). This service will be used for signing the packages uploaded to that repository.
  3. Upload a Package to a Repository with an option to turn on package-signing. The signing service associated with the repository will be used.
  4. As a convenience, the repository may have an option to enable auto-sign for every package uploaded to that repository, so its not necessary to opt-in package-signing per-upload for that repo.

@praiskup
Copy link
Author

praiskup commented Feb 9, 2024

Sounds good, just ad 1., note we have ~122000 keys ATM, so the DB needs to be fairly optimal.

Is there some queue of pending tasks of this kind? So the particular load of tasks is going to be handled by some background workers in a precise order so the rest of the service (API) is not DOSed?

@pedro-psb
Copy link
Member

note we have ~122000 keys ATM, so the DB needs to be fairly optimal.

We will take care of setting this up so it's efficient at access, thanks for pointing that out.

Is there some queue of pending tasks of this kind? So the particular load of tasks is going to be handled by some background workers in a precise order so the rest of the service (API) is not DOSed?

The signing-service registration is not called via the API, it's done via django commands (see here), so it it's not handled by workers but it shouldn't interfere with the API capacity.
As for the upload API, it is handled by Pulp's tasking system. As long the signing service is explicitly registered before, I believe there won't be a problem with the ordering.

@dralley
Copy link
Contributor

dralley commented Feb 12, 2024

The signing service creation in its current state can never be made publicly accessible as it involves handing Pulp a shell script to execute. That's why it's only available as an admin command and isn't part of the API - allowing users to pass arbitrary shell scripts would be a horrible idea.

Key management might be a concern, I agree.

@pedro-psb
Copy link
Member

Regarding the process of calling the SigningService on uploaded packages, my earlier implementation did that on the API level.

There are some pros and cons of doing it. I'll summarize the discussion we had with pulp team (thanks Daniel, Ina, Grant and Matthias).

Early conclusions

We should do signing on workers, if we have to choose one.

Even though it's not ideal for Storage Usage to have the extra round trips to the backend storage, API Service availability and Security best practices criteria are better satisfied using workers.

Trade-offs

The tradeoffs are complex and may vary on different scenarios. Here are the key takes I got from our discussions:

Aspect Sign on API service Sign on workers
Security Signing Secrets may be reachable from the internet if deployed on API service Signing Secrets can be proper isolated if api/workers are deployed separately
Cost for API service Wait for each package to sign [1] + upload to the backend storage Wait the upload to backend storage only
Storage Usage Less storage traffic [2] Traffic overhead of extra package roundtrips and possibly temporary duplicate usage of storage [3]

[1] - Might not be too bad for both big/small rpms, assuming >V3 signing specs is used, which does not require full payload reading.
[2] - The unsigned package is never uploaded to the backend storage.
[3] - The unsigned package is uploaded to storage, downloaded by the workers, then the signed one is uploaded and the unsigned one deleted. The removal of the unsigned package may not be immediate for a more generic implementation (using Artifacts, which will use orphan cleaned-up).

Alternative

I though about supporting both, as one or the other could be best for different scenarios (package size, deployment setup, etc), but it might not be a good idea to expose such low-level implementation details and have to support them both.

Reference

  • We have a file transfer overhead when using workers because its necessary to upload the file to the backend storage and the worker should download it again. See Making Temporary Files Available to Tasks.

@praiskup
Copy link
Author

praiskup commented Mar 13, 2024

Thank you for the analysis!

I'm not sure what would be the penalty for [1]... while calling /bin/sign from obs-signd on API service would take some time, most of the heavy work is delegated to the obs-sign server (and the API service process would sleep most of the time). We can expect several tens of concurrent signature processes (peaks in the current Copr deployment).

Can you elaborate on Security pros/cons? edit: it's there:
With obs-sign, the secrets are elsewhere, not on the API service.

@ipanova
Copy link
Member

ipanova commented Mar 14, 2024

Thanks for the cost benefit analyses. I lean towards signing in workers even through the storage overhead. A good rule of thumb is to not withhold api with anything that is few seconds longer and offload that to the tasks.

@ggainey
Copy link
Contributor

ggainey commented Mar 14, 2024

Contact with the API workers is impacted by things like idle-connection-timeouts as well - if the call takes "too long", nginx/apache can decide to drop the connection. This leans me also twoards "do it in a task worker" instead of "inside" the API call directly.

@dralley
Copy link
Contributor

dralley commented Mar 14, 2024

Great writeup @pedro-psb, everything you've written sounds accurate

One additional note is that if we're talking about S3, additional traffic isn't just an efficiency concern but does come with a monetary cost.

@mdellweg
Copy link
Member

To me, using signing services only in workers is a must. And this is even more important given the fact that Pulp will never get a grip on any private key material. It must be able to operate using hardware crypto devices. At that point, i think we can safely assume that only the worker processes have access to such a device.

We may even need to delegate singing tasks to accordingly equipped worker nodes.

@dralley
Copy link
Contributor

dralley commented Mar 14, 2024

I agree, architecturally speaking I can't really support doing signing outside of a worker.


Side note: we should do some at-scale upload performance testing at the rate of 100-or-so packages a second, roundabouts the maximum we would expect COPR to experience. I don't think we did any testing with upload during the first round of scale testing for COPR.

@pedro-psb
Copy link
Member

Moving to another topic:

Recently a draft PR was opened by @mdellweg to offer basic keyring management capabilities to Pulp.

This should provide some well-crafted models for storing Pubkeys, Keyrings, and Signatures, among others. I prefer waiting to use it because it would help to overcome some of the limitations (of the current feature draft) that I've listed below.

Obs: In all those cases, I could implement some workaround, but it wouldn't be too generic and we would probably move to this general PGP models later.

SigningService validation overhead

  • Add a signing service and a key is an atomic operation. To add N keys we need to add N signing services.
  • Each SigningSerivce registration triggers a validation, so pulp knows it is a valid signing script.
  • 100k registrations will trigger 99.999 unnecessary script validations, as usually the same adapter script would be used for every key.

Signature Management

  • We do not store package signature information and there is not a standard way to do this.
  • Handling this might not be very important for the "sign on upload" feature alone, but eventually this information will be valuable (e.g, to avoid having to download a package from storage to check if its signed).

Key Registration

  • Signing service can only be registered via pulpcore-manager cli utility (by admins with access to the host), and not via REST API. The reason is that would be unsafe to allow script submission to the system.
  • Because signing-service and key are coupled, this admin-only restriction applies to key registration as well.
  • I'm not aware of performance trade-offs between using pulpcore-manager vs REST API, but possibly using the API would be more efficient.

@pedro-psb
Copy link
Member

pedro-psb commented Apr 11, 2024

Related discussion about the "Temporary File Roundtrip" problem, thats it, the problem of the extra traffic required for sending uploaded files to workers: pulp/pulpcore#3827

@bersace
Copy link

bersace commented Apr 12, 2024

Hi,

We would like to resign package from remote as well as after upload. We may also want to resign because we updated the signing key.

I suggest an /addsign API that accepts an RPM unit href and a signing_service as input. It returns a task id. The tasks return a new unit_href.

Upload could rely transparently on this task if a signing_service is given along the file.

What do you think of this ?

Regards

@pedro-psb
Copy link
Member

Hey, thanks for sharing the idea. It mostly matches with my plans and with our discussions about this.
A small difference is that I intent to change how we use signing services, so the same service can be used for different keys.

Also, there are some decision to be made about how we'll handle the old unsigned rpm. Just to give a brief context of the problem here: content is immutable, so when we sign something inside Pulp, we are really creating another signed rpm in storage. Also, because of this immutability, we can't directly remove the old unsigned one. So we'll either unlink the unsigned object automatically (so it can be orphan cleaned-up) or we'll let that to the user.

@bersace
Copy link

bersace commented Apr 17, 2024

I agree. I'm aware that signing creates a new artifacts. I intend to upload an rpm or get it from a remote, sign it and then add the signed artifact to the target repository.

pedro-psb added a commit to pedro-psb/pulp_rpm that referenced this issue May 22, 2024
What this does:
- Create RpmPackageSigningService
- Create RpmTool utility
- Add fields to Repository model
- Add branch on Package upload to sign the package with the associated
  SigningService and fingerprint.
- Add docs

Closes pulp#2986
pedro-psb added a commit to pedro-psb/pulp_rpm that referenced this issue May 22, 2024
What this does:
- Create RpmPackageSigningService
- Create RpmTool utility
- Add fields to Repository model
- Add branch on Package upload to sign the package with the associated
  SigningService and fingerprint.
- Add docs

Closes pulp#2986
pedro-psb added a commit to pedro-psb/pulp_rpm that referenced this issue May 22, 2024
What this does:
- Create RpmPackageSigningService
- Create RpmTool utility
- Add fields to Repository model
- Add branch on Package upload to sign the package with the associated
  SigningService and fingerprint.
- Add docs

Closes pulp#2986
pedro-psb added a commit to pedro-psb/pulp_rpm that referenced this issue May 23, 2024
What this does:
- Create RpmPackageSigningService
- Create RpmTool utility
- Add fields to Repository model
- Add branch on Package upload to sign the package with the associated
  SigningService and fingerprint.
- Add docs

Closes pulp#2986
pedro-psb added a commit to pedro-psb/pulp_rpm that referenced this issue May 29, 2024
What this does:
- Create RpmPackageSigningService
- Create RpmTool utility
- Add fields to Repository model
- Add branch on Package upload to sign the package with the associated
  SigningService and fingerprint.
- Add docs

Closes pulp#2986
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
COPR Features desired by the COPR Team Feature
Projects
Status: Needs review
Development

Successfully merging a pull request may close this issue.

8 participants