Native obs-signd support for signing PULP hosted content #2986

praiskup · 2023-03-01T15:36:01Z

It would be nice to have a native support for obs-signd in PULP. The obs-sign
client/server signing method is useful specifically because it is very networking economical.
Calculating the crypto checksums is separated from the signing logic. While the
checksum (I/O intensive task for large RPMs) is done client-side the signatures are
calculated server-side.

It would be nice if this was well integrated into PULP signature mechanisms,
i.e. when RPM is uploaded to PULP (but not yet e.g. to s3 storage) - the RPM was
in-place signed via obs-signd.

HINT: Not entirely sure if strictly related to pulp_rpm, or rather pulpcore.

dralley · 2023-03-02T17:24:16Z

We should probably try to make this functionality generic enough that it could potentially be used with https://github.com/TommyLike/signatrust and not specifically obs-signd

TommyLike · 2023-10-05T08:23:01Z

We should probably try to make this functionality generic enough that it could potentially be used with https://github.com/TommyLike/signatrust and not specifically obs-signd

Thanks, the project has been moved here: https://gitee.com/openeuler/signatrust

TommyLike · 2023-10-05T08:48:38Z

@praiskup With a quick code search and document reading, I think pulp rpm only signs the repo metadata itself, within this background it would be quite easy to support obs-sign or signatrust in pulp rpm, we can use custom
script file which connect the pulp framework to obs sign or signatrust client.

dralley · 2023-10-05T12:48:13Z

With a quick code search and document reading, I think pulp rpm only signs the repo metadata itself

Correct, that's the current state of things

dralley · 2024-01-02T19:03:35Z

So, a quick summary of some of the more tricky requirements / problems here:

The simplest approach would be to sign packages at upload time, and not support signing them afterwards. Before we commit to an implementation we should be absolutely sure that there isn't a sufficient need for an after-upload signing use case that would justify the problems it would case.
Supporting after-upload signing would require potentially re-downloading the whole artifact in order to produce a new signed artifact. It would likely use more disk space as you would end up with both a signed and unsigned package in the artifact store.
The signing service ought to be generic enough to be usable with obs-signd or signatrust. Our existing signing service approach has the right idea, since it uses a user-provided script. It was designed around the notion of detached signatures though, so it may be awkward to use in its current state with attached signatures as with an RPM package. The name of the output, for instance, is SIGNATURE_PATH
This is one more feature (alongside domains) that would be incompatible with the notion of having a package not uploaded directly to a repository, and so we ought to take another look at phasing out that API (if possible).

TommyLike · 2024-01-03T00:47:45Z

Let's moving forward

pedro-psb · 2024-02-05T18:11:56Z

Hello @praiskup, I started working on this and I need some more info.
First, it would be helpful to get @dralley considerations reviewed, especially about the on-upload-only workflow.
Also, please do share any other considerations about your upload workflow, the APIs you'll use, or anything else you consider relevant.

I plan to work on a more general rpm signing solution based on rpm --addsign / --checksig. When that works and is properly tested, it should be ready to be extended by any service-specific script (as it is with metadata signing).

Edit: does that approach seem reasonable to you?

praiskup · 2024-02-08T08:25:46Z

Hello folks, thank you for the update here.

So, a quick summary of some of the more tricky requirements / problems here:

The simplest approach would be to sign packages at upload time, and not
support signing them afterwards. Before we commit to an implementation we
should be absolutely sure that there isn't a sufficient need for an
after-upload signing use case that would justify the problems it would case.

I'm afraid there will be a need for after-upload (re)signing, at least
from time to time. We did a mass resign before, and resigning is quite common for
Koji. Also, for example there's a Copr related problem with rpm &&
"prolonged" signing keys
(we have to solve this somehow, and we are not quite sure what to do right
now, re-signing is one of the options)

We also have a script for re-signing

Supporting after-upload signing would require potentially
re-downloading the whole artifact in order to produce a new signed
artifact. It would likely use more disk space as you would end up
with both a signed and unsigned package in the artifact store.

/me nods. If we had to keep signing done on Copr (early stages), download
and re-upload would be needed anyway.

The signing service ought to be generic enough to be usable with
obs-signd or signatrust. Our existing signing service approach has the
right idea, since it uses a user-provided script. It was designed
around the notion of detached signatures though, so it may be awkward to
use in its current state with attached signatures as with an RPM
package. The name of the output, for instance, is SIGNATURE_PATH

This is one more feature (alongside domains) that would be
incompatible with the notion of having a package not uploaded
directly to a repository, and so we ought to take another look at
phasing out that API (if possible).

Detached RPM signatures would be nice RFE.. But I agree that
upload-but-sign-first feature would be very useful.

Hello @praiskup, I started working on this and I need some more info.
First, it would be helpful to get @dralley considerations reviewed,
especially about the on-upload-only workflow. Also, please do share any
other considerations about your upload workflow, the APIs you'll use, or
anything else you consider relevant.

Well, on-upload-only seems like a good start, considering that re-upload
re-signs the package. I don't think I have some API requirements in mind; we
typically sign the bulk of RPMs by the script (one build produces multiple
RPMs, but obs-sign handles this one-by-one) but I don't think it makes
sense to work on a bulk-upload API...

I plan to work on a more general rpm signing solution based on rpm --addsign / --checksig. When that works and is properly tested, it
should be ready to be extended by any service-specific script (as it is
with metadata signing).

Edit: does that approach seem reasonable to you?

Seems fine. Note that maintaining the keyring is not a trivial thing.
We have a big database of signatures, one per project namespace. And
since projects last long, we keep prolonging those... We will have to
somehow pair the repo with the appropriate signature.

pedro-psb · 2024-02-08T19:50:58Z

I've discussed that feature with the RPM team today and for now, I'll proceed with the on-upload-only support, as there is much to discuss about properly handling the signing of existing content.

Here's a sample workflow that the implementation will enable:

Register a signing-service with the new RpmPackageSigningService class for each key that you want to be available for signing. The same obj-sign script will be used.
Add each registered signing-service to the appropriate Repository through a new field (w/ create/update API). This service will be used for signing the packages uploaded to that repository.
Upload a Package to a Repository with an option to turn on package-signing. The signing service associated with the repository will be used.
As a convenience, the repository may have an option to enable auto-sign for every package uploaded to that repository, so its not necessary to opt-in package-signing per-upload for that repo.

praiskup · 2024-02-09T07:04:51Z

Sounds good, just ad 1., note we have ~122000 keys ATM, so the DB needs to be fairly optimal.

Is there some queue of pending tasks of this kind? So the particular load of tasks is going to be handled by some background workers in a precise order so the rest of the service (API) is not DOSed?

pedro-psb · 2024-02-12T18:04:19Z

note we have ~122000 keys ATM, so the DB needs to be fairly optimal.

We will take care of setting this up so it's efficient at access, thanks for pointing that out.

Is there some queue of pending tasks of this kind? So the particular load of tasks is going to be handled by some background workers in a precise order so the rest of the service (API) is not DOSed?

The signing-service registration is not called via the API, it's done via django commands (see here), so it it's not handled by workers but it shouldn't interfere with the API capacity.
As for the upload API, it is handled by Pulp's tasking system. As long the signing service is explicitly registered before, I believe there won't be a problem with the ordering.

dralley · 2024-02-12T18:11:44Z

The signing service creation in its current state can never be made publicly accessible as it involves handing Pulp a shell script to execute. That's why it's only available as an admin command and isn't part of the API - allowing users to pass arbitrary shell scripts would be a horrible idea.

Key management might be a concern, I agree.

pedro-psb · 2024-03-13T18:39:54Z

Regarding the process of calling the SigningService on uploaded packages, my earlier implementation did that on the API level.

There are some pros and cons of doing it. I'll summarize the discussion we had with pulp team (thanks Daniel, Ina, Grant and Matthias).

Early conclusions

We should do signing on workers, if we have to choose one.

Even though it's not ideal for Storage Usage to have the extra round trips to the backend storage, API Service availability and Security best practices criteria are better satisfied using workers.

Trade-offs

The tradeoffs are complex and may vary on different scenarios. Here are the key takes I got from our discussions:

Aspect	Sign on API service	Sign on workers
Security	Signing Secrets may be reachable from the internet if deployed on API service	Signing Secrets can be proper isolated if api/workers are deployed separately
Cost for API service	Wait for each package to sign [1] + upload to the backend storage	Wait the upload to backend storage only
Storage Usage	Less storage traffic [2]	Traffic overhead of extra package roundtrips and possibly temporary duplicate usage of storage [3]

[1] - Might not be too bad for both big/small rpms, assuming >V3 signing specs is used, which does not require full payload reading.
[2] - The unsigned package is never uploaded to the backend storage.
[3] - The unsigned package is uploaded to storage, downloaded by the workers, then the signed one is uploaded and the unsigned one deleted. The removal of the unsigned package may not be immediate for a more generic implementation (using Artifacts, which will use orphan cleaned-up).

Alternative

I though about supporting both, as one or the other could be best for different scenarios (package size, deployment setup, etc), but it might not be a good idea to expose such low-level implementation details and have to support them both.

Reference

We have a file transfer overhead when using workers because its necessary to upload the file to the backend storage and the worker should download it again. See Making Temporary Files Available to Tasks.

praiskup · 2024-03-13T19:53:19Z

Thank you for the analysis!

I'm not sure what would be the penalty for [1]... while calling /bin/sign from obs-signd on API service would take some time, most of the heavy work is delegated to the obs-sign server (and the API service process would sleep most of the time). We can expect several tens of concurrent signature processes (peaks in the current Copr deployment).

~~Can you elaborate on Security pros/cons?~~ edit: it's there:
With obs-sign, the secrets are elsewhere, not on the API service.

ipanova · 2024-03-14T12:26:20Z

Thanks for the cost benefit analyses. I lean towards signing in workers even through the storage overhead. A good rule of thumb is to not withhold api with anything that is few seconds longer and offload that to the tasks.

ggainey · 2024-03-14T15:31:19Z

Contact with the API workers is impacted by things like idle-connection-timeouts as well - if the call takes "too long", nginx/apache can decide to drop the connection. This leans me also twoards "do it in a task worker" instead of "inside" the API call directly.

dralley · 2024-03-14T15:39:20Z

Great writeup @pedro-psb, everything you've written sounds accurate

One additional note is that if we're talking about S3, additional traffic isn't just an efficiency concern but does come with a monetary cost.

mdellweg · 2024-03-14T15:45:15Z

To me, using signing services only in workers is a must. And this is even more important given the fact that Pulp will never get a grip on any private key material. It must be able to operate using hardware crypto devices. At that point, i think we can safely assume that only the worker processes have access to such a device.

We may even need to delegate singing tasks to accordingly equipped worker nodes.

dralley · 2024-03-14T15:47:37Z

I agree, architecturally speaking I can't really support doing signing outside of a worker.

Side note: we should do some at-scale upload performance testing at the rate of 100-or-so packages a second, roundabouts the maximum we would expect COPR to experience. I don't think we did any testing with upload during the first round of scale testing for COPR.

pedro-psb · 2024-03-28T18:59:50Z

Moving to another topic:

Recently a draft PR was opened by @mdellweg to offer basic keyring management capabilities to Pulp.

This should provide some well-crafted models for storing Pubkeys, Keyrings, and Signatures, among others. I prefer waiting to use it because it would help to overcome some of the limitations (of the current feature draft) that I've listed below.

Obs: In all those cases, I could implement some workaround, but it wouldn't be too generic and we would probably move to this general PGP models later.

SigningService validation overhead

Add a signing service and a key is an atomic operation. To add N keys we need to add N signing services.
Each SigningSerivce registration triggers a validation, so pulp knows it is a valid signing script.
100k registrations will trigger 99.999 unnecessary script validations, as usually the same adapter script would be used for every key.

Signature Management

We do not store package signature information and there is not a standard way to do this.
Handling this might not be very important for the "sign on upload" feature alone, but eventually this information will be valuable (e.g, to avoid having to download a package from storage to check if its signed).

Key Registration

Signing service can only be registered via pulpcore-manager cli utility (by admins with access to the host), and not via REST API. The reason is that would be unsafe to allow script submission to the system.
Because signing-service and key are coupled, this admin-only restriction applies to key registration as well.
I'm not aware of performance trade-offs between using pulpcore-manager vs REST API, but possibly using the API would be more efficient.

pedro-psb · 2024-04-11T18:18:53Z

Related discussion about the "Temporary File Roundtrip" problem, thats it, the problem of the extra traffic required for sending uploaded files to workers: pulp/pulpcore#3827

bersace · 2024-04-12T12:22:38Z

Hi,

We would like to resign package from remote as well as after upload. We may also want to resign because we updated the signing key.

I suggest an /addsign API that accepts an RPM unit href and a signing_service as input. It returns a task id. The tasks return a new unit_href.

Upload could rely transparently on this task if a signing_service is given along the file.

What do you think of this ?

Regards

pedro-psb · 2024-04-16T20:53:12Z

Hey, thanks for sharing the idea. It mostly matches with my plans and with our discussions about this.
A small difference is that I intent to change how we use signing services, so the same service can be used for different keys.

Also, there are some decision to be made about how we'll handle the old unsigned rpm. Just to give a brief context of the problem here: content is immutable, so when we sign something inside Pulp, we are really creating another signed rpm in storage. Also, because of this immutability, we can't directly remove the old unsigned one. So we'll either unlink the unsigned object automatically (so it can be orphan cleaned-up) or we'll let that to the user.

bersace · 2024-04-17T08:00:17Z

I agree. I'm aware that signing creates a new artifacts. I intend to upload an rpm or get it from a remote, sign it and then add the signed artifact to the target repository.

What this does: - Create RpmPackageSigningService - Create RpmTool utility - Add fields to Repository model - Add branch on Package upload to sign the package with the associated SigningService and fingerprint. - Add docs Closes pulp#2986

praiskup added Feature Triage-Needed labels Mar 1, 2023

praiskup changed the title ~~a native obs-signd support for signing PULP hosted content~~ Native obs-signd support for signing PULP hosted content Mar 1, 2023

This was referenced Mar 1, 2023

Allow to set gpgKey value for repository #2903

Closed

Use Pulp as the RPM repository backend fedora-copr/copr#2533

Open

ipanova added the COPR Features desired by the COPR Team label Mar 9, 2023

dralley removed the Triage-Needed label Mar 23, 2023

pedro-psb self-assigned this Feb 5, 2024

pedro-psb linked a pull request Feb 7, 2024 that will close this issue

Add support for Rpm Package signing #3425

Open

12 tasks

pedro-psb mentioned this issue Feb 9, 2024

Add rpm-sign to the BaseImage pulp/pulp-oci-images#593

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Native obs-signd support for signing PULP hosted content #2986

Native obs-signd support for signing PULP hosted content #2986

praiskup commented Mar 1, 2023

dralley commented Mar 2, 2023

TommyLike commented Oct 5, 2023

TommyLike commented Oct 5, 2023

dralley commented Oct 5, 2023

dralley commented Jan 2, 2024

TommyLike commented Jan 3, 2024

pedro-psb commented Feb 5, 2024 •

edited

praiskup commented Feb 8, 2024

pedro-psb commented Feb 8, 2024

praiskup commented Feb 9, 2024

pedro-psb commented Feb 12, 2024

dralley commented Feb 12, 2024

pedro-psb commented Mar 13, 2024

praiskup commented Mar 13, 2024 •

edited

ipanova commented Mar 14, 2024

ggainey commented Mar 14, 2024

dralley commented Mar 14, 2024

mdellweg commented Mar 14, 2024

dralley commented Mar 14, 2024

pedro-psb commented Mar 28, 2024

pedro-psb commented Apr 11, 2024 •

edited

bersace commented Apr 12, 2024 •

edited

pedro-psb commented Apr 16, 2024

bersace commented Apr 17, 2024

Native obs-signd support for signing PULP hosted content #2986

Native obs-signd support for signing PULP hosted content #2986

Comments

praiskup commented Mar 1, 2023

dralley commented Mar 2, 2023

TommyLike commented Oct 5, 2023

TommyLike commented Oct 5, 2023

dralley commented Oct 5, 2023

dralley commented Jan 2, 2024

TommyLike commented Jan 3, 2024

pedro-psb commented Feb 5, 2024 • edited

praiskup commented Feb 8, 2024

pedro-psb commented Feb 8, 2024

praiskup commented Feb 9, 2024

pedro-psb commented Feb 12, 2024

dralley commented Feb 12, 2024

pedro-psb commented Mar 13, 2024

Early conclusions

Trade-offs

Alternative

Reference

praiskup commented Mar 13, 2024 • edited

ipanova commented Mar 14, 2024

ggainey commented Mar 14, 2024

dralley commented Mar 14, 2024

mdellweg commented Mar 14, 2024

dralley commented Mar 14, 2024

pedro-psb commented Mar 28, 2024

SigningService validation overhead

Signature Management

Key Registration

pedro-psb commented Apr 11, 2024 • edited

bersace commented Apr 12, 2024 • edited

pedro-psb commented Apr 16, 2024

bersace commented Apr 17, 2024

pedro-psb commented Feb 5, 2024 •

edited

praiskup commented Mar 13, 2024 •

edited

pedro-psb commented Apr 11, 2024 •

edited

bersace commented Apr 12, 2024 •

edited