Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PEP458: Update TUF repository metadata on project index change #15815

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

lukpueh
Copy link
Contributor

@lukpueh lukpueh commented Apr 18, 2024

Part 3 in a series of PRs to integrate Repository Service for TUF (RSTUF) with Warehouse for PEP 458 adoption. Previous PR was #15484

This PR adds a new tuf subpackage with functions to talk to the RSTUF API, in order to capture project changes in TUF repository metadata. Relevant Warehouse views -- file upload and project management (remove, yank) -- are updated to call the function.

This PR also updates the so far unused render_simple_detail function for this purpose.


Discussion

There are a few open question, for which I'd appreciate input from folks, who are more familiar with Warehouse.

  • Completeness
    The PR should trigger TUF metadata updates in all relevant views, but someone needs to confirm. Basically, we need to update TUF metadata whenever a project change would result in a different project simple detail file, e.g. on distribution file upload, file removal, or yank. I wonder, if we could register a db hook for those changes, e.g. after_commit, which would also give us better certainty that the db change isn't rolled back.

  • Async task handling
    Metadata update tasks in RSTUF are async. This PR busy polls the RSTUF task status API until the corresponding metadata update task has finished. I'm sure, Warehouse has better ways of dealing with async tasks.

  • Error handling
    This means setting up proper retry mechanism for temporary errors, and figuring out user feedback and recovery for permanent errors.

The simple detail file size is to be included in TUF metadata,
along with its path and hash.

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Add new TUF subpackage with functions to talk to the RSTUF API,
in order to update TUF repository metadata when a project changes.

The function requires RSTUF to be bootstrapped (see `make inittuf`), and
can additionally be toggled via setting:
- disabled by default
- enabled in dev environment

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Relevant views are "file upload", "remove" (file, release, project),
"yank".

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
@@ -64,6 +64,9 @@ TWOFACTORMANDATE_AVAILABLE=true
TWOFACTORMANDATE_ENABLED=true
OIDC_AUDIENCE=pypi

TUF_RSTUF_API_URL="http://rstuf-api"
TUF_ENABLED=true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw. in RubyGems.org we use RSTUF_API_URL and we enable it just by presence. Not sure if worth to keep it similar.

https://github.com/rubygems/rubygems.org/blob/ad1269b77e3ac83edac6a8f399125ba9fec0280d/config/initializers/rstuf.rb#L3C9-L7


Raises RuntimeError, if task fails.
"""
while True:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this in theory of state not being changed infinite loop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is just a stopgap solution. We should definitely do better. See discussion item about async task handling in PR description.

@lukpueh
Copy link
Contributor Author

lukpueh commented May 2, 2024

Discussion

There are a few open question, for which I'd appreciate input from folks, who are more familiar with Warehouse.

...

Any chance to get some feedback from Warehouse maintainers? Also happy to arrange a call to discuss in person. cc @ewdurbin

On a side-note, we released v1.0.0 of the crypto interface used by RSTUF earlier today, and it includes a "VaultSigner", tailored for signing TUF metadata with a HashiCorp Vault key. :)

@ewdurbin
Copy link
Member

ewdurbin commented Jun 3, 2024

  • Completeness
    The PR should trigger TUF metadata updates in all relevant views, but someone needs to confirm. Basically, we need to update TUF metadata whenever a project change would result in a different project simple detail file, e.g. on distribution file upload, file removal, or yank. I wonder, if we could register a db hook for those changes, e.g. after_commit, which would also give us better certainty that the db change isn't rolled back.

It may be worth considering the machinery that handles our CDN cache purges (see warehouse/cache/origin) though it is admittedly opaque and a little convoluted to store "purge keys" on the request for handling after flush. This is woefully undocumented, but has been functionally reliable for a long time.

  • Async task handling
    Metadata update tasks in RSTUF are async. This PR busy polls the RSTUF task status API until the corresponding metadata update task has finished. I'm sure, Warehouse has better ways of dealing with async tasks.

Yes, look for usage of the @tasks.task decorator throughout the codebase.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants