Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Artifact support ? #2313

Closed
lelegard opened this issue Sep 22, 2022 · 8 comments · Fixed by #2319
Closed

Artifact support ? #2313

lelegard opened this issue Sep 22, 2022 · 8 comments · Fixed by #2319

Comments

@lelegard
Copy link

Hello,

Is there any support for artifacts? Meaning request /repos/{owner}/{repo}/actions/artifacts . I could not find anything and I would like to rewrite an existing script which purges artifacts using module requests into one using module github.

lexa pushed a commit to lexa/PyGithub that referenced this issue Sep 27, 2022
Github workflow runs produce downloadable artifacts, add a class to
represent them.
@lexa
Copy link
Contributor

lexa commented Sep 27, 2022

Hi @lelegard, I've made a PR for adding artifacts support: #2313. I will appreciate if you can check it out.

@lelegard
Copy link
Author

lelegard commented Sep 27, 2022

Hi @lexa

Thanks a lot for the prompt improvement, much appreciated.

I have a few questions. To give you an idea of what is done, the script I would like to update is here: https://github.com/tsduck/tsduck/blob/master/.github/maintenance/purge-artifacts.py

This script still uses the requests module. The other scripts in the same directory were already converted to PyGithub.

The required methods for the artifacts script are (with links to the API docs):

  1. List all artifacts of a repo
  2. Delete an artifact

Quickly browsing through the changes in your PR, I do not see the way to achieve this.

  • There is a method to get the artifacts of a workflow run, but no method get_artifacts() in class Repository, for all artifacts in the repository (requirement 1).
  • I see no method to delete an artifact in class Artifact.
  • Given the way it works in the REST API, it could be appropriate to add a method remove_artifact(int_id) in class Repository, consistent with other "remove" methods in that class (requirement 2).

Thanks for your help.

lexa pushed a commit to lexa/PyGithub that referenced this issue Sep 28, 2022
Github workflow runs produce downloadable artifacts, add a class to
represent them.
lexa pushed a commit to lexa/PyGithub that referenced this issue Sep 28, 2022
Add a method to delete a stored artifact from Github Actions.
lexa pushed a commit to lexa/PyGithub that referenced this issue Sep 28, 2022
Implement calls for listing Github Actions artifacts linked to a
repository.

The test runs on an external repository because PyGithub doesn't have
any artifacts in its CI.
lexa pushed a commit to lexa/PyGithub that referenced this issue Sep 28, 2022
@lexa
Copy link
Contributor

lexa commented Sep 28, 2022

Thanks @lelegard for checking out my PR.

There is a method to get the artifacts of a workflow run, but no method get_artifacts() in class Repository, for all artifacts in the repository (requirement 1).

added in the commit lexa@b07bdf3

I see no method to delete an artifact in class Artifact.

added in the commit lexa@fc4e77f

Unfortunately I can't test it, because I don't have any artifacts to delete. Could you please test if for me?

To run the test you need to amend 36f7fef#diff-290729c59eb604d2187e1aa53a36ad692922094ca021239df18330436df52429R45

to set artifact_id and repo_name to the repo and artifact id you want to remove and then run:

pytest  tests/Artifact.py -k testDelete --record

It should pass. As the result of the run you will get a recording file tests/ReplayData/Artifact.testDelete.txt which i need to include in my PR.

Given the way it works in the REST API, it could be appropriate to add a method remove_artifact(int_id) in class Repository, consistent with other "remove" methods in that class (requirement 2).

I think that is enought to have delete method on the Artifact itself. What is the point of having remove_artifact() in Repository if you need to create Artifact to get its id?

@lelegard
Copy link
Author

Hi @lexa

I will have a look at the update.

I think that is enought to have delete method on the Artifact itself. What is the point of having remove_artifact() in Repository if you need to create Artifact to get its id?

Functionally, you are right. This was just a suggestion.

In principle, it would match the REST API where there is a call on the repo, specifying the id of the artifact to delete. But you may consider that the class structure in PyGithub is designed to hide this functional approach.

In practice, this is a way to simplify the code structure when you have to browse through all artifacts of a repo and delete some of them. There is a pitfall in the Github API about "paginated" lists. When you iterate page by page (say 100 items per page), reading the "next page" is not "starting after the last returned item", this is "page number 2", meaning "starting at item number 101". If you iterate on items without deleting or creating any, it does not make a difference. However, when you list all "items" (artifacts for instance), decide which should be deleted and delete them, then you have a problem. Let's take an example: in the first page (items 1 to 100), you decide to delete 10 of them. Then, moving to "next page" means "start at item 101, after the deletion, meaning that items which were numbered 101 to 110 before deletion are missed because they are now numbered 91 to 100.

The problem is real, I ran into it with the REST API and with PyGithub PaginatedList. It took me some time to understand why my purge script randomly missed some artifacts and I had to run it several times to do a complete purge.

So, in practice, you need a first pass where you read the items to delete, store their id in some list. Then, during a second pass, you delete the items by id. This is what my script does and now it always works at the first run.

I also had the issue, initially, with this script which closes issues with a certain label (and then remove the label). Since the issues are browsed using repo.get_issues(labels = ['close pending']), removing the label of one issue while browsing changes the numbering of issues with that label when the next page is transparently fetched by the iterator at some point.

This is a particularly nasty behaviour that I haven't seen documented anywhere, neither in the Github API doc nor in the PyGithub doc. Once you understand it, it makes sense. But it may take some time to understand it.

If is perfectly possible to build and store a list of instances of Artifact during the first pass and then invoke their delete() method in the second pass. I thought it would be faster and less resource-demanding to store only a list of integer ids during the first pass and then delete them by id using the instance of Repository.

However, this is just the rationale for my suggestion. We can live without it and feel free to ignore it if you do not like it.

@lelegard
Copy link
Author

Hi @lexa,

I have converted my purge-artifact script here. It works perfectly.

Forget about the suggestion to provide a delete_artifact method by id in the Repository class. I used a list of Artifact instances.

Currently, I keep the two versions of the scripts, one using the REST API and one using your updated PyGithub. When do you think the updated version of PyGithub will be available in PyPI?

Thanks again @lexa for the update.

@lexa
Copy link
Contributor

lexa commented Sep 29, 2022

Currently, I keep the two versions of the scripts, one using the REST API and one using your updated PyGithub. When do you think the updated version of PyGithub will be available in PyPI?

No idea, but considering that the last activity on this repo was in december 2021 you shouldn't expect that it will ever reach the upstream.

@lelegard
Copy link
Author

No idea, but considering that the last activity on this repo was in december 2021 you shouldn't expect that it will ever reach the upstream.

Too bad... Any idea on who is managing the project these days ?

lexa pushed a commit to lexa/PyGithub that referenced this issue Oct 13, 2022
Github workflow runs produce downloadable artifacts, add a class to
represent them.
lexa pushed a commit to lexa/PyGithub that referenced this issue Oct 13, 2022
Add a method to delete a stored artifact from Github Actions.
lexa pushed a commit to lexa/PyGithub that referenced this issue Oct 13, 2022
Implement calls for listing Github Actions artifacts linked to a
repository.

The test runs on an external repository because PyGithub doesn't have
any artifacts in its CI.
lexa pushed a commit to lexa/PyGithub that referenced this issue Oct 13, 2022
lexa pushed a commit to lexa/PyGithub that referenced this issue Oct 13, 2022
lexa pushed a commit to lexa/PyGithub that referenced this issue Oct 13, 2022
Github workflow runs produce downloadable artifacts, add a class to
represent them.
lexa pushed a commit to lexa/PyGithub that referenced this issue Oct 13, 2022
Add a method to delete a stored artifact from Github Actions.
lexa pushed a commit to lexa/PyGithub that referenced this issue Oct 13, 2022
Implement calls for listing Github Actions artifacts linked to a
repository.

The test runs on an external repository because PyGithub doesn't have
any artifacts in its CI.
lexa pushed a commit to lexa/PyGithub that referenced this issue Oct 13, 2022
lexa pushed a commit to lexa/PyGithub that referenced this issue Oct 13, 2022
sfdye pushed a commit to lexa/PyGithub that referenced this issue Oct 25, 2022
Github workflow runs produce downloadable artifacts, add a class to
represent them.
sfdye pushed a commit to lexa/PyGithub that referenced this issue Oct 25, 2022
Add a method to delete a stored artifact from Github Actions.
sfdye pushed a commit to lexa/PyGithub that referenced this issue Oct 25, 2022
Implement calls for listing Github Actions artifacts linked to a
repository.

The test runs on an external repository because PyGithub doesn't have
any artifacts in its CI.
sfdye pushed a commit to lexa/PyGithub that referenced this issue Oct 25, 2022
sfdye pushed a commit to lexa/PyGithub that referenced this issue Oct 25, 2022
sfdye pushed a commit to lexa/PyGithub that referenced this issue Oct 25, 2022
Github workflow runs produce downloadable artifacts, add a class to
represent them.
sfdye pushed a commit to lexa/PyGithub that referenced this issue Oct 25, 2022
Add a method to delete a stored artifact from Github Actions.
sfdye pushed a commit to lexa/PyGithub that referenced this issue Oct 25, 2022
Implement calls for listing Github Actions artifacts linked to a
repository.

The test runs on an external repository because PyGithub doesn't have
any artifacts in its CI.
sfdye pushed a commit to lexa/PyGithub that referenced this issue Oct 25, 2022
sfdye pushed a commit to lexa/PyGithub that referenced this issue Oct 25, 2022
sfdye pushed a commit that referenced this issue Oct 25, 2022
Co-authored-by: Aleksei Fedotov <aleksei@fedotov.email>
@thburghout
Copy link
Contributor

I noticed that the WorkflowRun.pyi file is still missing the description of get_artifacts().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants