Purge Git data

Overview

From time to time, a user or GitLabber may push a commit containing data that they later realize they don't want on GitLab.com. The user may delete the branch if able, or rewrite their Git history and force push, but other data may still be left dangling. In those cases, for confidentiality or security reasons, waiting for an eventual garbage collection to get rid of such data may not be sufficient, and the following manual steps may need to be taken:

Checklist

  • Delete Merge Requests. For example, if a security Merge Request was opened on GitLab.com instead of on dev.gitlab.org (as specified in our Security Releases documentation), it's important to ensure it's deleted to avoid premature disclosure of vulnerabilities. Merge Requests can only be deleted by project owners or admins, through the UI or the API.
  • Delete pipelines. CI/CD pipelines and builds may still retain data such as commit names. This can be done via the API (https://docs.gitlab.com/ee/api/pipelines.html#delete-a-pipeline).
  • Trigger a full Garbage Collection run on the project. Unfortunately, manual housekeeping through the UI doesn't reliably trigger a full GC (see https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/6960), so you'll need to run the following in a production Rails console, with the relevant project_id: Projects::HousekeepingService.new(Project.find(project_id), :gc).execute
  • Check that the objects are gone. For example, run git cat-file -e <commit_id> in the repository and check the exit status: zero means the object still exists, non-zero means it does not.
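
The final check in the list above can be scripted when several objects need verifying. The following is a minimal sketch, not an official tool: the function name check_purged and its arguments are hypothetical, and it assumes you have shell access to a copy of the repository and that git is on the PATH.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether each object ID still exists in a repo.
# Usage: check_purged <path-to-repo> <object_id>...
# Returns 0 only if none of the listed objects are present any more.
check_purged() {
  local repo="$1"; shift
  local remaining=0
  local oid
  for oid in "$@"; do
    # `git cat-file -e` exits 0 if the object exists and is valid,
    # and non-zero otherwise.
    if git -C "$repo" cat-file -e "$oid" 2>/dev/null; then
      echo "STILL PRESENT: $oid"
      remaining=1
    else
      echo "gone: $oid"
    fi
  done
  return "$remaining"
}
```

For example, check_purged /path/to/repo.git <commit_id> prints one line per object and returns a non-zero status if any of them can still be read.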

If a full GC run doesn't delete the commits, you can take the following, more aggressive steps after logging in to the file server that contains the repository:

NOTE: DO NOT RUN THE FOLLOWING COMMANDS ON A POOL REPOSITORY, i.e. make sure the repo path doesn't contain /@pools/. See https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/29139/diffs
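
The pool check in the note above can be made mechanical before any manual command is run. This is only a guard sketched under assumptions, not part of the purge steps themselves: the function names is_pool_repo and refuse_if_pool are hypothetical, and the only rule applied is the one stated in the note (the path must not contain /@pools/).

```shell
#!/usr/bin/env bash
# Hypothetical guard: succeed (exit status 0) if the given repository path
# lies inside pool storage, i.e. it contains the /@pools/ path component.
is_pool_repo() {
  case "$1" in
    */@pools/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper: print a warning and return non-zero for pool
# repositories, so a calling script can stop before touching them.
refuse_if_pool() {
  if is_pool_repo "$1"; then
    echo "Refusing to continue: $1 is a pool repository" >&2
    return 1
  fi
  return 0
}
```

A session could start with refuse_if_pool "$repo_path" || exit 1, where repo_path is a placeholder for the on-disk path of the repository being purged.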

If the objects persist after these steps, you may be dealing with a pooled repository, and further manual action is required. See https://gitlab.com/gitlab-org/gitlab-ce/blob/master/doc/development/git_object_deduplication.md for more information, and reach out to the development team for advice.