This tool preserves user-deleted repos in long-term storage. It preserves only the source code in Git, not the GitHub data (e.g. pull requests, issues, repo settings).
In an enterprise setting, the source code belongs to the company, not to the users (employees). GitHub Enterprise, however, allows users to delete repos at will, which can destroy intellectual property.
This tool is intended to run on the server holding the snapshots produced by ghe-backup, which is distributed as part of GitHub Enterprise Backup Utils.
$ git clone git@github.com:ghe-tools/github-enterprise-deleted-repos-archival.git
$ cd github-enterprise-deleted-repos-archival
$ npm install
Copy config/default-example.json to config/default.json and edit the values as needed:
$ cp config/default-example.json config/default.json
$ vi config/default.json
To archive recently deleted repositories, run:
$ node src/archive.js --repo=recent
When running for the first time, pick up all repos marked for deletion:
$ node src/archive.js --repo=all
Use the --help switch for more details.
To run the smoke test suite:
$ npm test
To obtain code coverage:
$ npm run cover
Coverage results are stored in the ./coverage folder.
To recover a deleted repo:
- Create a new GitHub repo first.
- Find the repo in the archive of deleted repos, then untar it.
- Import the source code (just like creating a new repo).
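The recovery steps above could be scripted roughly as follows. This is only a sketch under stated assumptions: the function name `restore_repo` is hypothetical, and it assumes the archived tarball contains exactly one top-level directory holding a (bare) Git repository, which this README does not confirm. Only branches and tags are pushed, since other refs may be rejected by the server.

```shell
#!/bin/sh
# Sketch: restore an archived repository into a new, empty repository.
# Assumption: the tarball holds a single top-level Git repository directory.
restore_repo() {
    archive="$1"      # tarball found under dir.archive (path is up to you)
    remote_url="$2"   # clone URL of the newly created, empty repo

    # Scratch directory for the unpacked repository.
    workdir=$(mktemp -d) || return 1

    # Step 2: untar the archived repository.
    tar -xf "$archive" -C "$workdir" || return 1

    # Pick the single top-level directory from the tarball (assumed layout).
    repo_dir=$(find "$workdir" -mindepth 1 -maxdepth 1 -type d | head -n 1)
    [ -n "$repo_dir" ] || return 1

    # Step 3: import the source code by pushing all branches and tags.
    git -C "$repo_dir" push "$remote_url" \
        'refs/heads/*:refs/heads/*' 'refs/tags/*:refs/tags/*'
}
```

A call might look like `restore_repo /path/to/archive/my-repo.tar git@ghe.example.com:org/my-repo.git`, where both arguments are placeholders for your own archive path and repository URL.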
It is best to have a cron job run it every week. For example, if this app is installed in /opt/ghe/archival, the crontab entry would look like this:
02 02 * * 7 root /opt/ghe/archival/check-process.sh archive.js || NODE_PATH=/opt/ghe/archival/node_modules NODE_CONFIG_DIR=/opt/ghe/archival/config /path-to-nodejs/bin/node /opt/ghe/archival/src/archive.js
This runs the archive operation every Sunday at 2:02 am, unless it is already running.
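The crontab entry relies on the shell's `||` operator: the archive command starts only when check-process.sh exits with a non-zero status, that is, when no matching process was found. The contents of check-process.sh are not shown here, so the following is a hypothetical stand-in that illustrates one way such a check could work. The function name `is_running` is an assumption, as is the choice to match only processes whose command name is `node`.

```shell
#!/bin/sh
# Hypothetical stand-in for check-process.sh; the real script may differ.
# Succeeds (exit status 0) if a node process whose arguments mention the
# given script name is running, and fails otherwise.
is_running() {
    # List every process as "<command name> <full argument list>" and let
    # awk look for a node process that references the script name.
    ps -A -o comm=,args= |
        awk -v name="$1" '
            $1 == "node" && index($0, name) { found = 1 }
            END { exit !found }
        '
}

# Same pattern as the crontab entry: start a run only if none is active.
# is_running archive.js || node src/archive.js --repo=recent
```

Restricting the match to `node` processes keeps the check from matching the shell that cron itself spawns, whose command line also contains the script name.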
- dir.snapshots: directory where backup-utils stores snapshots. We look for the current directory under dir.snapshots.
- dir.archive: directory where deleted repositories are copied for archival.
- github.host: hostname of the GitHub Enterprise instance whose deleted repositories are processed.
- github.port: SSH administrative port. Don't change it; it is always 122.
- github.timeout: number of seconds before the SSH connection times out.
- github.username: SSH username.
- github.search-query-all: fetches all repositories that are marked for deletion.
- github.search-query-recent: fetches all repositories that were marked for deletion in the past N days (this number is in the query itself).
- ssh-private-key.file: SSH private key used to connect to GHE.
- ssh-private-key.pass-phrase: passphrase of the SSH private key.
- email.sender: email address to use when sending alert emails.
- email.recipients: list of email addresses to send the alert to.
- log.dir: log directory location.
- log.level: level of detail written to the log file.
- log.retention: number of log files to keep as they rotate (once a week).