[Feature] rencrypt all files (only if the decrypted file changed) #135

Open
badele opened this issue Apr 19, 2022 · 6 comments

Comments

@badele

badele commented Apr 19, 2022

I use the following workflow, driven by targets in my Makefile (see below):

  • git checkout/pull project
  • unlock all files with make secret-unlock
  • work on my project
  • lock all files with make secret-lock secret-check
  • commit all my work

But if the decrypted files have not changed and I run make secret-lock, the content of the encrypted files still changes. Would it be possible to store the SHA1 of each file (in the encrypted file, or in a temporary hidden file during the decryption step), re-encrypt only if the SHA1 has changed, and restore the encrypted file to its git state if the SHA1 is identical?

secret-lock: requirements-check ## lock all files from repository
    agebox reencrypt

secret-unlock: requirements-check ## unlock all files from repository
    agebox decrypt --all

secret-check: requirements-check ## Verify all secrets are in the agebox vault
    agebox validate --no-decrypt

@jpluscplusm

@badele Heya! I have exactly this problem as well :-)

The root cause, I believe, is that agebox uses an encryption mechanism (or at least a mode of operating that mechanism) that doesn't produce repeatable, deterministic encryption. This is generally a good thing, as it defeats some known-plaintext attacks. But it has the downside that every time a file's contents are re-encrypted, the cypher text changes, and because git sees the file as binary, it treats it as 100% modified.

The git-crypt tool notes that it works around this problem, but in the absence of similar crypto nuance at the code level (which is well beyond my skills!), I think there's a possible fix we could implement, here:

I figure that because @slok has given us #143, it's now viable to write a script that:

  • reads the decryption passphrase from the user, once;
  • sets up an envvar containing the passphrase (PPHRASE);
  • loops over the output of agebox validate (or /perhaps/ the contents of the agebox config file?):
    • for each Invalid secret: \"foo\" decrypted secret exists, grab the secret-id= value
    • we know the plaintext is present at path $SECRET_ID
    • the cypher text is deleted from disk, and git knows that it was at $SECRET_ID.agebox
    • create a temp file with an .agebox extension, e.g. tmp123456.agebox
    • use some git magic to cat the encrypted, in-git-storage version of the secret into the temp file:
      • git cat-file -p $(git ls-files --stage $SECRET_ID.agebox | awk '{print $2}') > tmp123456.agebox
    • compare the current plaintext on disk with the result of de-crypting the temp file with agebox cat:
      • diff -q $SECRET_ID <(agebox cat -i keys --passphrase-env PPHRASE tmp123456.agebox) >/dev/null
    • if the diff exits with code 0, there were no differences: git restore $SECRET_ID.agebox and delete $SECRET_ID
    • if the diff exits with code >=1, there were differences; leave the plaintext file alone
  • Now, after the loop ends, the only plaintexts on disk are the modified ones: it's now "safe" to run agebox reencrypt
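
The steps above could be sketched roughly as follows. This is a minimal sketch, not a tested tool: the function name restore_unchanged is hypothetical, the secret IDs are passed in as arguments rather than parsed from the agebox validate output, and the agebox cat flags are copied from the command quoted above and assume PPHRASE is already exported.

```shell
#!/usr/bin/env bash
# Hypothetical helper: for each secret ID given as an argument, restore the
# staged .agebox file when the plaintext on disk is unchanged.
# Assumes PPHRASE is exported and `agebox cat` accepts the flags quoted above.
restore_unchanged() {
  local id blob tmpdir
  tmpdir=$(mktemp -d) || return 1
  for id in "$@"; do
    [ -f "$id" ] || continue                      # no plaintext on disk
    # Object hash of the encrypted file as recorded in the git index.
    blob=$(git ls-files --stage -- "$id.agebox" | awk '{print $2}')
    [ -n "$blob" ] || continue                    # not tracked by git
    git cat-file -p "$blob" > "$tmpdir/committed.agebox" || continue
    # pipefail makes a failed decryption count as "different".
    if (set -o pipefail
        agebox cat -i keys --passphrase-env PPHRASE "$tmpdir/committed.agebox" |
          cmp -s - "$id"); then
      git restore -- "$id.agebox" && rm -- "$id"  # unchanged: keep old cypher text
    fi
  done
  rm -rf -- "$tmpdir"
}
```

After something like restore_unchanged "$SECRET_ID" for each decrypted secret, only the modified plaintexts should remain on disk, at which point agebox reencrypt touches just those files.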

Ok, this is a bit hacky! It sticks a bunch of things together, and could probably use some refinement :-)

But I think that it's at least viable, and by writing it we'll probably come up with some feature requests for @slok, if there're ways to make the Developer Experience nicer :-)

@badele
Author

badele commented May 9, 2022

Thanks @jpluscplusm for your response. I think another solution could be developed:

When I run agebox decrypt --all, it stores the SHA1 of each file in the .ageboxreg.yml file (section file_ids), and when I run agebox reencrypt, it re-encrypts only the files that changed (verified against the previous SHA1).

I think this feature could be developed easily? (Sorry, I'm not a Go developer.)

@aca

aca commented May 23, 2022

I wrote a quick small script to achieve this (not sure I did it the right way).

#!/usr/bin/env elvish

cd (or (e:git rev-parse --show-toplevel 2>/dev/null) (echo "."))
cat .ageboxreg.yml | y2j | jq '.file_ids[]' -r | from-lines | each { |x| nop ?(ls $x 2>/dev/null) } | from-lines | each { |x| agebox encrypt $x }

@Threnklyn

Maybe agebox should not delete the original encrypted .agebox files, and should instead compare saved hashes against the unencrypted files.
If a file is modified, the hash will indicate it and the file can be re-encrypted. If a file is not modified, the hashes will match, so we just delete the unencrypted file and leave file.agebox untouched.
The hashes can be saved in our .ageboxreg.yml, and we only need to commit changed files to git.

@jpluscplusm

@Threnklyn I think there are a couple of nuances to work through, conceptually:

  1. If an unencrypted hash of the plain text is ever stored in a file that is intended to be stored in version control, then some real thought has to go into potential data leakage scenarios. For example, think about very short plaintexts ("true" vs "false") in files that are /named/ in a way which, when combined with the contents of the file, leaks too much to an attacker. Having a deterministic hash visible in plaintext changes the threat model: either that has to be mitigated, or the user has to be warned not to use the tool in a vulnerable way.

  2. Is the stored hash of the plain text or the cypher text? If the former, then the comparison operation needs access to the private key. If the latter, then the sequence of operations after noticing a difference needs to be very carefully planned, and special attention given to the fact that agebox has a stated aim of being VCS-agnostic.
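
To make the first point concrete, here is a small sketch of how little work it takes to reverse an unsalted hash of a short plaintext. The function name guess_plaintext is hypothetical, and SHA-256 via GNU sha256sum is an assumption for illustration; the same holds for SHA1 or any other plain, deterministic hash.

```shell
#!/usr/bin/env bash
# Hypothetical demo: recover a short plaintext from its published hash
# by hashing a handful of likely candidates and comparing.
guess_plaintext() {
  local leaked=$1 candidate digest
  shift
  for candidate in "$@"; do
    digest=$(printf '%s' "$candidate" | sha256sum | awk '{print $1}')
    if [ "$digest" = "$leaked" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# An attacker who sees only the hash of a file containing "false":
leaked=$(printf '%s' "false" | sha256sum | awk '{print $1}')
guess_plaintext "$leaked" true false   # prints "false"
```

The candidate list only has to contain the real value, which is why short or low-entropy secrets are the risky case.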

@Threnklyn

We don't need to store the hash of the plain file in git. Save it temporarily after decryption, until the file is encrypted again.
So the hashes are only available while the file is unencrypted, and the encrypted agebox file is only written to if the stored hash and the plain text file mismatch.

You are absolutely right regarding saving the hash in git
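
As a rough illustration of that idea outside of agebox itself, the hashes could be kept in a local, git-ignored manifest. This is only a sketch under assumptions: the manifest name .agebox-hashes and the functions record_hashes and changed_files are hypothetical, not agebox features, and it relies on GNU sha256sum.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the manifest stays outside version control.
MANIFEST=${MANIFEST:-.agebox-hashes}   # assumed name, should be git-ignored

# Call right after decrypting: remember the hash of every plaintext given.
record_hashes() {
  sha256sum -- "$@" > "$MANIFEST"
}

# Call right before encrypting: print the plaintexts whose content changed.
changed_files() {
  # In the C locale, sha256sum --check prints "<path>: FAILED" per mismatch.
  LC_ALL=C sha256sum --check --quiet "$MANIFEST" 2>/dev/null |
    sed -n 's/: FAILED$//p'
}
```

Only the paths printed by changed_files would need to be encrypted again; the rest could have their plaintext deleted and their existing .agebox file left as it is in git. The manifest should be deleted once everything is locked, so the hashes exist only while the files are unencrypted.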
