Implementing Hashicorp Vault in a transparent manner #625
Comments
Anybody?
Is it possible to inherit from one of the existing storage backends and then override the necessary methods? I guess I'm wondering if a new mechanism is required at all. Perhaps it would be simpler to add an empty method to act as a hook for child classes. It will be easier for me to analyze the proposal with some code to look at.
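To make the "empty hook method" idea concrete, here is a minimal sketch in plain Python. Nothing in it is django-storages code: `BaseStorage`, `transform_on_write`, `transform_on_read` and `ReversingStorage` are hypothetical names, the "bucket" is an in-memory dict, and byte reversal only stands in for real encryption.

```python
import io


class BaseStorage:
    """Hypothetical stand-in for an existing backend (not a django-storages class)."""

    def __init__(self):
        self._blobs = {}  # in-memory "bucket" so the sketch runs without S3

    # Empty hooks: by default the bytes pass through unchanged.
    def transform_on_write(self, name, data):
        return data

    def transform_on_read(self, name, data):
        return data

    def save(self, name, content):
        # The hook sees the raw bytes just before they are stored.
        self._blobs[name] = self.transform_on_write(name, content.read())
        return name

    def open(self, name):
        # The hook sees the stored bytes just before they are returned.
        return io.BytesIO(self.transform_on_read(name, self._blobs[name]))


class ReversingStorage(BaseStorage):
    """Child class overriding the hooks; reversal is a placeholder for encryption."""

    def transform_on_write(self, name, data):
        return data[::-1]

    def transform_on_read(self, name, data):
        return data[::-1]
```

With this shape the parent class stays unaware of encryption; a child class only overrides the two hooks.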
How about a Proxy-based design where you build a Vault-based storage that just hands off the files to S3 (or any of the storages)? My question would be how and where are you going to handle the decryption? For the most part, the storages return a url for the file to be accessed in the normal file handling flow. The file is not streamed from S3 thru the Django server. Would the URL actually be from the Vault server? As a side note, if you are not doing client-side encryption I am curious how this is better than what Amazon or Google provide. Google offers the ability to totally manage the keys. https://cloud.google.com/storage/docs/encryption/
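A rough sketch of what such a proxy could look like, in plain Python so it runs without Django or S3. All names here (`InMemoryStorage`, `EncryptingProxyStorage`, the example URL) are assumptions for illustration, and the `encrypt`/`decrypt` callables are placeholders for whatever would talk to Vault. The `url()` method shows one possible answer to the URL question: a direct bucket URL would serve ciphertext, so this sketch simply refuses to hand one out.

```python
import io


class InMemoryStorage:
    """Hypothetical inner storage standing in for S3 or any other backend."""

    def __init__(self):
        self.blobs = {}

    def save(self, name, content):
        self.blobs[name] = content.read()
        return name

    def open(self, name):
        return io.BytesIO(self.blobs[name])

    def url(self, name):
        return "https://bucket.example.invalid/" + name


class EncryptingProxyStorage:
    """Wraps another storage and transforms bytes on the way in and out."""

    def __init__(self, inner, encrypt, decrypt):
        self.inner = inner
        self.encrypt = encrypt  # callable: bytes -> bytes
        self.decrypt = decrypt  # callable: bytes -> bytes

    def save(self, name, content):
        # Encrypt before the inner storage ever sees the data.
        return self.inner.save(name, io.BytesIO(self.encrypt(content.read())))

    def open(self, name):
        # Decrypt after fetching the stored bytes from the inner storage.
        return io.BytesIO(self.decrypt(self.inner.open(name).read()))

    def url(self, name):
        # The inner URL points at ciphertext, so it is not useful to a browser.
        raise NotImplementedError("files must be streamed through the proxy")
```

The trade-off is the one raised above: reads have to be streamed through the application server instead of being served directly from the bucket.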
@jdufresne I was thinking about actually using Django's signals, but perhaps you're right and I should just submit a few patches and then we can discuss if they are feasible. @sww314 So, your approach is to set About your second question: both the encryption and decryption are done inside Hashicorp Vault. The Vault acts like a black box. You send it data and tell it the name of the key (a key you previously generated inside the Vault) you want to encrypt the data with, and the Vault encrypts it. The same goes for the other way around. About being better than AWS/Google's own solutions: it's not better or worse, it's different.
@alexandernst Yes, exactly. Are you still planning on using S3 (or similar)? Or are you going to store the files in your own EC2 instance? From the first description, I thought it was a proxy to encrypt/decrypt files as they enter/leave S3, but the comment about hosting your own made me wonder. Side note: You might not want to use it as the DEFAULT_FILE_STORAGE, because if you use
@sww314 I'm ok hosting my encrypted files in S3, as long as they are encrypted before entering S3. Good point about the
I just created a new PR with a skeleton implementation of my proposal. Can we move the discussion there? @sww314 @jdufresne
Sorry, I forgot to mention that the PR is #627
I'm currently discussing with other devs at my company the use of Hashicorp Vault to protect the files we upload to S3. The way Vault works makes it perfect for our needs, and probably for anybody else looking for reliable, secure file storage.
The problem is that there is no simple way to implement it on top of django-storages without patching the particular classes in storages.backends that we use (S3 in our case).
We'd like to discuss and contribute a series of patches that would let django-storages provide some sort of callback mechanism to "intercept" files that are about to be written or read and modify them (that is, encrypt/decrypt them). Think of it as the pre_* and post_* signals in Django.
Background info:
So, what is that Hashicorp thing and why is it any different from S3?
The main difference is that S3's encryption protects the files only from cold attacks (aka: somebody going inside AWS facilities and physically getting away with your disk). With Vault, on the other hand, files are encrypted before they are uploaded to S3.
Ok, so this is just like any other encryption that can be used with Django, right?
No, the encryption (and decryption) is done inside Vault, instead of being handled by Python/Django, which means that the decryption key can't be stolen and that the attacker must have constant access to the Vault in order to decrypt all files.
How does that Vault work?
First you create a secret key using the Transit engine, then you use the Vault as a SaaS, meaning, you call 2 API endpoints: one to encrypt data and the other to decrypt data.
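As a sketch of those two calls: the Transit HTTP API takes a write request to `transit/encrypt/<key>` with base64-encoded plaintext and to `transit/decrypt/<key>` with the ciphertext string, authenticated via the `X-Vault-Token` header. The code below assumes the engine is mounted at the default `transit/` path; the function names, the address and the key name `files` are made up for illustration. It only builds the requests, so it can be read and tested without a running Vault; `call_vault` is the one function that would actually hit the network.

```python
import base64
import json
import urllib.request


def build_transit_request(vault_addr, token, action, key_name, payload):
    """Build (but do not send) a Transit request; action is 'encrypt' or 'decrypt'."""
    # Assumption: the Transit engine is mounted at the default "transit" path.
    url = f"{vault_addr}/v1/transit/{action}/{key_name}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-Vault-Token": token, "Content-Type": "application/json"},
        method="POST",
    )


def encrypt_payload(plaintext: bytes) -> dict:
    # Transit expects the plaintext to be base64-encoded.
    return {"plaintext": base64.b64encode(plaintext).decode("ascii")}


def decrypt_payload(ciphertext: str) -> dict:
    # The ciphertext is the opaque string returned by the encrypt call.
    return {"ciphertext": ciphertext}


def plaintext_from_response(body: dict) -> bytes:
    # The decrypt response carries base64-encoded plaintext under data.plaintext.
    return base64.b64decode(body["data"]["plaintext"])


def call_vault(request: urllib.request.Request) -> dict:
    """Send a prepared request and parse the JSON reply (needs a reachable Vault)."""
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read())
```

Note that the whole file travels in the request body, so very large uploads would need extra thought (for example chunking), which this sketch does not attempt.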
So how would all that integrate inside django-storages?
The idea is to provide some sort of mechanism that would allow every storage implementation (Azure, GCP, S3, etc.) to call an optional callback/signal when files are about to be written or read, and let that callback handle the encryption/decryption.
That callback/signal would then make a GET/POST request to Vault and make sure the files that are being processed are safely encrypted/decrypted.
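To give an idea of the shape, here is a minimal sketch of such a callback mechanism in plain Python. `FileHook`, `pre_write`, `post_read` and `HookedStorage` are hypothetical names, not existing django-storages or Django APIs. Unlike Django's real signals, each receiver here returns the (possibly modified) bytes so that receivers can be chained; the two example receivers just add and strip a marker where the real ones would call Vault.

```python
class FileHook:
    """Tiny signal-like registry (hypothetical; not Django's Signal class)."""

    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        self._receivers.append(receiver)
        return receiver  # returning it lets connect() be used as a decorator

    def send(self, name, data):
        # Each receiver gets the output of the previous one.
        for receiver in self._receivers:
            data = receiver(name, data)
        return data


pre_write = FileHook()
post_read = FileHook()


class HookedStorage:
    """Stand-in backend that fires the hooks around an in-memory store."""

    def __init__(self):
        self._blobs = {}

    def save(self, name, data):
        self._blobs[name] = pre_write.send(name, data)

    def read(self, name):
        return post_read.send(name, self._blobs[name])


@pre_write.connect
def fake_encrypt(name, data):
    # Placeholder for the call that would send data to Vault for encryption.
    return b"enc:" + data


@post_read.connect
def fake_decrypt(name, data):
    # Placeholder for the call that would ask Vault to decrypt.
    return data[len(b"enc:"):]
```

With no receivers connected the hooks are no-ops, so backends that don't need encryption would behave exactly as before.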
I hope that this makes sense, but if it doesn't, I'll happily answer any questions that you might have. @jschneier @jdufresne