Add a blog post about privacy preserving ML, with sklearn, federated … #173

bcm-at-zama · 2023-11-23T14:44:33Z

…learning and fully homomorphic encryption.

This is a blog post that we have discussed with Francois Goupil (@francoisgoupil I imagine), about privacy, and how scikit-learn can be used in a privacy-preserving way with federated learning and fully homomorphic encryption. More precisely, we had agreed on the abstract with Francois, and here is the full version.

Author of the blog is https://github.com/andrei-stoian-zama @andrei-stoian-zama

PS: by the way, I love what you're doing at scikit-learn. It's so easy to use (and still, powerful), your APIs are really well done. Cheers to the team!

welcome · 2023-11-23T14:44:36Z

💖 Thanks for opening this pull request! 💖
scikit-learn community really appreciates your time and effort to contribute to the project.
Please make sure you have read our Contributing Guidelines and filled in our pull request template to the best of your ability.

glemaitre · 2023-12-01T22:58:27Z

FYI: this is on my TODO list to review the blog post. I might be a bit busy next week but I'll do my best.

bcm-at-zama · 2024-01-02T12:35:19Z

Happy new year everyone! Do you know when this blog may be merged, please? Thank you

glemaitre · 2024-01-15T16:47:23Z

So I had a read of the post yesterday. I think that in general this is looking good. We need to figure a couple of things out before the merge:

- I would like the blog post to be close to a notebook and thus show the output. I have to try to build the website locally to be sure of the rendering and check that we can put in some static output that looks OK.
- I think that we should have, from the start, something that allows reproducing the experiment via a requirements or environment file. I want to find more time to be able to repeat the experiment. I saw that you already link to a folder in a GitHub repository, but I feel the content in the repository has diverged a bit too much. I am wondering if we could centralize it in some sort of single notebook. I don't think that anything needs to be done right now. I first want to play with the code to get a better grasp of what one would expect when reading and trying to reproduce.
- While reading, I thought that the part playing with the Flower server was missing a bit of content to give context to somebody who has never heard about federated learning. Somehow, I would like us to make it explicit within the example that the person developing the model should not see the data. I don't have a good proposal yet.
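To make the last point concrete for readers new to federated learning, here is a minimal sketch in plain Python. It is not the blog post's code and does not use Flower: the data, the 1-D linear model, and the names `local_update` and `federated_average` are all hypothetical. What it illustrates is the separation of roles: each client trains on its own data, and the server only ever receives model weights and sample counts.

```python
"""Toy federated averaging: the server never sees any client's data."""


def local_update(weights, data, lr=0.01, epochs=20):
    """Client side: a few gradient steps of 1-D linear regression on private data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    # Only the updated weights and the sample count leave the client.
    return (w, b), n


def federated_average(updates):
    """Server side: average the client weights, weighted by local sample counts."""
    total = sum(n for _, n in updates)
    w = sum(wts[0] * n for wts, n in updates) / total
    b = sum(wts[1] * n for wts, n in updates) / total
    return (w, b)


# Hypothetical private datasets (points on y = 2x + 1), one list per client.
client_data = [
    [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)],
    [(3.0, 7.0), (4.0, 9.0)],
]

global_weights = (0.0, 0.0)
for _ in range(200):  # communication rounds
    updates = [local_update(global_weights, data) for data in client_data]
    global_weights = federated_average(updates)
```

In a real deployment the `local_update` calls would run on separate machines, and a framework such as Flower would handle the communication; the averaging step is the only part that sits with the model developer.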

@bcm-at-zama So this is just to let you know that the PR is not dead but we have been a bit busy with the end of the year vacation and the ongoing scikit-learn release.

bcm-at-zama · 2024-01-15T17:17:23Z

Thanks @glemaitre , great to hear. So, if you want some modifications in our blog post (including, us trying to package a bit differently, like with a single notebook), you'll tell @andrei-stoian-zama, who is the real author of the content. Anyway, thanks a lot guys!

bcm-at-zama · 2024-02-05T10:53:50Z

Hello. It would be awesome if we could merge this blog post sooner rather than later. We are about to present this work at the Flower event, https://flower.dev/conf/flower-summit-2024/, for example, and so we're certainly going to do some (social network) promotion beforehand. If we have the blog, we can link to it; else, we'll go for the notebook on our repo, but it's much less readable for newcomers.

glemaitre · 2024-02-05T11:25:12Z

> If we have the blog, we can link to it; else, we'll go for the notebook on our repo, but it's much less readable for newcomers.

I understand your point. Unfortunately, we currently don't have too much bandwidth with other priorities (such as the upcoming 1.4.1 release). I personally don't want to merge something just for the sake of merging it.

adrinjalali

A few thoughts on this post:

- We should first talk about what FHE is, a tiny bit, not deep, but better than

  > FHE is a technology that enables application providers to build cloud-based applications that preserve user privacy

  FHE and the cloud don't have much to do with one another. You can use FHE in the cloud, but you could also use it in a local private network between two stakeholders.
- We shouldn't link to Zama's example, that example with the should be here. This is not a sponsored post, and we don't do sponsored posts. Links to the libraries and explanation of what's happening is okay and needed of course.
- I would like a better explanation of what .compile does. I don't mind the link to the repo since it's BSD, but need more context here.
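As a rough illustration of the kind of short explanation asked for in the first point, the sketch below shows what "computing on encrypted data" means, independent of any cloud setting. It uses a toy Paillier scheme, which is only additively homomorphic and is not FHE (FHE additionally supports multiplying ciphertexts together), and the tiny primes make it completely insecure. The feature values, weights, and helper names are hypothetical; this is not how Concrete ML works internally.

```python
import math
import random

# Toy Paillier keypair built from tiny primes: insecure, illustration only.
p, q = 1009, 1013
n = p * q
n_sq = n * n
g = n + 1
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
mu = pow(lam, -1, n)  # modular inverse; this shortcut is valid because g = n + 1


def encrypt(m):
    """Encrypt an integer 0 <= m < n under the public key (n, g)."""
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(c):
    """Decrypt a ciphertext with the private key (lam, mu)."""
    return ((pow(c, lam, n_sq) - 1) // n * mu) % n


def add_encrypted(c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % n_sq


def mul_by_plain(c, k):
    """Raising a ciphertext to a plaintext power k multiplies the plaintext by k."""
    return pow(c, k, n_sq)


# Data owner: encrypts its features and sends only ciphertexts.
features = [4, 7]
enc_features = [encrypt(x) for x in features]

# Model owner: computes a linear score without ever decrypting the features.
weights = [3, 5]
enc_score = encrypt(0)
for c, w in zip(enc_features, weights):
    enc_score = add_encrypted(enc_score, mul_by_plain(c, w))

# Data owner: decrypts the result with the private key.
score = decrypt(enc_score)
```

The two parties here could equally be two machines on a private network, which matches the remark that FHE is not tied to the cloud.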

_posts/2023-11-22-privacy-preserving-scikit-learn.md

koaning · 2024-02-05T12:59:35Z

_posts/2023-11-22-privacy-preserving-scikit-learn.md

+from sklearn.datasets import fetch_openml
+from sklearn.model_selection import train_test_split
+
+mnist_dataset = fetch_openml("mnist_784")


Is there a reason to use MNIST for this demo? It feels like a very general dataset and I'm wondering if there's a better dataset for the point that you're making. Maybe something where it is clear that privacy could be at stake?

this one is for the author, @andrei-stoian-zama

We moved from MNIST that you didn't like to Breast Cancer, which is indeed a better dataset for PPML

Fine with you to resolve this conversation?

koaning · 2024-02-05T13:00:34Z

_posts/2023-11-22-privacy-preserving-scikit-learn.md

+
+model = LogisticRegression(penalty="l2")
+model.fit(X=x_train, y=y_train)
+model.compile(x_train)


I'm curious about the internals of this .compile step? Could you share a diagram with a brief explainer of what might go wrong if one doesn't run that line?

ok we'll explain: if one does not compile, the model stays in the clear, no FHE :)

Is it sufficient now (with in particular an image)? Else, we can link to https://docs.zama.ai/concrete-ml/advanced-topics/compilation#compilation-to-fhe, but it's more complicated and maybe a bit out of scope for this blog: your choice, you tell me
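For readers following this thread, one ingredient that typically makes a compile step need representative data is quantization: assuming, as is common for FHE toolchains, that encrypted computation works on integers, the tool has to pick integer ranges from sample inputs. The sketch below is a conceptual illustration in plain Python only. It is not Concrete ML's implementation, and `calibrate`, `quantize`, the 8-bit setting, and the data are hypothetical.

```python
def calibrate(samples, n_bits=8):
    """Derive a uniform quantization scale and offset from representative inputs."""
    flat = [v for row in samples for v in row]
    lo, hi = min(flat), max(flat)
    scale = (hi - lo) / (2 ** n_bits - 1)
    return scale, lo


def quantize(row, scale, lo):
    """Map floats to unsigned integers using the calibrated range."""
    return [round((v - lo) / scale) for v in row]


# Hypothetical calibration data, playing the role of the x_train passed to compile.
calibration_set = [[0.0, 10.0], [2.5, 7.5], [5.0, 2.0]]
scale, lo = calibrate(calibration_set)

# Hypothetical trained weights, quantized symmetrically to signed 8-bit integers.
weights = [0.5, -0.25]
w_scale = max(abs(w) for w in weights) / 127
q_weights = [round(w / w_scale) for w in weights]

# Inference on a new sample: the accumulation uses integers only, which is
# the part that could run on encrypted values.
x = [4.0, 6.0]
q_x = quantize(x, scale, lo)
int_acc = sum(qw * qx for qw, qx in zip(q_weights, q_x))

# Map the integer result back to a float score and compare with the exact one.
approx_score = w_scale * (scale * int_acc + lo * sum(q_weights))
exact_score = sum(w * v for w, v in zip(weights, x))
```

The small gap between `approx_score` and `exact_score` is the quantization error; without a calibration step there would be no principled way to choose `scale` and `lo`.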

bcm-at-zama · 2024-02-05T13:05:45Z

Thanks for the reviews / comments, we'll soon have a look.

For this one in particular, @adrinjalali:

> We shouldn't link to Zama's example, that example with the should be here.

Do you mean we need to add a copy (of the notebook currently on Concrete ML repo) in scikit-learn repository? if yes, yes sure, could you tell us where we need to add the notebook, exactly?

bcm-at-zama · 2024-02-05T13:09:13Z

> > If we have the blog, we can link to it; else, we'll go for the notebook on our repo, but it's much less readable for newcomers.
>
> I understand your point. Unfortunately, we currently don't have too much bandwidth with other priorities (such as the upcoming 1.4.1 release). I personally don't want to merge something just for the sake of merging it.

And I understand your point. It's your repo and your blog, at the end. We had discussed the interest of this blog for both Concrete ML and scikit-learn (and Flower, by the way), a long time ago with @francoisgoupil, but of course, I understand you have other priorities as well

adrinjalali · 2024-02-05T13:44:42Z

> Do you mean we need to add a copy (of the notebook currently on Concrete ML repo) in scikit-learn repository? if yes, yes sure, could you tell us where we need to add the notebook, exactly?

Ideally the blogpost is self containing in terms of the content. So the code can be added here.

bcm-at-zama · 2024-02-05T14:12:23Z

> > Do you mean we need to add a copy (of the notebook currently on Concrete ML repo) in scikit-learn repository? if yes, yes sure, could you tell us where we need to add the notebook, exactly?
>
> Ideally the blogpost is self containing in terms of the content. So the code can be added here.

I see. We'll see what we can do, but I'm afraid it might be a lot of lines. Me, as a reader, I like to have ready to go code to download, when I read a blog, instead of having to concatenate all the code blocks of the blog (+ having to fix what was actually missing)

bcm-at-zama · 2024-02-06T16:59:38Z

A new version has been pushed, with changes that hopefully address the comments. (I'm going to reply to the individual comments.)

bcm-at-zama · 2024-02-06T17:10:58Z

Now:

- the blog is more self-contained, and in particular there is no link to our .ipynb; however, we still link to https://github.com/zama-ai/concrete-ml/tree/main/use_case_examples/federated_learning: do we have to remove it?
- some explanation (and a diagram) about .compile: sufficient for you? or, we can link to https://docs.zama.ai/concrete-ml/advanced-topics/compilation#compilation-to-fhe if you want
- we moved from MNIST, which you didn't like, to Breast Cancer, which is indeed more related to privacy aspects
- tell us if you want us to elaborate even more on FHE, or to link to some content (it does not have to be Zama's, if you prefer)

bcm-at-zama · 2024-02-06T17:12:58Z

@adrinjalali, I hope it's better for you now. If not and you prefer us to make a call about that, it's doable, I'm in France

bcm-at-zama · 2024-03-13T18:06:08Z

Hello, we are presenting this work tomorrow at the Flower event, https://flower.ai/conf/flower-ai-summit-2024/.

Do you have any ETA for merging this blog post? I understand you have other priorities, but it has really been a long time since we submitted


francoisgoupil requested review from ogrisel and glemaitre November 28, 2023 22:10

Add a blog post about privacy preserving ML, with sklearn, federated …

c52e299

…learning and fully homomorphic encryption.

bcm-at-zama force-pushed the main branch from 8ff00c3 to c52e299 Compare December 1, 2023 13:15

adrinjalali requested changes Feb 5, 2024


koaning reviewed Feb 5, 2024


koaning reviewed Feb 5, 2024


Update the blog post with reviews.

75f87b7

bcm-at-zama requested a review from adrinjalali February 6, 2024 17:13

bcm-at-zama requested a review from koaning March 13, 2024 18:17
