Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restore bundled protoc binary #725

Closed
hdevalence opened this issue Sep 24, 2022 · 13 comments
Closed

Restore bundled protoc binary #725

hdevalence opened this issue Sep 24, 2022 · 13 comments

Comments

@hdevalence
Copy link

Previously, prost-build "just worked", by including a bundled protoc binary.

Now, it does not, because #657 removed the bundled binary, ostensibly to improve the crate's security posture. However, this change degrades the usability of the crate for a rather unclear security benefit: either users have to compile protoc from source, which requires them to pull even more build tools and source code (generally tools that are worse than cargo), or they need to fetch a precompiled protoc, in which case they're in exactly the same position as they were before, but now the work of determining which protoc binary to use is spread across the entire ecosystem, rather than being in one place (the tonic crate) where it can be done correctly. So an attacker now has many more opportunities to compromise security, because every part of the ecosystem will have their own bespoke way of sourcing protoc.

It would be better to restore the previous usability and security story, rather than forcing downstream users to figure out how to safely get a copy of protoc on their own.

@nc0fr
Copy link

nc0fr commented Sep 24, 2022

I don't think so.

Library users should source the Protocol Buffers compiler on their own since it provides more security, which is far more critical than the "ease of use" you are talking about.
In a lot of organizations, tools are distributed internally and it is expected for builds to use the local version of the compiler (already included in a custom VM image, distribution policy, and internal download proxy...).

Having the library fetching automatically the compiler also creates non-hermetic environments and, by default, breaks *semi-*offline builds (CI/CD pipelines are critical and should avoid downloading a lot of stuff directly).

In addition, attackers have fewer opportunities to compromise the integrity of the downloaded tool due to them being required to compromise the internal policy of an organization (instead of simply having to compromise the download URI).

I can also add that the new behavior is far more idiomatic since developers expect their builds to fail when they are missing the required dependencies. It is possible to forget to configure prost_build not to download the Protocol Buffers compiler which should result in a fail (missing dependency) instead of magically working.

We could argue that transitive dependencies should be distributed alongside the library (which is the role of Cargo/any dependency management system), however, protoc (or any Protocol Buffers compiler) is not a transitive dependency but a direct one!
An example of this behavior can be Cargo. To build your Rust application with Cargo, you need a Rust compiler (probably rustc) and, therefore, you are required to source one on your own (even if utilities like RustUp might vendor everything for you).

In conclusion, I do think the recent changes are great steps in the modularization of prost and provide major security fixes for everyone, and they should not be reverted.

@adamchalmers
Copy link
Contributor

What if there was a helper crate you could depend on, which packages a protobuf compiler? Essentially letting users opt-in to the old behaviour, and it was a separate crate for ease of maintenance.

@LucioFranco
Copy link
Member

I agree with @n1c00o here.

People are open to publishing their own crate which bundles the protoc binaries but I don't want that to be a responsibility of the prost project. Its pretty easy to install protoc on your system or compile it from source.

Going to close this issue, feel free to re-open if you have any more questions/concerns.

@LucioFranco
Copy link
Member

There are options that don't require you to source prost from the PATH, there are vendored crates available and there is even protox which is a pure rust implementation (although not fully complete).

I do agree that the developer experience did take a hit though I disagree with that its actually that bad. I've had to on many occasions had to install stuff to get crates to build, its quite normal. Openssl does this for example (you can include vendored but then you also have more deps).

I also don't buy the "security" concerns, they read more like "policy" corp-speak to me. I welcome being convinced otherwise, but you'll have to convince me that the attack surface area is more substantial than devs having to manually install a little-used binary on production machines, almost certainly without the SHA checks Prost could perform.

When it comes to security I don't think there is even any wiggle room here, being 100% secure in a project is more important than a dev having to do an extra step. Taking a hard stance is extremely important to the health of the ecosystem.

That said, ideally, down the line protox could become a part of prost and you can just do cargo add prost and write a prost_build::compile() that doesn't require anything from the system. At this point, I don't have the time to achieve this right now and its not high on my priority list.

@AThilenius
Copy link

Yep. Finding crates that bundle protoc for you is why I deleted my comment (that you quoted). I still chose to use Rust Protobuf though, their design goals more closely align with devs.

Taking a "hard stance" in security doesn't just create friction, it's less secure. It's well documented that if you force people to change their passwords constantly security gets worse not better, because people get frustrated and sticky-note their password to their computer screen; Rubber Hose Cryptography is a time-tested phenomenon. In this case people are going to copy-paste the first command they find from Google on 'how to install proto' into a shell and run it as sudo. I also still don't see an argument that isn't hand-wavy on how exactly this helps security. All it does is kick the can down the road.

If you just don't want to support downloading protoc that's most certainly within all the maintainer's rights. Blaming it on "security" is a cop out though.

@hdevalence
Copy link
Author

To circle back on this 6mo later, this change was painful for us, but we dealt with that pain by switching to checking in the generated Rust code. While checking in generated code is aesthetically distasteful, in practice it has been much, much more manageable, and we completely avoid all of these problems.

In retrospect we should have done that from the start, and I think a better solution to this rough edge would be to have better documentation about workflows for managing generated Rust code.

@AThilenius
Copy link

checking in the generated Rust code

This is certainly an option, but it negates a lot of the benefits you get from using an IDL in the first place. Protobuf loosely links your projects together into a mono-repo, where the interface is defined in .proto files. If I change an RPC definition, I very much want all the builds that rely on the typing of that RPC to immediately fail, instead of failing at runtime. IMO that's the main benefit of protobuf as an ecosystem, compile time errors instead of runtime crashes. The performance and forward compatibility protobuf offers, IMO, are of much less use to anyone with less than billions of users like Google. I use sqlx for the same reason, a DB schema change will immediately result in failing builds, instead of crashing (or worse) at runtime because a column was the wrong type/missing.

As an aside, for a sane approach to how .proto files and code-gen are managed, checkout https://buf.build/ I think they hit the nail on the head.

@LucioFranco
Copy link
Member

I also still don't see an argument that isn't hand-wavy on how exactly this helps security.

As far as I understand, containing unverified binaries that get stored on something like cratesio and get shipped to every computer is a risk. Its better to build your software from source when you can anyways, it builds more confidence and trust. I could probably find you some security papers on this but I don't have the time.

I still chose to use Rust Protobuf though, their design goals more closely align with devs.

Good I am glad you found something that works for you 😄, prost isn't for everyone.

In retrospect we should have done that from the start, and I think a better solution to this rough edge would be to have better documentation about workflows for managing generated Rust code.

Yes, this was a failure by me running out of time (burnout) and not properly documenting things. I would love to have some help here on telling people what (I think) is the right way to do code gen in rust.

@MJohnson459
Copy link

I'm in the process of updating from 0.9 to 0.12 after putting it off due to this change which has been fairly painful and I am wondering if anyone has any guidance on the best way to do this.

It seems like the reason for removal of protoc can be summarised as a trade-off between increasing the maximum security available at the cost of removing the minimum security. That is, by forcing users to install protoc themselves, they are responsible for whatever they choose which may be more secure, but is also possible to be less secure.

Rather than go into the merits of this trade-off, it seems the key missing information is how to increase the likelihood of a user choosing a secure protoc source vs one that isn't? The original PR mentions "With this change as well the new recommendation for libraries is to commit their generated protobuf." but the readme still recommends using prost-build. Should we be checking in the generated code and if so what are the instructions to do that? If we should still use prost-build, how can we use e.g. protox instead to avoid a system dependency?

Finally, I will also say that a big part of security comes down to trust and by removing the protoc binary for security reasons sends an odd signal. After all we are already trusting the devs of the crate for the rest of the code yet this says "don't trust us to include a secure protoc binary". Saying that including the build system for protoc is a pain to maintain (which seems to have been the original motivation) is much more salving and understandable.

@nc0fr
Copy link

nc0fr commented Nov 15, 2023

You are not supposed to blindly trust crate authors tho, you always need to verify the code you are importing. The old behavior was making it hard because it relied heavily on the binary being directly provided, which is hard to audit and not a deterministic process (thus breaking a lot of minimal security requirements in general).

@nc0fr
Copy link

nc0fr commented Nov 15, 2023

Also you can still use prost-build and not commit the generated code, in the end it all depends on your guidelines and needs (e.g. LSP integration is easier with the artefact existing in the repository).

@AThilenius
Copy link

Woops, had to delete recent post, got myself confused on which crate was what. Sorry about that.

@MJohnson459 You can use https://crates.io/crates/protoc-fetcher I've been using that and it works well.

@miaomiao1992
Copy link

version of 0.8 is very nice! i just tried minutes ago.
i felt regretful for that higher version only works with protoc compiler.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants