Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

getrandom, security and fallback entropy #760

Closed
dhardy opened this issue Mar 23, 2019 · 20 comments
Closed

getrandom, security and fallback entropy #760

dhardy opened this issue Mar 23, 2019 · 20 comments

Comments

@dhardy
Copy link
Member

dhardy commented Mar 23, 2019

Our getrandom crate is close to initial release, which means we can almost start building on it. This means:

  • rand_os::OsRng will become a thin wrapper around getrandom
  • either we have to map getrandom::Error values to rand_core::Error values, or replace the latter completely (suggested in Tracker: Rand 0.7 #715)
  • the rand_os crate will become trivially small

In #715 we have talked about other possible changes:

  • the getrandom crate should handle other sources on embedded platforms
  • we no longer appear to need a fallback in case of OS failure
  • therefore EntropyRng is no longer needed
  • we could replace rand::FromEntropy::from_entropy with rand_core::SeedableRng::from_entropy

Here I would like to add a caveat: we have no proof that a fallback entropy source is not needed. @tarcieri has pointed out in #699 that we cannot be confident of the security of JitterRng, however there is also resistance to merely switching to RDRAND (partly since it is not available on all platforms, and in part based on the belief that we shouldn't need any fallback).

What we do not want to do is remove EntropyRng only to find that we need to add it back again, meanwhile there is strong resistance to the idea of including a fallback within getrandom. This means that for now we should either keep EntropyRng (even if it is merely a wrapper) or plan to put any needed fallback within OsRng.

Additionally, it seems clear that we don't want to move the JitterRng code into rand_core, and we cannot make rand_core depend on rand_jitter circularly. The same applies to the rdrand crate. Thus the only option for including a fallback within rand_core would be a direct RDRAND implementation, which we probably also don't want.

Like it or not, this means that we cannot move OsRng or from_entropy into rand_core unless we can prove that we don't need a fallback.

Now, if we want to pursue a no-fallback entropy source, we could try removing all fallbacks for Rand 0.7 and watch for error reports. However, I'm not sure whether we want to do so (personally I agree with @cpeterso that Rand should "just work").


This means we have to look at other options for now:

  1. Keep rand_os with just OsRng even though this is essentially an empty crate
  2. Move EntropyRng to rand_os (@burdges suggestion)
  3. Drop rand_os and go back to rand::rngs::OsRng

Proposal: 2: move EntropyRng and FromEntropy to rand_os for now, keeping re-exports in rand, add rdrand as a fallback option, and put both jitter and rdrand behind feature flags (disabled by default).

@newpavlov
Copy link
Member

unless we can prove that we don't need a fallback.

I think it's the other way around: it should be proven that fallback is needed. And even if such fallback is needed I believe it should be handled on per-target basis, i.e. personally I am not against fallbacks as part of getrandom, but I am against using fallbacks for all targets, especially for those which already "just work". So addition of fallback can be done in future in a backwards compatible way without return of EntropyRng.

As for rand_os I would prefer to keep it a thin wrapper around getrandom and introduce a separate crate for ThreadRng, which will depend directly on getrandom. BTW maybe it's worth to consider renaming ThreadRng to DefaultRng or something similar?

@dhardy
Copy link
Member Author

dhardy commented Mar 23, 2019

We might be able to add a target-specific fallback to getrandom using RDRAND, but it sounds like several users would be quite unhappy about inclusion of JitterRng, thus your suggestion is limited.

Therefore I think we should stick to my suggestion and not remove EntropyRng for now (though this could potentially happen after proving we don't need it). The reason we must prove we don't need it is because we already have it, and removing then reintroducing it would be totally unnecessary breakage.

ThreadRng will definitely not be included in rand_os either way; it includes far more code (a whole CSPRNG plus reseeding logic). And no, lets not call it DefaultRng (far less descriptive, unclear, and unnecessary breakage).

@newpavlov
Copy link
Member

I am not against leaving EntropyRng in rand, but I think it should be marked as deprecated.

As for ThreadRng, it can use getrandom directly (so it will not depend on EntropyRng) and in case of failure simply continue as usual without reseeding (though it should panic if error happens on initial seeding), so it already should solve problem of "just works" for most users. We should use high-quality CSPRNG for ThreadRng either way, so security-wise it shouldn't be a big issue.

Regarding potential renaming of ThreadRng, the breakage problem can be mostly counteracted with a simple type alias. I think that name should indicate that this RNG is a "default choice" for most users, while the current name has a more focus on how it's implemented. And I think DefaultRng will be more descriptive and clear for new users compared to ThreadRng, while implementation details can be covered in documentation.

@dhardy
Copy link
Member Author

dhardy commented Mar 24, 2019

No, you're not understanding me: we shouldn't be telling users to stop using EntropyRng if there's a chance we might need it again.

For the most part, getrandom will either work or it won't. If it doesn't work, ThreadRng cannot be initialised (we never want insecure random numbers). ThreadRng really doesn't have much to do with this discussion. Renaming it is way off topic.

@dhardy
Copy link
Member Author

dhardy commented Mar 24, 2019

@cpeterso and @tarcieri, thanks for adding a 👍 to the discussion, but it doesn't help a lot. I am not afraid of bowing to peer pressure, but rational argument and a clear plan is needed.

If the plan is to remove EntropyRng and, only if necessary, include a fallback source within getrandom itself, then I think this is acceptable, though of course it means we cannot use CPU jitter as a fallback (#699), which leaves only RDRAND (where available).

Note also that there has not been support for the idea of using secure and insecure "channels" for ThreadRng which means there is a chance the crate will simply fail on some platforms (secure by default, no alternative). If hash-map randomisation uses the getrandom lib, it may just use a constant key when getrandom fails (which is roughly what happens now on a couple of platforms).

@cpeterso
Copy link
Contributor

I have not been following your RNG work closely, so I was reluctant to weigh in. :)

My opinion is that:

  • The getrandom crate should provide a thin, fallible API (i.e. no fallback) to the OS RNG.
  • The rand crate should provide higher-level RNGs that "just work" (i.e. have an internal fallback) and utilities for generating specific random types or distributions.

If someone is concerned about an RNG's internal fallback behavior, then they can use the getrandom crate directly and handle its errors themselves. Or the rand crate could also provide an infallible API that allows the caller to specify which fallback RNG to use. (Of course, that assumes the fallback RNG is itself infallible and doesn't need a secondary fallback!)

I think hash-map randomization should be infallible by default, but perhaps include a fallible option for users that care.

@dhardy
Copy link
Member Author

dhardy commented Mar 25, 2019

Of course, that assumes the fallback RNG is itself infallible

There are no secure, universally available infallible options, that much is for sure. The only way of making hash-map randomisation truly infallible is to allow use of a constant key — though arguably it's still better to use a low-entropy key (system clock) and even better to use a high-entropy-but-low-trust key such as from CPU jitter, if secure options are unavailable.

Representing so many levels of entropy quality is the part we haven't worked out yet — most of the people here are of the opinion that thread_rng should require high quality entropy or fail, which doesn't correlate with your view that Rand should "just work".

I have suggested before that we reintroduce weak_rng, however that idea has little support. Thus, even when weaker sources of entropy are available, we appear to have no way of making use of them.

@dhardy dhardy changed the title getrandom interface, rand_core and fallback getrandom, security and fallback entropy Mar 29, 2019
@dhardy
Copy link
Member Author

dhardy commented Mar 29, 2019

This thread now seems to be mostly a discussion of entropy security and reliability, so lets make it that.

ThreadRng also gets brought up here since it is one of the more interesting usage points, so lets list out the possibilities (including stupid ones):

  1. Security is best-effort, falling back to insecure entropy everywhere — not acceptable
  2. Security is application-level configurable, with the option to allow insecure entropy — likely not acceptable due to presence of a big foot gun
  3. No insecure fallbacks in Rand libs — possibly acceptable; may require ugly hacks in widely deployed apps which don't require this level of security
  4. Multiple "security" levels provided by Rand

WIth options 1-2 discounted we should explore the other two options further. There are several potential uses to consider:

  1. Cryptography and betting games — generally this requires high-quality entropy or fails
  2. Best-effort cryptography — sometimes accepting "reasonable" security is better than failing if delivery is more important than security; this must be an application-level choice
  3. Randomisation to avoid collisions (e.g. hash maps) — secure randomness is useful, but often less important than service functionality; note that security is more important on servers than end-user devices though also that servers are less likely to have issues supplying high-quality entropy
  4. Casual uses — security unimportant but at least avoiding completely repetitive behaviour is useful

I think we can dismiss 2 as a concern — it appears to involve degrading the security of the entire device when necessary to fit a custom usage. If this is deployed on an OS, that OS can be configured to achieve this goal; on embedded devices we already plan to allow custom entropy sources. It does however mean that attempting to periodically improve security (e.g. what ReseedingRng does) may be useful.

To cover these use-cases, possibly two kinds of entropy source are sufficient: secure only, best effort.

Then there's thread_rng, which from a quick search seems to be used by a bunch of different things (many games, randomised algorithms, unit tests, generating passwords, even authentication keys). Clearly security may be important.

It should be noted that hash-map randomisation, probably the thing most needing a fallback, does not use thread_rng.


I think this boils down to three questions:

  1. Do we want secure entropy fallbacks to be used by thread_rng? I agree that these do not need to be added except where we have evidence for their necessity, and think the best place for such fallbacks is within the getrandom crate (or possibly a low-level wrapper like getentropy).
  2. How should "best effort only" entropy collection (e.g. as used by hash-map randomisation) work? The simplest approach would be to use getrandom and ignore failures. Is this good enough for general usage, including on servers? Do we want a derived crate for best_effort_entropy?
  3. Is it useful to have a "best effort security" equivalent of thread_rng? Currently I see no demand for this, though it's possible we might want to add this in the future.

In fact, regardless of these questions, we don't need EntropyRng, and can move OsRng into rand_core.

@newpavlov
Copy link
Member

newpavlov commented Mar 29, 2019

Regarding the need for fallbacks: I would like to highlight that Rust std does not use any fallbacks during HashMap initialization process and simply panics if entropy retrieval has failed: unix, windows. The only relevant issue reported is this one, which arises from incorrect kernel behavior. So I think in reality it's reasonable to assume that OS RNG always works and we do not have to go to great lengths as to design rand around fallbacks.

we <..> can move OsRng into rand_core

Why would we do that? Doing so will break "RNG abstraction traits" proposition of the crate. Optional dependency of rand_core on getrandom should be enough, no? And keeping a thin rand_os crate should not be a huge maintenance burden.

@vks
Copy link
Collaborator

vks commented Mar 29, 2019

I'm not convinced we need fallbacks. It seems like a very niche use case, given that even std does not have fallbacks for HashMap. Applications can always choose to initialize a StdRng themselves and to fall back to whatever works for them (rand_jitter, hardcoded value, rdrand, time...) if the OS randomness does not work. I don't think Rand has to provide that.

What do you think about releasing 0.7 without fallbacks? We could see how many issues occur, and add fallbacks if they are really required.

@dhardy
Copy link
Member Author

dhardy commented Mar 29, 2019

So I think in reality it's reasonable to assume that OS RNG always works

Sorry if it wasn't clear; this was actually my conclusion at the end of this morning's post. However, it makes sense to have a plan for handling them in case we do need them (especially since we are discussing the removal of the current mechanism for handling fallbacks).

Why would we do that? Doing so will break "RNG abstraction traits" proposition of the crate.

Doing so doesn't break anything, and if we are already making getrandom an optional dependency (for from_entropy) then I don't see any value to keeping the extra crate.

@newpavlov
Copy link
Member

newpavlov commented Mar 29, 2019

The idea behind rand_core is to define interfaces shared between RNGs. I don't see how OsRng fits here and frankly I think it will look clearly out of place. If you are so inclined to deprecate rand_os (though I do not understand why) a better place for OsRng will be rand, but certainly not rand_core.

@dhardy
Copy link
Member Author

dhardy commented Mar 29, 2019

rand_core already contains significantly more than just interfaces: the BlockRng code, helpers in the impls module, seed_from_u64 (which includes a complete PRNG), and now we're talking about adding from_entropy.

OsRng requires no more than a simple wrapper.

We should feature-gate this (disabled by default, but enabled automatically in Rand), thus it won't be extra code in libraries only using core functionality, but of course it would be included anyway by most Rand users.

@newpavlov
Copy link
Member

rand_core already contains significantly more than just interfaces

As you've said yourself some time ago including block stuff into rand_core may have been a mistake.

it would be included anyway by most Rand users.

Considering that ThreadRng and SeedableRng impls will pull entropy directly from getrandom I doubt OsRng will be used directly or indirectly by most rand users.

@dhardy
Copy link
Member Author

dhardy commented Mar 31, 2019

Interesting point, though it would require some redesign: ThreadRng is built on ReseedingRng which needs an RNG for entropy source (or at least, something that can be passed to from_rng). This would be unnecessary if the reseeding code was directly embedded in ThreadRng, but that probably doesn't make sense unless ReseedingRng is deprecated.

There are not many other users of ReseedingRng, but I did find two: servo, gotham (both roughly equivalent to ThreadRng but without thread-locality). We could still embed the functionality in ThreadRng, but ReseedingRng is a nice piece of functionality (including fork protection).

@dhardy
Copy link
Member Author

dhardy commented Apr 2, 2019

If OsRng is no more than a shim around getrandom, there are consequences:

  • we can make it a zero-sized type, constructible with just OsRng
  • from_entropy() isn't needed when you can just type from_rng(OsRng).unwrap()
  • there is no actual cost to having multiple implementations (other than that passing it into/out of functions may not work, but there's no point doing this anyway)
  • we can keep rand_os and have an implementation within Rand to avoid the crate import

This means we have to commit to OsRng remaining a zero-sized type, but since we are able to build good getrandom implementations for even wasm_bindgen (storing handles in TLS) I don't think this is a problem.

@dhardy
Copy link
Member Author

dhardy commented Apr 2, 2019

Note: EntropyRng is perhaps a better name than OsRng, but the latter is more widely used (search rand_os), so it's probably better not to rename.

@vks
Copy link
Collaborator

vks commented Apr 2, 2019

This is a nice way to get rid of FromEntropy. Note however that from_entropy will panic while getrandom will block if there is not enough entropy, so this is a slight change in behavior.

we can keep rand_os and have an implementation within Rand to avoid the crate import

I'm not sure we want that code duplication. On the other hand, it is very little code.

Note: EntropyRng is perhaps a better name than OsRng, but the latter is more widely used (search rand_os), so it's probably better not to rename.

I actually prefer OsRng, because it makes it clearer where the randomness is coming from. EntropyRng is a bit like RandomRng and does not say much.

@newpavlov
Copy link
Member

newpavlov commented Apr 2, 2019

we can keep rand_os and have an implementation within Rand to avoid the crate import

It may be somewhat confusing to have two different types withe the absolutely same name and implementation, but if you don't want rand_os in rand dependency tree, well, then it's not a bad solution. I doubt it will create any problems for users in practice.

@dhardy
Copy link
Member Author

dhardy commented Apr 10, 2019

Status:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants