Avoid surrogates when generating char using Standard distribution #519
Since the old version only rejected 0.2% of samples, that doesn't explain the performance increase. Simpler logic (no branching other than the subtraction, which might not even be implemented as a branch) may explain the improvement. Looks good anyway! @pitdicker?
Might be faster if you sample from 0 to |
Seems to give the same performance.
Good job! I think the Benchmark before:
With this PR:
With
Not sure why the different range is a bit slower. It has to do one addition less in the range code. On the other hand the check and compensation for characters in the gap and above ( Can you add a comment that this is investigated, and a range with |
Somewhat funny: on Reddit there is an unhappy discussion about the amount of unsafe code (and its questionable use) in actix-web, yet we start adding more unsafe code 😄.
Hmm. My take from a quick glance at that discussion is (a) that In this case the code is quite easy to understand, so not a big issue I think. |
I agree, and am all for it in this case. |
Sorry, didn't see the existing one. I've removed it.
Done
Thank you!
One way to reduce the concern about unsafe code would be to add a |
@@ -44,15 +44,21 @@ pub struct Alphanumeric;
impl Distribution<char> for Standard {
    #[inline]
    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> char {
        let range = Uniform::new(0u32, 0x11_0000);
Could we make this new(0u32, char::MAX as u32)? More explicit about what we're doing.
This line is being removed. But I don't think this is a good idea: you're off by 1 (the bound should be inclusive), and if char::MAX were ever to change, we wouldn't know now whether we should use the whole range (minus the existing gap). So it's better just to use local constants, as in the current implementation.
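To illustrate the off-by-one point, here is a minimal sketch. It assumes that `Uniform::new(low, high)` samples from the half-open range `low..high`, which is why the diff uses `0x11_0000` as the bound. The helper `half_open_range_covers_max` and the constant `UPPER_EXCLUSIVE` are hypothetical names made up for this example.

```rust
/// One past the largest code point; the exclusive upper bound used in the diff.
const UPPER_EXCLUSIVE: u32 = 0x11_0000;

/// Returns true if the half-open range `low..high` contains `char::MAX`.
/// Hypothetical helper, for illustration only.
fn half_open_range_covers_max(low: u32, high: u32) -> bool {
    (low..high).contains(&(char::MAX as u32))
}
```

With an exclusive upper bound of `char::MAX as u32`, the value `char::MAX` itself could never be produced, whereas `0x11_0000` covers it.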
I was thinking |
Debug assertions aren't run with |
Yeah. A |
I wonder if it would be possible with some ugly trick like transmuting to I guess adding |
This probably isn't performance critical but I thought it seemed wasteful to have a loop (that could theoretically go on forever) just to generate a char. The new version also seems faster when I benchmarked it.
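As a rough sketch of the loop-free idea (not the exact code from this PR): draw a single index from a range that is smaller than the full code point range by the size of the surrogate gap, then shift indices that fall at or beyond the gap. The names `GAP_SIZE`, `NUM_CHARS`, and `char_from_index` are assumptions for this example. The random draw is replaced by a plain index so the sketch has no dependencies, and the safe `char::from_u32` is used rather than an unchecked conversion.

```rust
/// Number of surrogate code points (0xD800..=0xDFFF), which are not valid `char`s.
const GAP_SIZE: u32 = 0xDFFF - 0xD800 + 1;

/// Number of valid `char` values (Unicode scalar values).
const NUM_CHARS: u32 = 0x11_0000 - GAP_SIZE;

/// Maps an index in `0..NUM_CHARS` to a distinct `char`, skipping the surrogate gap.
/// Hypothetical helper for illustration; not the implementation from this PR.
fn char_from_index(index: u32) -> char {
    assert!(index < NUM_CHARS, "index out of range");
    // Indices at or above the start of the gap are shifted past it.
    let n = if index >= 0xD800 { index + GAP_SIZE } else { index };
    // Safe conversion: returns None only for surrogates or values above char::MAX,
    // neither of which can occur after the shift above.
    char::from_u32(n).expect("shifted index is a valid scalar value")
}
```

Feeding this function an index sampled uniformly from `0..NUM_CHARS` gives every valid `char` the same probability with exactly one draw, so no rejection loop is needed.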