Avoid surrogates when generating char using Standard distribution #519

Merged
2 commits merged into rust-random:master on Jun 21, 2018

Conversation

Pazzaz (Contributor) commented Jun 20, 2018

This probably isn't performance critical, but it seemed wasteful to have a loop (one that could in theory run forever) just to generate a char. The new version also benchmarked faster for me:

test misc_gen_chars_old       ... bench:       1,031 ns/iter (+/- 177) = 496 MB/s
test misc_gen_chars_new       ... bench:         920 ns/iter (+/- 978) = 556 MB/s
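
For context, a minimal sketch of a loop-free, gap-skipping sampler along these lines, written against the rand 0.5-era Uniform API that this PR targets; the function name and exact structure are illustrative rather than the PR's literal diff:

    use rand::Rng;
    use rand::distributions::{Distribution, Uniform};

    // Valid `char`s are the code points [0, 0xD800) and [0xE000, 0x11_0000);
    // the surrogate block [0xD800, 0xE000) must never be produced.
    const GAP_SIZE: u32 = 0xE000 - 0xD800;

    fn sample_char<R: Rng + ?Sized>(rng: &mut R) -> char {
        // Sample from a range whose size equals the number of valid chars,
        // then shift anything at or below 0xDFFF down by GAP_SIZE so it
        // lands in [0, 0xD800). No rejection loop is needed.
        let range = Uniform::new(GAP_SIZE, 0x11_0000u32);
        let mut n = range.sample(rng);
        if n <= 0xDFFF {
            n -= GAP_SIZE;
        }
        // Every reachable `n` is now a valid Unicode scalar value.
        unsafe { std::char::from_u32_unchecked(n) }
    }

The shift makes the mapping a bijection from the sampled range onto the set of valid scalar values, so the result stays uniformly distributed over all chars.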

dhardy (Member) commented Jun 20, 2018

Since the old version only rejected 0.2% of samples, the removed rejection alone doesn't explain the performance increase. The simpler logic (no branching other than the conditional subtraction, which might not even be compiled to a branch) may explain the improvement.

Looks good anyway! @pitdicker?

@dhardy
Copy link
Member

dhardy commented Jun 20, 2018

It might be faster to sample from 0 to 0x11_0000 - GAP_SIZE instead (depending on the optimiser).
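
A sketch of that variant, under the same assumptions as the sketch above (GAP_SIZE and the function name are again illustrative):

    use rand::Rng;
    use rand::distributions::{Distribution, Uniform};

    const GAP_SIZE: u32 = 0xE000 - 0xD800;

    fn sample_char_alt<R: Rng + ?Sized>(rng: &mut R) -> char {
        // Sample below the gap-reduced upper bound, then shift anything at or
        // above the start of the surrogate block up over the gap.
        let range = Uniform::new(0u32, 0x11_0000 - GAP_SIZE);
        let mut n = range.sample(rng);
        if n >= 0xD800 {
            n += GAP_SIZE;
        }
        unsafe { std::char::from_u32_unchecked(n) }
    }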

Pazzaz (Contributor, author) commented Jun 20, 2018

Seems to give the same performance.

test misc_gen_chars       ... bench:         926 ns/iter (+/- 463) = 552 MB/s

pitdicker (Contributor) commented Jun 20, 2018

Good job!

I think the existing distr_standard_codepoint benchmark is a bit more reliable than the one you added, or at least easier to compare against the benchmarks that measure only the PRNG's performance and against the benchmarks of other distributions. Can you remove the new benchmark?

Benchmark before:

test distr_standard_codepoint    ... bench:       2,137 ns/iter (+/- 5) = 1871 MB/s

With this PR:

test distr_standard_codepoint    ... bench:       1,943 ns/iter (+/- 51) = 2058 MB/s

With Uniform::new(0, 0x11_0000 - GAP_SIZE) (and the corresponding changes):

test distr_standard_codepoint    ... bench:       2,020 ns/iter (+/- 16) = 1980 MB/s

I'm not sure why the different range is a bit slower. It needs one addition less in the range code; on the other hand, the check and compensation for code points in the gap and above (n >= 0xD800) is true much more often.

Can you add a comment noting that this was investigated, and that a range of (GAP_SIZE, 0x11_0000) seems faster?
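
For a rough sense of the branch frequencies being compared here (simple arithmetic over the 0x10F800 valid code points, not taken from the benchmarks):

    fn main() {
        let total = (0x11_0000u32 - 0x800) as f64; // 0x10F800 valid chars
        // Range (GAP_SIZE, 0x11_0000): the shift-down branch (n <= 0xDFFF)
        // covers the 0xD800 in-range values at or below 0xDFFF.
        println!("shift-down taken: {:.1}%", 0xD800 as f64 / total * 100.0); // ~5.0%
        // Range (0, 0x11_0000 - GAP_SIZE): the shift-up branch (n >= 0xD800)
        // covers the remaining 0x10_2000 values.
        println!("shift-up taken:   {:.1}%", 0x10_2000 as f64 / total * 100.0); // ~95.0%
    }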

pitdicker (Contributor) commented:

Somewhat funny: on Reddit there is an unhappy discussion about the amount of unsafe code (and its questionable use) in actix-web, yet here we are adding more unsafe code 😄.

dhardy (Member) commented Jun 20, 2018

Hmm. My take from a quick glance at that discussion is (a) that unsafe is sometimes required and sometimes merely used for performance (as here), and (b) that one must be careful to make unsafe uses easy to review and not mask real issues.

In this case the code is quite easy to understand, so not a big issue I think.

pitdicker (Contributor) commented:

I agree, and am all for it in this case.

Pazzaz (Contributor, author) commented Jun 20, 2018

> Can you remove the new benchmark?

Sorry, I didn't see the existing one. I've removed it.

> Can you add a comment noting that this was investigated?

Done.

pitdicker (Contributor) commented:

Thank you!

sicking (Contributor) commented Jun 20, 2018

One way to reduce the concern about unsafe code would be to add a debug_assert! which checks that char::from_u32(n) would have returned the same result.
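
A minimal sketch of such a check (the helper name is hypothetical; n stands for the sampled code point, as in the PR):

    // Assert in debug builds that the checked conversion would succeed (and
    // therefore agree with the unchecked one) before converting.
    fn debug_checked_from_u32(n: u32) -> char {
        debug_assert!(std::char::from_u32(n).is_some(),
                      "{:#x} is not a valid Unicode scalar value", n);
        unsafe { std::char::from_u32_unchecked(n) }
    }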

Review comments on the diff:

@@ -44,15 +44,21 @@ pub struct Alphanumeric;
 impl Distribution<char> for Standard {
     #[inline]
     fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> char {
-        let range = Uniform::new(0u32, 0x11_0000);

A collaborator commented on this line:

Could we make this new(0u32, char::MAX as u32)? It would be more explicit about what we're doing.

A member replied:

This line is being removed. But I don't think this is a good idea: you're off by 1 (the bound would need to be inclusive), and if char::MAX were ever to change we wouldn't know whether the whole new range (minus the existing gap) should be used. So it's better just to use local constants, as in the current implementation.
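
A small illustration of the off-by-one point, assuming Uniform::new's half-open [low, high) semantics:

    fn main() {
        // char::MAX is the last valid scalar value, so an exclusive upper
        // bound of `char::MAX as u32` would silently exclude char::MAX itself;
        // Uniform::new_inclusive would be needed instead, and the surrogate
        // gap would still have to be handled separately.
        assert_eq!(std::char::MAX as u32, 0x10FFFF);
        assert_eq!(0x11_0000u32 - 1, std::char::MAX as u32);
    }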

dhardy (Member) commented Jun 21, 2018

> One way to reduce the concern about unsafe code would be to add a debug_assert! which checks that char::from_u32(n) would have returned the same result.

I was thinking char::from_u32(n).unwrap_unchecked() where the latter method does a debug assert. But this requires a new Option method!

dhardy merged commit af1303c into rust-random:master on Jun 21, 2018
TheIronBorn (Collaborator) commented Jun 23, 2018

Debug assertions aren't run with cargo test --release; a check guarded by #[cfg(any(test, debug_assertions))] would be better.
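
A sketch of what that guard could look like (an illustrative helper, not the merged code):

    // The check runs in debug builds and under `cfg(test)`, so it is not
    // silently skipped by `cargo test --release`, but it compiles away in
    // ordinary release builds.
    #[cfg(any(test, debug_assertions))]
    fn assert_valid_scalar(n: u32) {
        assert!(std::char::from_u32(n).is_some(),
                "{:#x} is not a valid Unicode scalar value", n);
    }

    #[cfg(not(any(test, debug_assertions)))]
    fn assert_valid_scalar(_n: u32) {}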

sicking (Contributor) commented Jun 23, 2018

Yeah. An unwrap_unchecked() function isn't really implementable in the language as it currently stands, even with unsafe code: there's no way to unwrap an enum without using a full match, like unwrap() does. I guess you could use raw pointers and rely on the binary encoding of Option, but that's shaky enough that I don't think it'll happen.

dhardy (Member) commented Jun 23, 2018

I wonder if it would be possible with some ugly trick like transmuting to enum UncheckedSomeOption<T> { Some(T), None(!) }. But we're way off topic.

I guess adding debug_assert!(char::from_u32(n).is_some()); would be reasonable here.
