Efficient creation of all-ones bitset #101
Comments
As of #86, your implementation of |
@james7132 Sorry if the answer is obvious, but are you referring to the first or the second implementation? Looking at the implementation of |
Your second one is close to optimal now. There is no collect after #86. It preallocates a zeroed buffer of the right size then writes over it using the provided blocks up to the provided capacity. This does a second pass over the buffer, so we can probably make it more efficient by using a |
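To make the "second pass" concrete, here is a minimal sketch of the two strategies using a plain `Vec` instead of the crate's internals. The names `zero_then_overwrite` and `pad_and_take` and the `Block = usize` alias are illustrative assumptions, not the crate's API, and the single-pass variant is only one possible way to avoid the extra pass, not necessarily what was meant above.

```rust
use std::iter;

// Stand-in for the crate's block type (an assumption; the real `Block` may differ).
type Block = usize;

/// Two-pass approach as described above: zero the whole buffer first,
/// then overwrite a prefix with the provided blocks.
fn zero_then_overwrite<I: IntoIterator<Item = Block>>(n_blocks: usize, blocks: I) -> Vec<Block> {
    let mut data = vec![0 as Block; n_blocks];
    // `zip` stops as soon as the buffer is exhausted, so extra input
    // blocks (or an infinite iterator) are simply ignored.
    for (slot, value) in data.iter_mut().zip(blocks) {
        *slot = value;
    }
    data
}

/// Hypothetical single-pass alternative: pad the input with zeros and
/// take exactly `n_blocks` items, so every slot is written once.
fn pad_and_take<I: IntoIterator<Item = Block>>(n_blocks: usize, blocks: I) -> Vec<Block> {
    blocks
        .into_iter()
        .chain(iter::repeat(0))
        .take(n_blocks)
        .collect()
}
```

Both functions return the same contents; they differ only in how many times each slot is written.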
Thanks for the clarification! Would you accept a PR that adds a |
Yes go ahead! Should be pretty close to |
@james7132 My idea was to literally use the second implementation. Do you think that would be suboptimal? Ideally

    pub fn with_capacity_and_blocks<I: IntoIterator<Item = Block>>(bits: usize, blocks: I) -> Self {
        unsafe {
            // start off with uninitialized bitset for efficiency
            let mut bitset = Self::with_capacity_uninit(bits);
            let Range {
                start: mut subblock,
                end,
            } = bitset.as_mut_slice().as_mut_ptr_range();
            // copy data from `blocks`
            for value in blocks {
                if subblock == end {
                    break;
                }
                subblock.write(value);
                subblock = subblock.add(1);
            }
            // zero out the rest
            while subblock != end {
                subblock.write(0);
                subblock = subblock.add(1);
            }
            bitset
        }
    }

Then a |
The main gotcha with that implementation is that a Vec<MaybeUninit> needs to be used, and for the implementation to be sound we cannot convert it to an initialized type until the entire vec is populated. This likely means many of the private utility functions you're using there will be unusable.
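The ordering constraint described here can be sketched with standard-library types only. This is an illustration under stated assumptions: `filled_blocks` is a hypothetical name, `Block = usize` is a stand-in for the crate's block type, and a plain `Vec` replaces the bitset's own storage.

```rust
use std::mem::{ManuallyDrop, MaybeUninit};

// Stand-in for the crate's block type (an assumption).
type Block = usize;

/// Hypothetical helper: fill an uninitialized buffer in a single pass and
/// only then reinterpret it as initialized data.
fn filled_blocks<I: IntoIterator<Item = Block>>(n_blocks: usize, blocks: I) -> Vec<Block> {
    let mut buf: Vec<MaybeUninit<Block>> = Vec::with_capacity(n_blocks);
    // SAFETY: `MaybeUninit<T>` requires no initialization, and
    // `n_blocks` does not exceed the capacity requested above.
    unsafe { buf.set_len(n_blocks) };

    // `fuse` guarantees that `next` keeps returning `None` once exhausted.
    let mut input = blocks.into_iter().fuse();
    for slot in buf.iter_mut() {
        // Use the next input block, or zero once the input runs out.
        slot.write(input.next().unwrap_or(0));
    }

    // Every element is initialized at this point, so the conversion is allowed.
    let mut buf = ManuallyDrop::new(buf);
    let (ptr, len, cap) = (buf.as_mut_ptr(), buf.len(), buf.capacity());
    // SAFETY: `MaybeUninit<Block>` has the same layout as `Block`, all `len`
    // elements were written above, and ptr/len/cap come from a live `Vec`
    // whose destructor is suppressed by `ManuallyDrop`.
    unsafe { Vec::from_raw_parts(ptr.cast::<Block>(), len, cap) }
}
```

The key point is that no `&mut [Block]` or `Vec<Block>` view of the buffer exists until the loop has written every slot.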
Oh, right, thanks for explaining. I think we can avoid

    pub fn with_capacity_and_blocks<I: IntoIterator<Item = Block>>(bits: usize, blocks: I) -> Self {
        if bits == 0 {
            return Self::new();
        }
        let simd_block_cnt = bits.div_ceil(SimdBlock::BITS);
        let block_cnt = bits.div_ceil(Block::BITS as usize);
        // SAFETY: We use Vec::with_capacity() to obtain uninitialized memory, and
        // initialize all of it before passing ownership to the returned FixedBitSet
        unsafe {
            let mut vec = Vec::<SimdBlock>::with_capacity(simd_block_cnt);
            let mut subblock = vec.as_mut_ptr().cast::<Block>();
            let subblock_end = subblock.add(block_cnt);
            assert!(subblock_end != subblock); // we handle bits == 0 at the beginning
            // copy as much as we can from blocks
            for value in blocks {
                subblock.write(value);
                subblock = subblock.add(1);
                if subblock == subblock_end {
                    break;
                }
            }
            // zero out the remainder of the allocation
            let simd_block_end = vec.as_mut_ptr().add(simd_block_cnt).cast::<Block>();
            core::ptr::write_bytes(subblock, 0, simd_block_end.offset_from(subblock) as usize);
            let data = NonNull::new_unchecked(vec.as_mut_ptr());
            // FixedBitSet is taking over the ownership of vec's data
            core::mem::forget(vec);
            FixedBitSet {
                data,
                capacity: simd_block_cnt,
                length: bits,
            }
        }
    }
I need to create a bitset of known size with all bits set. Currently I do it with:
This is not hard to write, but unlike Vec::with_capacity(), FixedBitSet::with_capacity() initializes the bitset to zeros, only for my code to immediately set it to ones. With largish bitsets I'd like to avoid the unnecessary initial resetting. Looking at assembly output in release mode, it seems that the compiler doesn't inline with_capacity(), from which I infer that it cannot eliminate the initial zeroing.
The question is, what is the most efficient way to create an all-ones bitset? Is it the above code, or rather something like:
And should the crate include a constructor like FixedBitSet::ones_with_capacity() (possibly with a more optimized implementation)?
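As a rough illustration of what an all-ones constructor has to produce, the sketch below builds the block contents directly, without an initial zeroing pass. It uses a plain `Vec` and a hypothetical function name `all_ones_blocks`; `Block = usize` is an assumption, and this is not the crate's implementation. The one subtlety is the final block: bits beyond the requested length are cleared so that bit counts stay correct.

```rust
// Stand-in for the crate's block type (an assumption).
type Block = usize;
const BITS: usize = Block::BITS as usize;

/// Hypothetical helper: block contents of a bitset with `bits` bits, all set.
fn all_ones_blocks(bits: usize) -> Vec<Block> {
    // Number of blocks needed, rounding up without risking overflow.
    let n_blocks = bits / BITS + usize::from(bits % BITS != 0);
    // Every block starts fully set; no zeroing pass is involved.
    let mut data = vec![Block::MAX; n_blocks];
    // Clear the unused high bits of the last, partially used block.
    let rem = bits % BITS;
    if rem != 0 {
        if let Some(last) = data.last_mut() {
            *last = (1 << rem) - 1;
        }
    }
    data
}
```

For example, `all_ones_blocks(3)` yields a single block with only the three lowest bits set.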