Add BitSet #235

Open
wants to merge 2 commits into
base: main

12 changes: 12 additions & 0 deletions doc/set.md
@@ -0,0 +1,12 @@
# Automatically-Managed Index Set

This module defines the [`BitSet`] collection as a useful wrapper over a
[`BitVec`].

A `BitVec` is a very efficient way of storing a set of [`usize`] values since
the various set operations can be easily represented using bit operations.
However, a `BitVec` is less ergonomic than a `BitSet` because of the need to
resize when inserting elements larger than any already in the set.
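
As a rough illustration of why set operations map onto bit operations, the sketch below packs small indices into a single `u64` and uses `|`, `&`, and `& !` for union, intersection, and difference. It is standalone and does not use `bitvec`; every helper name here is hypothetical, and the single-word limit is an assumption made to keep the example short.

```rust
/// Hypothetical helper: packs indices below 64 into a single `u64` word.
fn mask_of(indices: &[usize]) -> u64 {
    indices.iter().fold(0u64, |acc, &i| {
        assert!(i < 64, "this sketch only covers one 64-bit word");
        acc | (1u64 << i)
    })
}

/// Hypothetical helper: lists the indices whose bits are set, ascending.
fn indices_of(mask: u64) -> Vec<usize> {
    (0..64usize).filter(|&i| mask & (1u64 << i) != 0).collect()
}

/// Union is a bitwise OR of the two words.
fn union(a: u64, b: u64) -> u64 {
    a | b
}

/// Intersection is a bitwise AND of the two words.
fn intersection(a: u64, b: u64) -> u64 {
    a & b
}

/// Difference keeps the bits of `a` that are not set in `b`.
fn difference(a: u64, b: u64) -> u64 {
    a & !b
}
```

A multi-word set would apply the same operators to each pair of words in turn.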

[`BitSet`]: crate::set::BitSet
[`BitVec`]: crate::vec::BitVec
70 changes: 70 additions & 0 deletions doc/set/BitSet.md
@@ -0,0 +1,70 @@
# Packed-Bits Set

This is a data structure that consists of an automatically managed [`BitVec`]
which stores a set of `usize` values as `true` bits in the `BitVec`.

The main benefit of this structure is that it automatically manages the memory
backing the [`BitVec`], which must grow to hold the largest index stored in the
set. If you know the bounds of your data ahead of time, you may prefer to use a
regular [`BitVec`] or even a [`BitArray`] instead; the latter is allocated on
the stack instead of the heap.
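
To make the "automatic handling" concrete, here is a minimal sketch of an insert that grows its storage on demand. It is not the implementation in this PR: `TinySet` and its `Vec<u64>` backing are hypothetical stand-ins chosen so the example compiles without `bitvec`.

```rust
/// Hypothetical growable index set backed by `u64` words; illustration only.
#[derive(Default)]
struct TinySet {
    words: Vec<u64>,
}

impl TinySet {
    /// Inserts `index`, growing the backing storage when it is out of range.
    /// Returns `true` if the index was not already present.
    fn insert(&mut self, index: usize) -> bool {
        let (word, bit) = (index / 64, index % 64);
        if word >= self.words.len() {
            // This is the resize step a caller would otherwise do by hand.
            self.words.resize(word + 1, 0);
        }
        let mask = 1u64 << bit;
        let was_absent = self.words[word] & mask == 0;
        self.words[word] |= mask;
        was_absent
    }

    /// Reports membership; indices past the end of storage are simply absent.
    fn contains(&self, index: usize) -> bool {
        self.words
            .get(index / 64)
            .map_or(false, |w| w & (1u64 << (index % 64)) != 0)
    }
}
```

With a fixed-size buffer, the `resize` branch disappears and an out-of-range insert has to be rejected or treated as an error instead.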

## Documentation Practices

`BitSet` attempts to replicate the API of the standard-library `BTreeSet` type,
including inherent methods, trait implementations, and relationships with the
[`BitSet`] analogue.

Items that are either direct ports, or renamed variants, of standard-library
APIs will have a `## Original` section that links to their standard-library
documentation. Items that map to standard-library APIs but have a different API
signature will also have an `## API Differences` section that describes what
the difference is, why it exists, and how to transform your code to fit it. For
example:

## Original

[`BTreeSet<T>`](alloc::collections::BTreeSet)

## API Differences

As with all `bitvec` data structures, this takes two type parameters `<T, O>`
that govern the bit-vector’s storage representation in the underlying memory,
and does *not* take a type parameter to govern what data type it stores (always
`usize`).

### Accessing the internal [`BitVec`]

Since `BitSet` is merely an API over the internal `BitVec`, you can freely
take ownership of the internal buffer or borrow the buffer as a `BitSlice`.

However, since this would be inconsistent with the set-style API, these
conversions require dedicated methods instead of simple deref:

```rust
use bitvec::prelude::*;
use bitvec::set::BitSet;

fn mutate_bitvec(vec: &mut BitVec) {
    // …
}

fn read_bitslice(bits: &BitSlice) {
    // …
}

let mut bs: BitSet = BitSet::new();
bs.insert(10);
bs.insert(20);
bs.insert(30);
read_bitslice(bs.as_bitslice());
mutate_bitvec(bs.as_mut_bitvec());
```

Since a `BitSet` requires no additional invariants over `BitVec`, any mutations
to the internal vec are allowed without restrictions. For more details on the
safety guarantees of [`BitVec`], see its specific documentation.

[`BitArray`]: crate::array::BitArray
[`BitSet`]: crate::set::BitSet
[`BitVec`]: crate::vec::BitVec
14 changes: 14 additions & 0 deletions doc/set/iter.md
@@ -0,0 +1,14 @@
# Bit-Set Iteration

This module provides iteration protocols for `BitSet`, including:

- extension of existing bit-sets with new data
- collection of data into new bit-sets
- iteration over the contents of a bit-set

`BitSet` implements `Extend` and `FromIterator` for sources of `usize`.
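
The shape of these two trait implementations can be sketched on a stand-in type. `IndexSet` below is hypothetical and backed by a plain `Vec<bool>`; it only illustrates how `FromIterator` can be written in terms of `Extend`, and is not taken from the PR's `traits` module.

```rust
use std::iter::FromIterator;

/// Hypothetical index set: `bits[i]` is `true` when `i` is a member.
#[derive(Default, Debug, PartialEq)]
struct IndexSet {
    bits: Vec<bool>,
}

impl Extend<usize> for IndexSet {
    fn extend<I: IntoIterator<Item = usize>>(&mut self, iter: I) {
        for index in iter {
            // Grow the storage so that `index` is addressable.
            if index >= self.bits.len() {
                self.bits.resize(index + 1, false);
            }
            self.bits[index] = true;
        }
    }
}

impl FromIterator<usize> for IndexSet {
    fn from_iter<I: IntoIterator<Item = usize>>(iter: I) -> Self {
        // Collecting is extending an empty set.
        let mut set = Self::default();
        set.extend(iter);
        set
    }
}

impl IndexSet {
    /// Yields the members in ascending order, by value.
    fn iter(&self) -> impl Iterator<Item = usize> + '_ {
        self.bits
            .iter()
            .enumerate()
            .filter_map(|(i, &b)| if b { Some(i) } else { None })
    }
}
```

Duplicate inputs collapse naturally, since setting an already-set bit changes nothing.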

Since iterating over a set is the same as iterating over the indices of the
`true` bits in a bit-slice, the set iterator is the [`IterOnes`] iterator from
the `slice` module rather than a dedicated wrapper type.

[`IterOnes`]: crate::slice::IterOnes
33 changes: 33 additions & 0 deletions doc/set/iter/Range.md
@@ -0,0 +1,33 @@
# Bit-Set Range Iteration

This view iterates over the elements in a bit-set within a given range. It is
created by the [`BitSet::range`] method.

## Original

[`btree_set::Range`](alloc::collections::btree_set::Range)

## API Differences

Since the `usize` values are not physically stored in the set, this yields
`usize` values instead of references.

## Examples

```rust
use bitvec::prelude::*;
use bitvec::set::BitSet;

let mut bs: BitSet = BitSet::new();
bs.insert(1);
bs.insert(2);
bs.insert(3);
bs.insert(4);
for val in bs.range(2..6) {
# #[cfg(feature = "std")] {
    println!("{val}");
# }
}
```
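
For readers without the crate at hand, the by-value behaviour can be sketched on a plain slice of `bool`. The function name `members_in_range` is hypothetical, and clamping the range to the stored length is an assumption of this sketch rather than a documented property of `BitSet::range`.

```rust
use std::ops::Range;

/// Hypothetical stand-in for a bit-set: `bits[i]` is `true` when `i` is a
/// member. Yields the members inside `range` as owned `usize` values.
fn members_in_range(bits: &[bool], range: Range<usize>) -> impl Iterator<Item = usize> + '_ {
    // Assumption: clamp to the stored length so a range past the end is fine.
    let end = range.end.min(bits.len());
    let start = range.start.min(end);
    // The index itself is the element, so there is nothing to borrow.
    (start..end).filter(move |&i| bits[i])
}
```

Because each element is computed from a bit position, the iterator has no stored `usize` to hand out a reference to.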

[`BitSet::range`]: crate::set::BitSet::range
2 changes: 1 addition & 1 deletion rustfmt-stable.toml
@@ -13,7 +13,7 @@
# attr_fn_like_width = 70 # Leave implicit
# chain_width = 60 # Leave implicit
edition = "2018"
-fn_args_layout = "Tall"
+fn_params_layout = "Tall"
Author

I made these changes because the original option name is deprecated in the latest nightly rustfmt, and `just format` would not work properly without them. I updated the stable config as well, since that seemed like a good idea.

# fn_call_width = 60 # Leave implicit
force_explicit_abi = true
hard_tabs = true
4 changes: 2 additions & 2 deletions rustfmt.toml
@@ -14,7 +14,7 @@
# attr_fn_like_width = 70 # Leave implicit
# chain_width = 60 # Leave implicit
edition = "2018"
-fn_args_layout = "Tall"
+fn_params_layout = "Tall"
# fn_call_width = 60 # Leave implicit
force_explicit_abi = true
hard_tabs = true
@@ -69,7 +69,7 @@ normalize_comments = false
normalize_doc_attributes = false
overflow_delimited_expr = true
reorder_impl_items = true
-required_version = "1.5.1"
+required_version = "1.6.0"
skip_children = false
space_after_colon = true
space_before_colon = false
1 change: 1 addition & 0 deletions src/lib.rs
@@ -34,6 +34,7 @@ pub mod mem;
pub mod order;
pub mod ptr;
mod serdes;
+pub mod set;
pub mod slice;
pub mod store;
pub mod vec;
188 changes: 188 additions & 0 deletions src/set.rs
@@ -0,0 +1,188 @@
#![doc = include_str!("../doc/set.md")]
#![cfg(feature = "alloc")]

#[cfg(not(feature = "std"))]
use alloc::vec;
use core::ops;

use wyz::comu::{
    Const,
    Mut,
};

use crate::{
    boxed::BitBox,
    order::{
        BitOrder,
        Lsb0,
    },
    ptr::BitPtr,
    slice::BitSlice,
    store::BitStore,
    vec::BitVec,
};

mod api;
mod iter;
mod traits;

pub use iter::Range;

#[repr(transparent)]
#[doc = include_str!("../doc/set/BitSet.md")]
pub struct BitSet<T = usize, O = Lsb0>
where
    T: BitStore,
    O: BitOrder,
{
    inner: BitVec<T, O>,
}

/// Constructors.
impl<T, O> BitSet<T, O>
where
    T: BitStore,
    O: BitOrder,
{
    /// An empty bit-set with no backing allocation.
    pub const EMPTY: Self = Self {
        inner: BitVec::EMPTY,
    };

    /// Creates a new bit-set containing every index in `range`.
    #[inline]
    pub fn from_range(range: ops::Range<usize>) -> Self {
Author

This felt like the best analogue for `BitVec::repeat` to me. Since only allowing `RangeFrom` seemed weird, I decided to accept a proper `Range` instead.

        let mut inner = BitVec::with_capacity(range.end);
        unsafe {
            // Expose `range.end` bits, then overwrite every one of them.
            inner.set_len(range.end);
            inner[.. range.start].fill(false);
            inner[range.start ..].fill(true);
        }
        Self { inner }
    }

    /// Constructs a new bit-set from an existing bit-vec.
    #[inline]
    pub fn from_bitvec(inner: BitVec<T, O>) -> Self {
        Self { inner }
    }
}

/// Converters.
impl<T, O> BitSet<T, O>
where
    T: BitStore,
    O: BitOrder,
{
    /// Explicitly views the bit-set as a bit-slice.
    #[inline]
    pub fn as_bitslice(&self) -> &BitSlice<T, O> {
        self.inner.as_bitslice()
    }

    /// Explicitly views the bit-set as a mutable bit-slice.
    #[inline]
    pub fn as_mut_bitslice(&mut self) -> &mut BitSlice<T, O> {
        self.inner.as_mut_bitslice()
    }

    /// Explicitly views the bit-set as a bit-vec.
    #[inline]
    pub fn as_bitvec(&self) -> &BitVec<T, O> {
        &self.inner
    }

    /// Explicitly views the bit-set as a mutable bit-vec.
    #[inline]
    pub fn as_mut_bitvec(&mut self) -> &mut BitVec<T, O> {
        &mut self.inner
    }

    /// Views the bit-set as a slice of its underlying memory elements.
    #[inline]
    pub fn as_raw_slice(&self) -> &[T] {
        self.inner.as_raw_slice()
    }

    /// Views the bit-set as a mutable slice of its underlying memory
    /// elements.
    #[inline]
    pub fn as_raw_mut_slice(&mut self) -> &mut [T] {
        self.inner.as_raw_mut_slice()
    }

    /// Creates an unsafe shared bit-pointer to the start of the buffer.
    ///
    /// ## Original
    ///
    /// [`Vec::as_ptr`](alloc::vec::Vec::as_ptr)
    ///
    /// ## Safety
    ///
    /// You must initialize the contents of the underlying buffer before
    /// accessing memory through this pointer. See the `BitPtr` documentation
    /// for more details.
    #[inline]
    pub fn as_bitptr(&self) -> BitPtr<Const, T, O> {
        self.inner.as_bitptr()
    }

    /// Creates an unsafe writable bit-pointer to the start of the buffer.
    ///
    /// ## Original
    ///
    /// [`Vec::as_mut_ptr`](alloc::vec::Vec::as_mut_ptr)
    ///
    /// ## Safety
    ///
    /// You must initialize the contents of the underlying buffer before
    /// accessing memory through this pointer. See the `BitPtr` documentation
    /// for more details.
    #[inline]
    pub fn as_mut_bitptr(&mut self) -> BitPtr<Mut, T, O> {
        self.inner.as_mut_bitptr()
    }

    /// Converts a bit-set into a boxed bit-slice.
    ///
    /// This may cause a reällocation to drop any excess capacity.
    ///
    /// ## Original
    ///
    /// [`Vec::into_boxed_slice`](alloc::vec::Vec::into_boxed_slice)
    #[inline]
    pub fn into_boxed_bitslice(self) -> BitBox<T, O> {
        self.inner.into_boxed_bitslice()
    }

    /// Converts a bit-set into a bit-vec.
    #[inline]
    pub fn into_bitvec(self) -> BitVec<T, O> {
        self.inner
    }
}

/// Utilities.
impl<T, O> BitSet<T, O>
where
    T: BitStore,
    O: BitOrder,
{
    /// Truncates the inner vector just past its last set bit, or clears it if
    /// no bit is set, without changing capacity.
    #[inline]
    fn shrink_inner(&mut self) {
        match self.inner.last_one() {
            Some(idx) => self.inner.truncate(idx + 1),
            None => self.inner.clear(),
        }
    }

    /// Views the bit-set as a bit-slice that ends just past its last set bit,
    /// without mutating the set.
    #[inline]
    fn shrunken(&self) -> &BitSlice<T, O> {
        match self.inner.last_one() {
            Some(idx) => &self.inner[.. idx + 1],
            None => Default::default(),
        }
    }
}
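
For reviewers who want to check the intended behaviour of `from_range` and `shrunken` without building the crate, the following standalone sketch mirrors both on a plain `Vec<bool>`. The function names are hypothetical and the code does not touch `bitvec`; it only models the bit layout described in the diff above. Unlike the slice indexing in `from_range`, this sketch does not panic when `range.start` exceeds `range.end`.

```rust
use std::ops::Range;

/// Sketch of the `from_range` layout: `false` below `range.start`, `true`
/// from `range.start` up to (not including) `range.end`.
fn bools_from_range(range: Range<usize>) -> Vec<bool> {
    (0..range.end).map(|i| i >= range.start).collect()
}

/// Sketch of the `shrunken` view: the prefix ending at the last `true` bit,
/// or an empty slice when no bit is set.
fn shrunken(bits: &[bool]) -> &[bool] {
    match bits.iter().rposition(|&b| b) {
        Some(idx) => &bits[..=idx],
        None => &[],
    }
}
```

Trailing `false` bits carry no membership information, which is why a shrunken view can stand in for the full buffer when two sets are compared.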