Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add BlockedBitpacker #1030

Merged
merged 8 commits into from
May 3, 2021
Merged

add BlockedBitpacker #1030

merged 8 commits into from
May 3, 2021

Conversation

PSeitz
Copy link
Contributor

@PSeitz PSeitz commented Apr 29, 2021

move tantivy bitpacker to subcrate
remove byteorder dependency
use BlockedBitpacker in IntFastFieldWriter

@PSeitz
Copy link
Contributor Author

PSeitz commented Apr 29, 2021

#1027

@PSeitz PSeitz requested a review from fulmicoton April 29, 2021 18:28
fix memory_usage calculation
calc mem_usage of more structs in index creation
add some comments
}

pub fn flush(&mut self) {
if self.cache.is_empty() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick...

nowadays I like to avoid the unwrap using the following pattern instead of line 90 and line 86-88

let base_value = if let Some(min_value) = self.cache.iter().min() {
   min_value
} else {
   return;
};

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can do that, but there is still a unwrap() in line 96

.cache
.iter()
.map(|val| compute_num_bits(*val - base_value))
.max()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is minmax in common. You can move it into bitpacker maybe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm I think the current version is easier to understand, since we don't use min here

let num_bits_block = self
    .buffer
    .iter()
    .map(|val| compute_num_bits(*val - base_value))
    .max()
    .unwrap();

let (_, num_bits_block) = minmax(
    self.buffer
        .iter()
        .map(|val| compute_num_bits(*val - base_value)),
).unwrap();

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i meant combine with the idea above of course

let (base_val, max_val) = if let Some((min_val, val)) = min_max() {
   (min_val, max_val)
} else {
   return
};
let num_bits = compute_num_bits(max_val - base_val);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah sure that makes sense

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, 10% faster

num_bits_block,
&mut self.compressed_blocks,
)
.expect("cannot write bitpacking to output"); // write to im can't fail
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im means in memory?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

move minmax to bitpacker
use minmax in blocked bitpacker
pub fn flush(&mut self) {
if let Some((min_value, max_value)) = minmax(self.buffer.iter()) {
let mut bit_packer = BitPacker::new();
let num_bits_block = compute_num_bits(*max_value - min_value);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprise this * compiles... what is it about?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a residue from before. It seems usize implements sub on &usize

https://doc.rust-lang.org/std/primitive.usize.html#impl-Sub%3C%26%27_%20usize%3E

@fulmicoton fulmicoton merged commit ec4834c into quickwit-oss:main May 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants