Panic when using OrdMap #124

itamarst · 2020-03-21T14:17:24Z

thread '<unnamed>' panicked at 'Chunk::push_front: can't push to full chunk', /home/itamarst/.cargo/registry/src/github.com-1ecc6299db9ec823/sized-chunks-0.5.1/src/sized_chunk.rs:297:13
stack backtrace:                                                                              
   0: backtrace::backtrace::libunwind::trace                                                  
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/libunwind.rs:88
   1: backtrace::backtrace::trace_unsynchronized                                               
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/mod.rs:66
   2: std::sys_common::backtrace::_print_fmt
             at src/libstd/sys_common/backtrace.rs:84                                          
   3: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt        
             at src/libstd/sys_common/backtrace.rs:61                                          
   4: core::fmt::write                         
             at src/libcore/fmt/mod.rs:1025
   5: std::io::Write::write_fmt
             at src/libstd/io/mod.rs:1426
   6: std::sys_common::backtrace::_print
             at src/libstd/sys_common/backtrace.rs:65                                          
   7: std::sys_common::backtrace::print
             at src/libstd/sys_common/backtrace.rs:50                                          
   8: std::panicking::default_hook::{{closure}}                                                
             at src/libstd/panicking.rs:193
   9: std::panicking::default_hook
             at src/libstd/panicking.rs:210
  10: std::panicking::rust_panic_with_hook
             at src/libstd/panicking.rs:471
  11: std::panicking::begin_panic
             at /rustc/5e1a799842ba6ed4a57e91f7ab9435947482f7d8/src/libstd/panicking.rs:404    
  12: im::nodes::btree::Node<A>::remove_index
             at /home/itamarst/.cargo/registry/src/github.com-1ecc6299db9ec823/im-14.2.0/./src/nodes/btree.rs:0
  13: pymemprofile_free_allocation
             at /home/itamarst/.cargo/registry/src/github.com-1ecc6299db9ec823/im-14.2.0/./src/nodes/btree.rs:534
... unrelated code ...

I was unable to reproduce with later runs.

The basic usage pattern here is a series of insert()s and remove()s, with an ocassional clone().

The text was updated successfully, but these errors were encountered:

plaidfinch · 2020-04-08T22:42:51Z

I can confirm this panic—it occurred during benchmarking a data structure that uses OrdMap internally. It reliably occurs during my testing when the size of the map is at least 65_536, but occasionally shows up at smaller sizes.

I do not yet have a minimal example demonstrating this behavior because it's part of the context of a larger algorithm. The gist of the use pattern is that the map is repeatedly cloned, then one element is updated, then it is cloned again, over and over. I'll try to make a minimal example demonstrating this use pattern and see if it replicates the bug.

rjmac · 2020-06-23T00:32:46Z

I've run into this too, and I've got a reproduction. It's minimal in the sense that no line of the attached file can be removed without preventing the panic, and it seems (but I haven't exhaustively checked) that none of the matched insert/remove pairs can be removed either.

use std::io::prelude::*;
use std::fs::File;
use im_rc::OrdMap;

fn main() {
    let mut map = OrdMap::new();

    let mut file = File::open("ops.txt").unwrap();
    let mut contents = String::new();
    file.read_to_string(&mut contents).unwrap();

    for line in contents.split('\n') {
        println!("{}", line);
        if line.starts_with("insert ") {
            map.insert(line[7..].parse::<u32>().unwrap(), 0);
        } else if line.starts_with("remove ") {
            map.remove(&line[7..].parse::<u32>().unwrap());
        }
    }
}

The ops.txt file is attached

Removal could cause a value to be stolen from a neighbor node without checking that the node doing the stealing had capacity for it. It should be possible to construct a reproduction smaller than the one in the new test, but I haven't found one significantly so.

Removal could cause a value to be moved from one node to another without checking that the target node had room for it.

rjmac · 2020-06-28T00:08:01Z

I've been working on this a bit, and I'm fairly sure the problem is in nodes::btree::Node::remove_index, where it does StealFromLeft/StealFromRight in the Ok branch. Specifically, if the other child is already full, that panics. I'm not sure what the right thing to do is. I've tried a few things, but only managed to break the code in other ways.

I think that perhaps instead of

                    // If the left hand child has capacity, pull the predecessor up.
                    (&Some(ref left), _) if !left.too_small() => {
                        if left.is_leaf() {
                            RemoveAction::PullUp(left.keys.len() - 1, index, index)
                        } else {
                            RemoveAction::StealFromLeft(index + 1)
                        }
                    }

it should be something like

                    // If the left hand child has capacity, pull the predecessor up.
                    (&Some(ref left), &Some(ref right)) if !left.too_small() => {
                        if left.is_leaf() {
                            RemoveAction::PullUp(left.keys.len() - 1, index, index)
                        } else if right.has_room() {
                            RemoveAction::StealFromLeft(index + 1)
                        } else if left.has_room() {
                            // we know right isn't too small because it doesn't have room
                            RemoveAction::StealFromRight(index)
                        } else {
                             // both children are completely full, pick the minimum
                             // from the right child or the maximum from the left
                             // child and put it here, and do it without affecting
                             // the other child.
                             RemoveAction::ReplaceWithPreviousOrNext(index)
                        }
                    }
                    // the mirror branch pulling from `right` doesn't need to change,
                    // because left is too_small and therefore definitely has room
                    // to be participate in `StealFromRight`.

but I'm not positive of that, and in particular I'm not sure how to write the handler for ReplaceWithPreviousOrNext.

itamarst · 2020-07-10T15:22:44Z

I would love for this to be fixed so I can switch to OrdMap.

As some context, and a thank you for creating im-rs:

im-rs has been immensely helpful in making the Python memory profiler I've written (https://pythonspeed.com/products/filmemoryprofiler/) fairly fast: I'm storing an entry for every single call to malloc() and friends. However, right now I'm using the immutable HashMap, and the memory overhead is massive—tracking a couple million allocations that take 90MB of RAM has 900MB tracking overhead. Unless I'm measuring something wrong, OrdMap has much smaller overhead of 180MB.

* Fix bug in remove_index for OrdMap * Remove unused is_leaf * Reproducible test for issue #124 - hopefully. Co-authored-by: Bodil Stokke <bodil@bodil.org>

bodil · 2020-07-14T12:31:27Z

OK, I've merged #141 which fixes the repro case above, but of course it might not be the full story, given the original report isn't reproducible. We'll just have to hope...

itamarst · 2020-07-14T13:20:27Z

Thank you! I tried creating a proptest reproducer this weekend, and failed, everything passed. Might try again at some point.

rjmac · 2020-07-14T16:30:53Z

Here's a new fixture that fails with the current version.

This is the code I'm using to generate these, by the way (click to expand).

use ::std::collections::{BTreeMap, HashSet};
use ::std::io::prelude::*;
use ::std::io::BufReader;

use ::rand::rngs::StdRng;
use ::rand::{Rng, RngCore, SeedableRng, thread_rng};
use ::im_rc::OrdMap;

#[derive(Copy, Clone, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum Op {
    Insert(u32),
    Remove(u32)
}

impl Op {
    fn value(&self) -> u32 {
        match self {
            Self::Insert(i) => *i,
            Self::Remove(i) => *i
        }
    }

    fn value_mut(&mut self) -> &mut u32 {
        match self {
            Self::Insert(i) => i,
            Self::Remove(i) => i
        }
    }
}

fn load() -> Vec<Op> {
    let mut result = Vec::new();

    for line in BufReader::new(::std::fs::File::open("ops.txt").expect("couldn't open ops.txt")).lines() {
        let line = line.expect("Failed to read line");
        if line.starts_with("insert ") {
            result.push(Op::Insert(line[7..].parse::<u32>().unwrap_or_else(|_| panic!("Unparsable line {:?}", line))))
        } else if line.starts_with("remove ") {
            result.push(Op::Remove(line[7..].parse::<u32>().unwrap_or_else(|_| panic!("Unparsable line {:?}", line))))
        } else {
            panic!("unintelligible line {:?}", line)
        }
    }

    result
}

fn randoms() {
    let mut seed: [u8; 32] = [0; 32];
    thread_rng().fill_bytes(&mut seed);
    eprintln!("{:?}", seed);

    let seed = seed;

    let _ = std::panic::catch_unwind(|| {
        let mut rng = StdRng::from_seed(seed);

        let mut map = OrdMap::new();
        let mut stdmap = BTreeMap::new();

        loop {
            if rng.next_u32() & 0x80 == 0 {
                let x = rng.gen_range(0, 8192*2);
                println!("insert {}", x);
                map.insert(x, ());
                stdmap.insert(x, ());
            } else {
                for _ in 0..=0 {
                    let x = rng.gen_range(0, 8192*2);
                    println!("remove {}", x);
                    map.remove(&x);
                    stdmap.remove(&x);
                }
            }
            if !itertools::equal(map.iter(), stdmap.iter()) {
                panic!("mismatch!");
            }
        }
    });
}

fn remove_superfluous(ops: &[Op]) -> Option<Vec<Op>> {
    let mut removed = HashSet::new();
    for i in 0..ops.len() {
        eprintln!("{}/{}", i, ops.len());
        removed.insert(i);
        if std::panic::catch_unwind(|| {
            let mut map = OrdMap::new();
            for (idx, op) in ops.iter().enumerate() {
                if !removed.contains(&idx) {
                    match op {
                        Op::Insert(x) => { map.insert(x, 0); }
                        Op::Remove(x) => { map.remove(&x); }
                    }
                }
            }
        }).is_ok() {
            removed.remove(&i);
        }
    }

    if !is_error(ops) {
        panic!("Expected to still be erroneous");
    }

    if removed.len() == 0 {
        None
    } else {
        let mut result = Vec::with_capacity(ops.len() - removed.len());
        for (idx, op) in ops.into_iter().enumerate() {
            if !removed.contains(&idx) {
                result.push(*op);
            }
        }
        Some(result)
    }
}

fn dump(ops: &[Op]) {
    for op in ops.iter() {
        match op {
            Op::Insert(x) => println!("insert {}", x),
            Op::Remove(x) => println!("remove {}", x)
        }
    }
}

fn is_error(ops: &[Op]) -> bool {
    std::panic::catch_unwind(|| {
        let mut map = OrdMap::new();
        for op in ops.iter() {
            match op {
                Op::Insert(x) => { map.insert(x, 0); }
                Op::Remove(x) => { map.remove(&x); }
            }
        }
    }).is_err()
}

fn traversal_order(ops: &[Op]) -> Vec<usize> {
    let mut decorated = ops.iter().enumerate().collect::<Vec<_>>();
    decorated.sort_by_key(|(_, v)| v.value());
    decorated.into_iter().map(|(i, _)| i).collect::<Vec<_>>()
}

fn shrink_values(ops: &mut [Op]) -> bool {
    let mut done = false;
    let mut did_one = false;

    while !done {
        done = true;

        if quick_shrink_values(ops) {
            did_one = true;
        }

        let order = traversal_order(ops);

        let mut i = 0;
        while i < ops.len() {
            eprintln!("{}/{}", i, ops.len());

            match ops[order[i]] {
                Op::Remove(v) if v > 0 => {
                    let mut found = None;
                    for j in (0..order[i]).rev() {
                        if ops[j] == Op::Insert(v) {
                            found = Some(j);
                            break;
                        } else if ops[j] == Op::Remove(v) {
                            eprintln!("Found double-remove; stopping insert probe");
                            break;
                        }
                    }
                    if let Some(j) = found {
                        ops[order[i]] = Op::Remove(v-1);
                        ops[j] = Op::Insert(v-1);
                        if is_error(ops) {
                            eprintln!("Shrinking pair {}/{} ({}) preserved the error", j, order[i], v);
                            done = false;
                        } else {
                            eprintln!("Shrinking pair {}/{} ({}) failed to preserve the error", j, order[i], v);
                            ops[order[i]] = Op::Remove(v);
                            ops[j] = Op::Insert(v);
                            i += 1;
                        }
                    } else {
                        eprintln!("Found no insert for {}", order[i]);
                        ops[order[i]] = Op::Remove(v-1);
                        if is_error(ops) {
                            eprintln!("Shrinking remove {} ({}) preserved the error", order[i], v);
                            done = false;
                        } else {
                            eprintln!("Shrinking remove {} ({}) failed to preserve the error", order[i], v);
                            ops[order[i]] = Op::Remove(v);
                            i += 1;
                        }
                    }
                }
                Op::Insert(v) if v > 0 => {
                    ops[order[i]] = Op::Insert(v-1);
                    if is_error(ops) {
                        eprintln!("Shrinking insert {} ({}) preserved the error", order[i], v);
                        done = false;
                    } else {
                        eprintln!("Shrinking insert {} ({}) failed to preserve the error", order[i], v);
                        ops[order[i]] = Op::Insert(v);
                        i += 1;
                    }
                }
                _ => {
                    i += 1;
                }
            }
        }
        if !done { did_one = true }
    }

    if !is_error(ops) {
        panic!("Expected to still be erroneous");
    }

    did_one
}

fn quick_shrink_values(ops: &mut [Op]) -> bool {
    let mut did_one = false;

    let order = traversal_order(ops);

    let mut i = 0;
    if i < ops.len() {
        {
            let cur = ops[order[i]].value();
            let mut j = i + 1;
            while j < ops.len() && ops[order[j]].value() == cur { j += 1; }
            for idx in i..j {
                *ops[order[idx]].value_mut() = 0;
            }
            i = j;

            did_one = cur != 0;
        }

        while i < ops.len() {
            let prev = ops[order[i - 1]].value();
            let cur = ops[order[i]].value();
            let mut j = i + 1;
            while j < ops.len() && ops[order[j]].value() == cur { j += 1; }
            for idx in i..j {
                *ops[order[idx]].value_mut() = prev + 1;
            }
            i = j;

            did_one |= cur != prev+1;
        }
    }

    if !is_error(ops) {
        panic!("Expected to still be erroneous");
    }

    did_one
}

fn main() {
    let cmd = std::env::args().skip(1).next().expect("No command specified");
    match cmd.as_ref() {
        "generate" => {
            randoms()
        }
        "shrink" => {
            let mut ops = load();
            loop {
                match remove_superfluous(&ops) {
                    None => {
                        if !shrink_values(&mut ops) {
                            break
                        }
                    }
                    Some(new_ops) => {
                        ops = new_ops;
                        shrink_values(&mut ops);
                    }
                }
            }
            dump(&ops);
        }
        _ =>
            panic!("unknown command {}", cmd)
    }
}

"generate" will write a sequence of "insert x" / "remove x" commands onto stdout. "shrink" reads "ops.txt", removes everything it can while preserving the error, and sends a new sequence of commands to stdout.

The "generate" command doesn't always actually find a problem (it took several tries to find this instance), but when it does it generally only takes a few seconds to do so.

bodil · 2020-07-14T16:35:09Z

Proptest has also found one or two issues that seem to have popped up after merging #141, so I guess we're reopening this.

…o case.

Summary: the `im`/`im-rc` crate contains logic bugs (we are hitting problems outlined in bodil/im-rs#124 sporadically) and appears unmaintained. This diff switches to the `rpds` crate which offers the same kind of functionality and appears to be better maintained. Reviewed By: CatherineGasnier Differential Revision: D34339520 fbshipit-source-id: 5fd5eab468f46d0fcc5294d3d7142e1c8541fbd5

rjmac pushed a commit to rjmac/im-rs that referenced this issue Jun 27, 2020

Fix bodil#124

e16aa34

Removal could cause a value to be moved from one node to another without checking that the target node had room for it.

rjmac pushed a commit to rjmac/im-rs that referenced this issue Jun 27, 2020

Fix bodil#124

af9c22b

Removal could cause a value to be moved from one node to another without checking that the target node had room for it.

dsarlis mentioned this issue Jul 1, 2020

Fix bug in remove_index for OrdMap #141

Merged

itamarst mentioned this issue Jul 10, 2020

Fil has a huge amount of memory overhead per allocation due to use of HashMap pythonspeed/filprofiler#45

Closed

bodil self-assigned this Jul 14, 2020

bodil added a commit to dsarlis/im-rs that referenced this issue Jul 14, 2020

Reproducible test for issue bodil#124 - hopefully.

1728d2b

bodil added a commit that referenced this issue Jul 14, 2020

Reproducible test for issue #124 - hopefully.

7aaf0ad

bodil closed this as completed in #141 Jul 14, 2020

bodil added a commit that referenced this issue Jul 14, 2020

Fix bug in remove_index for OrdMap (#141)

c35a913

* Fix bug in remove_index for OrdMap * Remove unused is_leaf * Reproducible test for issue #124 - hopefully. Co-authored-by: Bodil Stokke <bodil@bodil.org>

bodil added a commit that referenced this issue Jul 14, 2020

Make the issue #124 fixture work on Windows. 🤦‍♀️

d8d8198

bodil reopened this Jul 14, 2020

bodil added a commit that referenced this issue Jul 27, 2020

Use str::lines() instead of splitting and stripping for the #124 repr…

2cae6fb

…o case.

arthurprs mentioned this issue Oct 16, 2020

Fix OrdMap removal/range logic #154

Merged

xu-cheng mentioned this issue May 2, 2021

Is this crate still maintained? #185

Open

Eh2406 mentioned this issue Sep 8, 2021

Drop the im-rc dependency rust-lang/cargo#9878

Closed

bodil closed this as completed in #154 Apr 29, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Panic when using OrdMap #124

Panic when using OrdMap #124

itamarst commented Mar 21, 2020

plaidfinch commented Apr 8, 2020 •

edited

rjmac commented Jun 23, 2020

rjmac commented Jun 28, 2020

itamarst commented Jul 10, 2020 •

edited

bodil commented Jul 14, 2020

itamarst commented Jul 14, 2020

rjmac commented Jul 14, 2020

bodil commented Jul 14, 2020

Panic when using OrdMap #124

Panic when using OrdMap #124

Comments

itamarst commented Mar 21, 2020

plaidfinch commented Apr 8, 2020 • edited

rjmac commented Jun 23, 2020

rjmac commented Jun 28, 2020

itamarst commented Jul 10, 2020 • edited

bodil commented Jul 14, 2020

itamarst commented Jul 14, 2020

rjmac commented Jul 14, 2020

bodil commented Jul 14, 2020

plaidfinch commented Apr 8, 2020 •

edited

itamarst commented Jul 10, 2020 •

edited