Skip to content

Commit

Permalink
Auto merge of #6283 - Eh2406:slow-case, r=alexcrichton
Browse files Browse the repository at this point in the history
Use a ad hoc trie to avoid slow cases

This adds a test for the pure case described [here](#6258 (comment)). When added, this test will time out in debug (~20 sec release) with an `N` of 3.

Looking with a profiler, it is spending most of its time in [`Vec.contains`](https://github.com/rust-lang/cargo/blob/d08fd375db0014f097ee3fbdc1c595b632669f6a/src/cargo/core/resolver/conflict_cache.rs#L84). Keeping a deduplicated list with contains is `O(n^2)`, so let's change things to be `Ord` so we can use a btree to deduplicate.

Now the profiler claimed that we are spending our time [searching that `Vec`](https://github.com/rust-lang/cargo/blob/d08fd375db0014f097ee3fbdc1c595b632669f6a/src/cargo/core/resolver/conflict_cache.rs#L66). So time to work on that "vaporware", the simplest thing I could think of that lets us check for is_active in better then `O(n)` is nested hashmaps. Each hashmap has keys of a shared Id, and values of hashmaps representing all the conflicts that use that Id. When searching if the key is not active then we need not look at any of its descendants.

There are lots of ways to make this better but even this much gets the test to pass with `N` of 3. So maybe we can justify the improvements with `N` of 4? No, eavan in debug `N` of 4 hits the [50k limit](https://github.com/rust-lang/cargo/blob/d08fd375db0014f097ee3fbdc1c595b632669f6a/src/cargo/core/resolver/types.rs#L55), so the that is as far as we need to go on the conflict_cache.
  • Loading branch information
bors committed Nov 9, 2018
2 parents 241fac0 + 178ee0b commit 1292a6d
Show file tree
Hide file tree
Showing 5 changed files with 169 additions and 62 deletions.
155 changes: 117 additions & 38 deletions src/cargo/core/resolver/conflict_cache.rs
Original file line number Diff line number Diff line change
@@ -1,40 +1,125 @@
use std::collections::{HashMap, HashSet};
use std::collections::{BTreeMap, HashMap, HashSet};

use super::types::ConflictReason;
use core::resolver::Context;
use core::{Dependency, PackageId};

/// This is a Trie for storing a large number of Sets designed to
/// efficiently see if any of the stored Sets are a subset of a search Set.
enum ConflictStoreTrie {
/// a Leaf is one of the stored Sets.
Leaf(BTreeMap<PackageId, ConflictReason>),
/// a Node is a map from an element to a subTrie where
/// all the Sets in the subTrie contains that element.
Node(HashMap<PackageId, ConflictStoreTrie>),
}

impl ConflictStoreTrie {
/// Finds any known set of conflicts, if any,
/// which are activated in `cx` and pass the `filter` specified?
fn find_conflicting<F>(
&self,
cx: &Context,
filter: &F,
) -> Option<&BTreeMap<PackageId, ConflictReason>>
where
for<'r> F: Fn(&'r &BTreeMap<PackageId, ConflictReason>) -> bool,
{
match self {
ConflictStoreTrie::Leaf(c) => {
if filter(&c) {
// is_conflicting checks that all the elements are active,
// but we have checked each one by the recursion of this function.
debug_assert!(cx.is_conflicting(None, c));
Some(c)
} else {
None
}
}
ConflictStoreTrie::Node(m) => {
for (pid, store) in m {
// if the key is active then we need to check all of the corresponding subTrie.
if cx.is_active(pid) {
if let Some(o) = store.find_conflicting(cx, filter) {
return Some(o);
}
} // else, if it is not active then there is no way any of the corresponding
// subTrie will be conflicting.
}
None
}
}
}

fn insert<'a>(
&mut self,
mut iter: impl Iterator<Item = &'a PackageId>,
con: BTreeMap<PackageId, ConflictReason>,
) {
if let Some(pid) = iter.next() {
if let ConflictStoreTrie::Node(p) = self {
p.entry(pid.clone())
.or_insert_with(|| ConflictStoreTrie::Node(HashMap::new()))
.insert(iter, con);
} // else, We already have a subset of this in the ConflictStore
} else {
// we are at the end of the set we are adding, there are 3 cases for what to do next:
// 1. self is a empty dummy Node inserted by `or_insert_with`
// in witch case we should replace it with `Leaf(con)`.
// 2. self is a Node because we previously inserted a superset of
// the thing we are working on (I don't know if this happens in practice)
// but the subset that we are working on will
// always match any time the larger set would have
// in witch case we can replace it with `Leaf(con)`.
// 3. self is a Leaf that is in the same spot in the structure as
// the thing we are working on. So it is equivalent.
// We can replace it with `Leaf(con)`.
if cfg!(debug_assertions) {
if let ConflictStoreTrie::Leaf(c) = self {
let a: Vec<_> = con.keys().collect();
let b: Vec<_> = c.keys().collect();
assert_eq!(a, b);
}
}
*self = ConflictStoreTrie::Leaf(con)
}
}
}

pub(super) struct ConflictCache {
// `con_from_dep` is a cache of the reasons for each time we
// backtrack. For example after several backtracks we may have:
//
// con_from_dep[`foo = "^1.0.2"`] = vec![
// map!{`foo=1.0.1`: Semver},
// map!{`foo=1.0.0`: Semver},
// ];
// con_from_dep[`foo = "^1.0.2"`] = map!{
// `foo=1.0.1`: map!{`foo=1.0.1`: Semver},
// `foo=1.0.0`: map!{`foo=1.0.0`: Semver},
// };
//
// This can be read as "we cannot find a candidate for dep `foo = "^1.0.2"`
// if either `foo=1.0.1` OR `foo=1.0.0` are activated".
//
// Another example after several backtracks we may have:
//
// con_from_dep[`foo = ">=0.8.2, <=0.9.3"`] = vec![
// map!{`foo=0.8.1`: Semver, `foo=0.9.4`: Semver},
// ];
// con_from_dep[`foo = ">=0.8.2, <=0.9.3"`] = map!{
// `foo=0.8.1`: map!{
// `foo=0.9.4`: map!{`foo=0.8.1`: Semver, `foo=0.9.4`: Semver},
// }
// };
//
// This can be read as "we cannot find a candidate for dep `foo = ">=0.8.2,
// <=0.9.3"` if both `foo=0.8.1` AND `foo=0.9.4` are activated".
//
// This is used to make sure we don't queue work we know will fail. See the
// discussion in https://github.com/rust-lang/cargo/pull/5168 for why this
// is so important, and there can probably be a better data structure here
// but for now this works well enough!
// is so important. The nested HashMaps act as a kind of btree, that lets us
// look up which entries are still active without
// linearly scanning through the full list.
//
// Also, as a final note, this map is *not* ever removed from. This remains
// as a global cache which we never delete from. Any entry in this map is
// unconditionally true regardless of our resolution history of how we got
// here.
con_from_dep: HashMap<Dependency, Vec<HashMap<PackageId, ConflictReason>>>,
con_from_dep: HashMap<Dependency, ConflictStoreTrie>,
// `dep_from_pid` is an inverse-index of `con_from_dep`.
// For every `PackageId` this lists the `Dependency`s that mention it in `dep_from_pid`.
dep_from_pid: HashMap<PackageId, HashSet<Dependency>>,
Expand All @@ -54,47 +139,41 @@ impl ConflictCache {
cx: &Context,
dep: &Dependency,
filter: F,
) -> Option<&HashMap<PackageId, ConflictReason>>
) -> Option<&BTreeMap<PackageId, ConflictReason>>
where
for<'r> F: FnMut(&'r &HashMap<PackageId, ConflictReason>) -> bool,
for<'r> F: Fn(&'r &BTreeMap<PackageId, ConflictReason>) -> bool,
{
self.con_from_dep
.get(dep)?
.iter()
.rev() // more general cases are normally found letter. So start the search there.
.filter(filter)
.find(|conflicting| cx.is_conflicting(None, conflicting))
self.con_from_dep.get(dep)?.find_conflicting(cx, &filter)
}
pub fn conflicting(
&self,
cx: &Context,
dep: &Dependency,
) -> Option<&HashMap<PackageId, ConflictReason>> {
) -> Option<&BTreeMap<PackageId, ConflictReason>> {
self.find_conflicting(cx, dep, |_| true)
}

/// Add to the cache a conflict of the form:
/// `dep` is known to be unresolvable if
/// all the `PackageId` entries are activated
pub fn insert(&mut self, dep: &Dependency, con: &HashMap<PackageId, ConflictReason>) {
let past = self
.con_from_dep
pub fn insert(&mut self, dep: &Dependency, con: &BTreeMap<PackageId, ConflictReason>) {
self.con_from_dep
.entry(dep.clone())
.or_insert_with(Vec::new);
if !past.contains(con) {
trace!(
"{} = \"{}\" adding a skip {:?}",
dep.package_name(),
dep.version_req(),
con
);
past.push(con.clone());
for c in con.keys() {
self.dep_from_pid
.entry(c.clone())
.or_insert_with(HashSet::new)
.insert(dep.clone());
}
.or_insert_with(|| ConflictStoreTrie::Node(HashMap::new()))
.insert(con.keys(), con.clone());

trace!(
"{} = \"{}\" adding a skip {:?}",
dep.package_name(),
dep.version_req(),
con
);

for c in con.keys() {
self.dep_from_pid
.entry(c.clone())
.or_insert_with(HashSet::new)
.insert(dep.clone());
}
}
pub fn dependencies_conflicting_with(&self, pid: &PackageId) -> Option<&HashSet<Dependency>> {
Expand Down
17 changes: 9 additions & 8 deletions src/cargo/core/resolver/context.rs
Original file line number Diff line number Diff line change
@@ -1,13 +1,13 @@
use std::collections::{HashMap, HashSet};
use std::collections::{BTreeMap, HashMap, HashSet};
use std::rc::Rc;

use core::interning::InternedString;
use core::{Dependency, FeatureValue, PackageId, SourceId, Summary};
use util::CargoResult;
use util::Graph;

use super::types::{ConflictReason, DepInfo, GraphNode, Method, RcList, RegistryQueryer};
use super::errors::ActivateResult;
use super::types::{ConflictReason, DepInfo, GraphNode, Method, RcList, RegistryQueryer};

pub use super::encode::{EncodableDependency, EncodablePackageId, EncodableResolve};
pub use super::encode::{Metadata, WorkspaceResolve};
Expand Down Expand Up @@ -133,7 +133,7 @@ impl Context {
.unwrap_or(&[])
}

fn is_active(&self, id: &PackageId) -> bool {
pub fn is_active(&self, id: &PackageId) -> bool {
self.activations
.get(&(id.name(), id.source_id().clone()))
.map(|v| v.iter().any(|s| s.package_id() == id))
Expand All @@ -145,7 +145,7 @@ impl Context {
pub fn is_conflicting(
&self,
parent: Option<&PackageId>,
conflicting_activations: &HashMap<PackageId, ConflictReason>,
conflicting_activations: &BTreeMap<PackageId, ConflictReason>,
) -> bool {
conflicting_activations
.keys()
Expand Down Expand Up @@ -186,10 +186,11 @@ impl Context {
// name.
let base = reqs.deps.get(&dep.name_in_toml()).unwrap_or(&default_dep);
used_features.insert(dep.name_in_toml());
let always_required = !dep.is_optional() && !s
.dependencies()
.iter()
.any(|d| d.is_optional() && d.name_in_toml() == dep.name_in_toml());
let always_required = !dep.is_optional()
&& !s
.dependencies()
.iter()
.any(|d| d.is_optional() && d.name_in_toml() == dep.name_in_toml());
if always_required && base.0 {
self.warnings.push(format!(
"Package `{}` does not have feature `{}`. It has a required dependency \
Expand Down
4 changes: 2 additions & 2 deletions src/cargo/core/resolver/errors.rs
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
use std::collections::HashMap;
use std::collections::BTreeMap;
use std::fmt;

use core::{Dependency, PackageId, Registry, Summary};
Expand Down Expand Up @@ -73,7 +73,7 @@ pub(super) fn activation_error(
registry: &mut Registry,
parent: &Summary,
dep: &Dependency,
conflicting_activations: &HashMap<PackageId, ConflictReason>,
conflicting_activations: &BTreeMap<PackageId, ConflictReason>,
candidates: &[Candidate],
config: Option<&Config>,
) -> ResolveError {
Expand Down
8 changes: 4 additions & 4 deletions src/cargo/core/resolver/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -247,7 +247,7 @@ fn activate_deps_loop(
//
// This is a map of package id to a reason why that packaged caused a
// conflict for us.
let mut conflicting_activations = HashMap::new();
let mut conflicting_activations = BTreeMap::new();

// When backtracking we don't fully update `conflicting_activations`
// especially for the cases that we didn't make a backtrack frame in the
Expand Down Expand Up @@ -641,7 +641,7 @@ struct BacktrackFrame {
parent: Summary,
dep: Dependency,
features: Rc<Vec<InternedString>>,
conflicting_activations: HashMap<PackageId, ConflictReason>,
conflicting_activations: BTreeMap<PackageId, ConflictReason>,
}

/// A helper "iterator" used to extract candidates within a current `Context` of
Expand Down Expand Up @@ -688,7 +688,7 @@ impl RemainingCandidates {
/// original list for the reason listed.
fn next(
&mut self,
conflicting_prev_active: &mut HashMap<PackageId, ConflictReason>,
conflicting_prev_active: &mut BTreeMap<PackageId, ConflictReason>,
cx: &Context,
dep: &Dependency,
) -> Option<(Candidate, bool)> {
Expand Down Expand Up @@ -781,7 +781,7 @@ fn find_candidate(
backtrack_stack: &mut Vec<BacktrackFrame>,
parent: &Summary,
backtracked: bool,
conflicting_activations: &HashMap<PackageId, ConflictReason>,
conflicting_activations: &BTreeMap<PackageId, ConflictReason>,
) -> Option<(Candidate, bool, BacktrackFrame)> {
while let Some(mut frame) = backtrack_stack.pop() {
let next = frame.remaining_candidates.next(
Expand Down
47 changes: 37 additions & 10 deletions tests/testsuite/resolve.rs
Original file line number Diff line number Diff line change
Expand Up @@ -7,8 +7,8 @@ use cargo::util::Config;
use support::project;
use support::registry::Package;
use support::resolver::{
assert_contains, assert_same, dep, dep_kind, dep_loc, dep_req, loc_names, names,
pkg, pkg_dep, pkg_id, pkg_loc, registry, registry_strategy, resolve, resolve_and_validated,
assert_contains, assert_same, dep, dep_kind, dep_loc, dep_req, loc_names, names, pkg, pkg_dep,
pkg_id, pkg_loc, registry, registry_strategy, resolve, resolve_and_validated,
resolve_with_config, PrettyPrintRegistry, ToDep, ToPkgId,
};

Expand Down Expand Up @@ -433,14 +433,12 @@ fn resolving_incompat_versions() {
pkg!("bar" => [dep_req("foo", "=1.0.2")]),
]);

assert!(
resolve(
&pkg_id("root"),
vec![dep_req("foo", "=1.0.1"), dep("bar")],
&reg
)
.is_err()
);
assert!(resolve(
&pkg_id("root"),
vec![dep_req("foo", "=1.0.1"), dep("bar")],
&reg
)
.is_err());
}

#[test]
Expand Down Expand Up @@ -1174,3 +1172,32 @@ fn hard_equality() {
&names(&[("root", "1.0.0"), ("foo", "1.0.0"), ("bar", "1.0.0")]),
);
}

#[test]
fn large_conflict_cache() {
let mut input = vec![
pkg!(("last", "0.0.0") => [dep("bad")]), // just to make sure last is less constrained
];
let mut root_deps = vec![dep("last")];
const NUM_VERSIONS: u8 = 3;
for name in 0..=NUM_VERSIONS {
// a large number of conflicts can easily be generated by a sys crate.
let sys_name = format!("{}-sys", (b'a' + name) as char);
let in_len = input.len();
input.push(pkg!(("last", format!("{}.0.0", in_len)) => [dep_req(&sys_name, "=0.0.0")]));
root_deps.push(dep_req(&sys_name, ">= 0.0.1"));

// a large number of conflicts can also easily be generated by a major release version.
let plane_name = format!("{}", (b'a' + name) as char);
let in_len = input.len();
input.push(pkg!(("last", format!("{}.0.0", in_len)) => [dep_req(&plane_name, "=1.0.0")]));
root_deps.push(dep_req(&plane_name, ">= 1.0.1"));

for i in 0..=NUM_VERSIONS {
input.push(pkg!((&sys_name, format!("{}.0.0", i))));
input.push(pkg!((&plane_name, format!("1.0.{}", i))));
}
}
let reg = registry(input);
let _ = resolve(&pkg_id("root"), root_deps, &reg);
}

0 comments on commit 1292a6d

Please sign in to comment.