
Document how to run the benchmarks #8

Closed
PayasR opened this issue Aug 14, 2019 · 6 comments

@PayasR commented Aug 14, 2019

Hello!
Could you please share how you ran the benchmarks listed in the README?
The only tests I can see are in lib.rs, and running cargo test after removing the #[ignore] attributes still compiles the code in unoptimized mode. Grepping for #[bench] also turns up nothing.

@easbar (Owner) commented Aug 14, 2019

You can compile with optimizations for tests as well. I just checked my notes and found this:

export RUST_TEST_THREADS=1; cargo test --release -- --ignored --nocapture

I think I read about #[bench] but then thought it is not really helpful (unfortunately I do not remember why).
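For reference, a built-in #[bench] function needs the nightly-only test crate and looks roughly like this (the workload here is just a placeholder; a real benchmark would run a shortest-path query instead):

#![feature(test)] // nightly-only
extern crate test;

use test::Bencher;

#[bench]
fn bench_query(b: &mut Bencher) {
    // Placeholder workload; in the real crate this would be a calc_path call.
    b.iter(|| (0..1_000u64).sum::<u64>());
}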

@PayasR (Author) commented Aug 14, 2019

Okay, that works. There are a few known issues that should be kept in mind in the future, if we ever hit them.
And since we are here, what is the point of 'expected_checksum' and 'expected_num_not_found' as parameters to run_performance_test in lib.rs? There are a few hardcoded values in the code, but how can I compute them for new graphs?

@PayasR (Author) commented Aug 14, 2019

Also, the standard way of benchmarking in Rust seems to be using the Criterion crate. Moving the benchmarks in lib.rs over to it could be a good first issue.

Also, selecting 1000 nodes at random isn't usually sufficient, since most queries are local in real-world settings. 'Dijkstra ranks' are typically used to measure the performance of algorithms and their implementations (see paragraph 2, page 24 and figure 8 here), so that is something else to do 😅
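Roughly, the idea is: pick a random source, run plain Dijkstra from it, and use the node settled at position 2^k as the target with Dijkstra rank 2^k. A rough sketch, assuming a simple adjacency-list graph rather than the crate's own graph type:

use std::cmp::Reverse;
use std::collections::BinaryHeap;

// graph[node] = list of (neighbor, weight) edges (hypothetical representation).
// Returns (k, target) pairs where target has Dijkstra rank 2^k from source.
fn targets_by_dijkstra_rank(graph: &[Vec<(usize, u64)>], source: usize) -> Vec<(u32, usize)> {
    let mut dist = vec![u64::MAX; graph.len()];
    let mut settled = Vec::new();
    let mut heap = BinaryHeap::new();
    dist[source] = 0;
    heap.push(Reverse((0u64, source)));
    while let Some(Reverse((d, node))) = heap.pop() {
        if d > dist[node] {
            continue; // stale heap entry
        }
        settled.push(node);
        for &(nb, w) in &graph[node] {
            let nd = d + w;
            if nd < dist[nb] {
                dist[nb] = nd;
                heap.push(Reverse((nd, nb)));
            }
        }
    }
    let mut targets = Vec::new();
    let (mut rank, mut k) = (1usize, 0u32);
    while rank < settled.len() {
        targets.push((k, settled[rank]));
        rank *= 2;
        k += 1;
    }
    targets
}

Query times would then be reported per rank bucket instead of as a single average.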

@easbar (Owner) commented Aug 15, 2019

There are a few known issues that should be kept in mind in the future, if we ever hit them.

Ok, but we are not using the profile.release configuration, so there is no real problem here? Or are there other issues you are aware of? Good to know anyway.

what is the point of 'expected_checksum' and 'expected_num_not_found' as parameters to run_performance_test in lib.rs? There are a few hardcoded values in the code, but how can I compute them for new graphs?

These are checksums for the total weight of all paths and the number of paths that were not found. I got them simply by running the test. If you are doing performance optimization, for example, the check will tell you whether or not the calculated paths have changed (which you normally do not want).

the standard way of benchmarking in Rust seems to be using the Criterion crate

What would be the advantages of this?

Also, selecting 1000 nodes at random isn't usually sufficient, since most queries are local in real-world settings. 'Dijkstra ranks' are typically used to measure the performance of algorithms and their implementations

Well, whether or not queries are 'local' in real-world settings depends on the size of the map etc. (assuming our graph represents a road network). I found the averaged query times to be quite stable against the selection of the sources/targets, so I think this metric is still very helpful. It is often used in scientific publications as well (like fig. 7 in the paper you mentioned). For larger maps one could still restrict the area the sources/targets are taken from, for example to the size of a city. But if you need it please feel free to contribute a test setup that measures for different Dijkstra ranks.
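Something like this could work for restricting the selection, assuming per-node coordinates are available (they are not part of the current test setup):

// Hypothetical: coords[node] = (lat, lon). Returns the ids of all nodes inside a
// bounding box (e.g. one city), so sources/targets can be sampled from this subset.
fn nodes_in_bbox(
    coords: &[(f64, f64)],
    (min_lat, max_lat): (f64, f64),
    (min_lon, max_lon): (f64, f64),
) -> Vec<usize> {
    coords
        .iter()
        .enumerate()
        .filter(|&(_, &(lat, lon))| {
            lat >= min_lat && lat <= max_lat && lon >= min_lon && lon <= max_lon
        })
        .map(|(i, _)| i)
        .collect()
}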

@PayasR (Author) commented Aug 15, 2019

Ok, but we are not using the profile.release configuration, so there is no real problem here? Or are there other issues you are aware of? Good to know anyway.

The only issues I am aware of are the ones surrounding profiles, so currently, no problems.

These are checksums for the total weight of all paths and the number of paths that were not found.

But aren't the paths between random sources and destinations? If so, how is the checksum constant across different runs of the test? This is the code I'm confused about:

for _i in 0..num_queries {
    let source = rng.gen_range(0, num_nodes);
    let target = rng.gen_range(0, num_nodes);
    time.start();
    let path = calc_path(source, target);
    time.stop();
    match path {
        // --> Checksum is calculated between random sources and targets
        Some(path) => checksum += path.get_weight(),
        None => num_not_found += 1,
    }
}
println!(
    "total query time .................. {} ms",
    time.elapsed_ms()
);
println!(
    "query time on average ............. {} micros",
    time.elapsed().as_micros() / (num_queries as u128)
);
assert_eq!(expected_checksum, checksum, "invalid checksum");
assert_eq!(
    expected_num_not_found, num_not_found,
    "invalid number of paths not found"
);

What would be the advantages of this?

The biggest advantage I see is that it makes it easy to generate plots like these, so we are not limited to computing only the average time it takes to compute the paths; we can instead plot distributions of times. Plus, the move is fairly easy and involves very little change.
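A Criterion bench would look roughly like this (the sketch uses a dummy workload; a real one would load a graph and run a path query):

use criterion::{black_box, criterion_group, criterion_main, Criterion};

// Placeholder workload; in the real crate this would be a shortest-path query.
fn dummy_query(n: u64) -> u64 {
    (0..n).sum()
}

fn query_benchmark(c: &mut Criterion) {
    c.bench_function("query", |b| b.iter(|| dummy_query(black_box(1_000))));
}

criterion_group!(benches, query_benchmark);
criterion_main!(benches);

It would live under benches/, with criterion as a dev-dependency and harness = false in the [[bench]] section of Cargo.toml; Criterion then records full timing distributions and can generate plots from them.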

And no doubt the average computation time over a thousand paths is a useful metric, but the problem is that it is restricted, i.e. it is stable only for smaller graphs. For the graph of western Europe (the one used in the paper I referred to), say, it is only the first line of defense, hence the whole discussion about Dijkstra ranks in the paper. Such benchmarks are also useful to get some visibility into the run times for different distances between nodes, say if we ever decide to implement different 'versions' of CH (Customizable CH maybe?), and even if not, it would make it much easier for users to see why some metrics work and others don't for CHs. CH shouldn't be viewed as a technique restricted to road graphs.

[Meta: I don't know what scope you have in mind for the project, but IMHO feature parity with C++ libraries like RoutingKit is definitely desirable and should be targeted, to make the crate an attractive alternative for developers. Just my two cents.]

About the test framework, sure, I will start working on it as soon as my schedule permits. Here's my tentative wishlist:

  1. It should be simpler to benchmark different graphs and not just the ones included with the code.
  2. Path computation times should be reported for different Dijkstra ranks.
  3. Use a desktop-class processor to run the benchmarks.
  4. Replace the heap currently used with a d-ary heap: this and this (see the sketch after this list).
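For point 4, the main change relative to a binary heap is the index arithmetic; a minimal sketch of an implicit d-ary heap layout in a Vec (d = 4 chosen arbitrarily, not the crate's current heap):

// Children of node i live at D*i + 1 .. D*i + D, its parent at (i - 1) / D.
const D: usize = 4;

fn parent(i: usize) -> usize {
    (i - 1) / D
}

fn children(i: usize) -> std::ops::Range<usize> {
    (D * i + 1)..(D * i + 1 + D)
}

// Sift-up after inserting a value at index i (min-heap ordering on the values).
fn sift_up<T: Ord>(heap: &mut [T], mut i: usize) {
    while i > 0 && heap[i] < heap[parent(i)] {
        heap.swap(i, parent(i));
        i = parent(i);
    }
}

A larger d shortens the tree (cheaper sift-up) at the cost of more comparisons per level during sift-down, and tends to have better cache locality.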

Since this discussion is going beyond the scope of this issue, I guess it should be closed.

@PayasR closed this as completed Aug 15, 2019
@easbar (Owner) commented Aug 15, 2019

But aren't the paths between random sources and destinations? If so, how is the checksum constant across different runs of the test?

Yes, random sources and destinations, but the seed is fixed for the benchmarks:

let mut rng = create_rng_with_seed(seed);
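A fixed seed means the RNG produces the same sequence of sources/targets on every run, so the accumulated checksum is deterministic. Roughly (assuming the rand crate with its 0.8-style API; the actual create_rng_with_seed may differ):

use rand::{rngs::StdRng, Rng, SeedableRng};

fn main() {
    let mut rng = StdRng::seed_from_u64(123);
    let num_nodes = 1_000;
    // Same seed => same (source, target) pairs on every run.
    for _ in 0..3 {
        let source = rng.gen_range(0..num_nodes);
        let target = rng.gen_range(0..num_nodes);
        println!("{} -> {}", source, target);
    }
}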

The biggest advantage I see is that it makes it easy to generate plots like these, so we are not limited to computing only the average time it takes to compute the paths; we can instead plot distributions of times. Plus, the move is fairly easy and involves very little change.

Sounds good to me. If there is an easy way to get more insights this would certainly be useful.

but the problem is that it is restricted, i.e. it is stable only for smaller graphs

Yes sure, gaining insight on a more detailed level would be interesting.
