Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move edge triangle count to the stable API #4382

Conversation

jnke2016
Copy link
Contributor

@jnke2016 jnke2016 commented Apr 30, 2024

This PR

  1. Performs edge triangle count in chunk
  2. Enables k - 1 core optimization
  3. Add C++ tests for edge triangle count
  4. Move edge triangle count to the stable API
  5. Implement MG edge triangle count and add tests
  6. Update 'mg_graph_to_sg_graph' to support 'edge_ids' along with tests

closes #4370
closes #4371

@jnke2016 jnke2016 requested a review from a team as a code owner April 30, 2024 15:41
@alexbarghi-nv alexbarghi-nv added improvement Improvement / enhancement to an existing function breaking Breaking change and removed cuGraph labels May 1, 2024
@jnke2016 jnke2016 requested a review from a team as a code owner May 4, 2024 11:06
Copy link
Contributor

@seunghwak seunghwak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't reviewed the actual implementation, just a quick first review.

cpp/include/cugraph/algorithms.hpp Outdated Show resolved Hide resolved
#include <vector>

struct EdgeTriangleCount_Usecase {
bool test_weighted_{false};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to include edge_masking.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And we need mg_edge_triangle_count_test as well (is this code MNMG ready?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to include edge_masking.

I included edge mask tests

is this code MNMG ready?

Yes it is

Copy link
Contributor

@seunghwak seunghwak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the edge triangle count implementation (Haven't reviewed the K-Truss implementation updates yet).

cpp/src/community/edge_triangle_count_impl.cuh Outdated Show resolved Hide resolved
std::array<bool, 2>{true, true},
false /*FIXME: pass 'do_expensive_check' as argument*/);
size_t approx_edges_to_intersect_per_iteration =
static_cast<size_t>(handle.get_device_properties().multiProcessorCount) * (1 << 17);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't approximate, right? (I mean except for the last chunk). This approx_ naming convention is used when we are dealing with the CSR data structure and # edges should need to be aligned with the CSR offset array boundaries.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean except for the last chunk

Ya or if it is bigger than the number of edges. But I agree it doesn't align with the intended context (CSR data structure). I renamed it to edges_to_intersect_per_iteration

std::optional<edge_property_view_t<edge_t, weight_t const*>>{std::nullopt},
std::optional<edge_property_view_t<edge_t, edge_t const*>>{std::nullopt},
std::optional<raft::device_span<vertex_t const>>(std::nullopt));

auto edge_first = thrust::make_zip_iterator(edgelist_srcs.begin(), edgelist_dsts.begin());

thrust::sort(handle.get_thrust_policy(), edge_first, edge_first + edgelist_srcs.size());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of sort here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. We no longer need this as we do not pass an edgelist to edge_triangle_count. In fact when we decompress the edges from the graph, they are already sorted.

cpp/src/community/edge_triangle_count_impl.cuh Outdated Show resolved Hide resolved
? (edgelist_srcs.size() / approx_edges_to_intersect_per_iteration)
: (edgelist_srcs.size() / approx_edges_to_intersect_per_iteration) + 1;
size_t prev_chunk_size = 0;
auto num_edges = edgelist_srcs.size();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

num_remaining_edges to be clearer?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And prev_chunk_size = edgelist_srcs.size() - num_remaining_edges, so redundant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

num_remaining_edges to be clearer?

Done

And prev_chunk_size = edgelist_srcs.size() - num_remaining_edges, so redundant

Can you be more clear here?

Copy link
Contributor

@seunghwak seunghwak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, finished the first round review.

std::nullopt,
std::nullopt,
cugraph::graph_properties_t{true, graph_view.is_multigraph()},
true);
false); // FIXME: Renumbering should not be hardcoded.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this FIXME?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't sure if renumbering should be hardcoded but I removed the FIXME.

cpp/src/community/k_truss_impl.cuh Show resolved Hide resolved
cpp/src/community/k_truss_impl.cuh Outdated Show resolved Hide resolved
std::move(std::get<0>(vertex_pair_buffer)),
std::move(std::get<1>(vertex_pair_buffer)),
std::nullopt,
// FIXME: Update 'shuffle_int_...' to support int32_t and int64_t values
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify this FIXME? You can use int32_t and int64_t values as you are currently doing this.

You are calling a function that was designed to shuffle vertex pairs along with edge weight, edge id and edge type around the GPU cluster. That's not what you have, you have vertex pairs and an increase count. If this function didn't exist, since you are in a .cu file you could replace this logic with something akin to this:

auto& comm = handle.get_comms();

If your concern is that we should have a more general purpose function for shuffling vertex pairs and arbitrary attributes then be sure to specify what you want.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use int32_t and int64_t values as you are currently doing this.

Right. This is currently how we proceed but based on Seunghwa feedbacks on decompress_edgelist, we should have something separate from edge_id to handle this in case edge_id and count don't hold the same type. I thought the same holds for the shuffling function.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I'm asking for a more descriptive FIXME. As worded it is not clear to me what you are asking to be changed at some point in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I updated the FIXME and it will be available in the next commit

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

mg_graph_to_sg_graph(
raft::handle_t const& handle,
cugraph::graph_view_t<vertex_t, edge_t, store_transposed, true> const& graph_view,
std::optional<cugraph::edge_property_view_t<edge_t, weight_t const*>> edge_weight_view,
std::optional<cugraph::edge_property_view_t<edge_t, edge_t const*>> edge_id_view,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also add edge_type here, since we're modifying all of the functions? We do currently support edge weight, edge id and edge type in most of our functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right

Copy link
Contributor

@seunghwak seunghwak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@BradReesWork
Copy link
Member

/merge

@rapids-bot rapids-bot bot merged commit 1c3f3a8 into rapidsai:branch-24.06 May 28, 2024
131 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Breaking change CMake cuGraph improvement Improvement / enhancement to an existing function
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Move edge triangle count to stable api Edge Triangle Counts in chunks
6 participants