
Distributed Sampling in cuGraph-PyG #4384

Open · wants to merge 112 commits into base: branch-24.06
Conversation

@alexbarghi-nv alexbarghi-nv commented May 1, 2024

Implements distributed sampling in cuGraph-PyG, and renames the existing API to make clear that it is Dask-based.
Adds a dependency on tensordict for cuGraph-PyG, which backs the new TensorDictFeatureStore.
Also stops installing torch-cluster and torch-spline-conv in CI, since importing them causes an ImportError and neither package is needed for testing.

Requires PyG 2.5. Should be merged after #4335

Merge after #4355

Closes #4248
Closes #4249
Closes #3383
Closes #3942
Closes #3836
Closes #4202
Closes #4051
Closes #4326
Closes #4252
Partially addresses #3805

seunghwak and others added 30 commits April 2, 2024 17:28
alexbarghi-nv (Member Author):

The content of this file was migrated to graph_sage_mg.py

alexbarghi-nv (Member Author):

CSR is the fastest compression based on benchmarking
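For context, CSR (compressed sparse row) stores a graph as an array of row offsets plus an array of column indices. A minimal, library-free sketch of building CSR from an edge list (the tiny 3-vertex graph below is a hypothetical illustration, not data from this PR):

```python
# Build CSR (row offsets + column indices) from a small edge list.
# Edges of a 3-vertex directed graph: 0->1, 0->2, 2->1
edges = [(0, 1), (0, 2), (2, 1)]
num_vertices = 3

# Count the out-degree of each source vertex.
degree = [0] * num_vertices
for src, _ in edges:
    degree[src] += 1

# Prefix-sum the degrees into row offsets.
offsets = [0]
for d in degree:
    offsets.append(offsets[-1] + d)

# Fill the column indices bucket by bucket.
cols = [0] * len(edges)
fill = offsets[:-1].copy()
for src, dst in sorted(edges):
    cols[fill[src]] = dst
    fill[src] += 1

print(offsets, cols)  # → [0, 2, 2, 3] [1, 2, 1]
```

The neighbors of vertex `v` are then `cols[offsets[v]:offsets[v + 1]]`, which is why row-wise (per-source) traversal during sampling is cheap with this layout.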

alexbarghi-nv (Member Author):

The MNMG example will be added along with the new WG feature store, and will use the new WG feature store to avoid replicating data across processes.

@alexbarghi-nv alexbarghi-nv marked this pull request as ready for review May 17, 2024 20:31
@alexbarghi-nv alexbarghi-nv requested review from a team as code owners May 17, 2024 20:31
@tingyu66 (Member) left a comment:

Approved with some comments and questions. The overall logic looks reasonable, and I trust you to match the detailed interface of PyG's native GraphStore and FeatureStore.

Review thread on python/cugraph-pyg/cugraph_pyg/examples/gcn_dist_snmg.py (outdated, resolved):
    # Take the max local count across all ranks to get the global count.
    sz = torch.tensor(num_vertices[vtype], device="cuda")
    torch.distributed.all_reduce(sz, op=torch.distributed.ReduceOp.MAX)
    num_vertices[vtype] = int(sz)
return num_vertices
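The snippet above uses `torch.distributed.all_reduce` with `ReduceOp.MAX`. As a library-free illustration of the reduction it performs (the helper name `reduce_num_vertices` and the sample counts below are hypothetical, not part of this PR): each rank may see only a subset of a vertex type, so the global count per type is the maximum of the local counts across ranks.

```python
# Hypothetical stand-in for the MAX all-reduce: given each rank's local
# per-type vertex counts, the global count is the per-type maximum.
def reduce_num_vertices(local_counts_per_rank):
    vtypes = set().union(*(counts.keys() for counts in local_counts_per_rank))
    return {
        vt: max(counts.get(vt, 0) for counts in local_counts_per_rank)
        for vt in vtypes
    }

# Two ranks that each saw a different slice of the graph:
global_counts = reduce_num_vertices([{"paper": 10}, {"paper": 12, "author": 3}])
```

In the real code every rank runs the all-reduce collectively and ends up holding the same `num_vertices`; this sketch only shows the math, not the communication.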
Member:

Should we cache num_vertices and num_edges since they require collective communications?

alexbarghi-nv (Member Author):

We cache the offsets, which are used more frequently. I might add caching for the vertex and edge counts if they turn out to be used often, so maybe in the future.
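The caching pattern discussed here (compute once via a collective, then reuse the result) can be sketched without torch. The class and names below are hypothetical illustrations, not cuGraph-PyG API:

```python
# Hedged sketch: cache the result of an expensive "collective" call so
# repeated property reads do not re-trigger communication.
class GraphStoreSketch:
    def __init__(self, fetch_counts):
        # fetch_counts stands in for the collective that computes counts.
        self._fetch_counts = fetch_counts
        self._num_vertices = None  # populated on first access

    @property
    def num_vertices(self):
        if self._num_vertices is None:
            self._num_vertices = self._fetch_counts()
        return self._num_vertices

calls = []
def expensive_collective():
    calls.append(1)  # record that "communication" happened
    return {"paper": 12}

store = GraphStoreSketch(expensive_collective)
store.num_vertices
store.num_vertices  # second read hits the cache; no extra call
```

One caveat with caching collective results: every rank must take the same code path, since a rank that skips the collective while others enter it would deadlock. Caching after the first collective call avoids that as long as the first access is itself collective.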

@BradReesWork (Member):
/merge

@alexbarghi-nv (Member Author):
CI failed with a bus error again, I think due to OOM. I just pushed a commit that skips that test.
