Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GraphStorage trait and Graph type #583

Merged
merged 115 commits into from
Oct 20, 2023
Merged

GraphStorage trait and Graph type #583

merged 115 commits into from
Oct 20, 2023

Conversation

indietyp
Copy link
Member

@indietyp indietyp commented Sep 3, 2023

This crate introduces my new vision for how the old traits will be replaced in 0.7.0.

This idea has been inspired by the concept outlined in #563 (thank you @pnevyk for the idea!).

This has been on my mind for quite some time, but after #561 has been merged, I finally got around to creating the PR.

The Problem

The current problem (as outlined in #551) is quite a handful and includes that there are too many traits and that the implementation of those traits varies widely between implementations. There are just too many to keep track of as a consumer and an implementor.

To mitigate this, a new set of traits has been introduced.

The solution

Instead of having specialized traits for specific operations, we try to combine them.

The idea is simple: The amount of traits a graph needs to be useful (and to fulfill being a graph) is plentiful and there is a large set of overlap. Having too many traits is both restrictive to the implementor (which then choose to implement algorithms on specific implementations) and confusing for the consumer (which on is the right one). It also means a lot of work and things to get right for the end consumer.

The solution instead is a combination of two things: remove traits for things that were traits in favor of concrete types as well as consolidation.

Now a storage backend needs to implement the following traits:

  • GraphStorage (mandatory)
  • DirectedGraphStorage (optional)
  • ResizableGraphStorage (optional)
  • RetainableGraphStorage (optional)
  • GraphStorageAdjacencyMatrix (optional)
  • DirectedGraphStorageAdjacencyMatrix (optional)

These types additionally offer default implementations for as much as possible to make implementing them a lot easier (although speed might not be top-notch)

I am looking to consolidate/remove the *AdjacencyMatrix traits.

The following types replace traits:

  • Node
  • NodeMut
  • DetachedNode
  • Edge
  • EdgeMut
  • DetachedEdge
  • AdjacencyMatrix

The IndexType has been removed in favor of the traits:

  • GraphId
  • ManagedGraphId
  • ArbitraryGraphId

A core goal of this unification was to elevate GraphMap (and similar storage solutions in the future) from a second-class citizen to a fully-fledged part of the library.

To be able to accommodate this (and future changes), insert_node now takes Attributes, which for ManagedGraphId has the signature Attributes::new(weight: S::NodeWeight) and for ArbitraryGraphId has the signature Attributes::new(id: impl Into<S::NodeId>, weight: S::NodeWeight).

Due to the generic nature of these IDs, they will need to be wrapped in some type (I think this a feature), so the graph map will have to have something like Key<T>, but the implementation of Attributes::new allows for easy interoperability.

The storage is then wrapped in a new type: Graph<S>, which exposes all methods and unifies all traits; this allows for easier usage and extensions in the future.

Algorithms now simply need to specify: bellman_ford<S>(graph: &Graph<S>, start: &S::NodeId) where S: GraphStorage, S::EdgeWeight: FloatMeasure, instead of pub fn bellman_ford<G>(g: G, source: G::NodeId) where G: NodeCount + IntoNodeIdentifiers + IntoEdges + NodeIndexable, G::EdgeWeight: FloatMeasure, making it more apparent to the end user (and implementor) what the actual intention is.

I have chosen against splitting node and edge storage as both are inherently linked in graphs.

This change also has the benefit that we don't need to test an algorithm for all possible implementations; they just need to test their GraphStorage implementation instead.

The implementation

To implement the traits and check for feasibility, this PR also adds (and tests) a new (private) crate, petgraph-dino. I will be doing some benchmarks in a later PR (also to compare different implementations) and am hoping that with some tweaks, petgraph-dino can be one of the possible implementations, if not the default one (The current concern is with memory usage, although I have not yet verified this through benchmarks!).

The idea is for core and other crates to use petgraph-dino as the test implementation; this way one will not need to pull in a specific graph crate for testing or the whole petgraph crate.

Questions

I will be working on this a bit before marking it ready, but if anything is unclear, please let me know, and I'll try to answer them.

@indietyp indietyp marked this pull request as ready for review October 17, 2023 12:44
@indietyp
Copy link
Member Author

Finally, this is done!

You may ask yourself: why did this take so long?

The answer is simple, although it might not be self-evident:

  1. Experimentation. Finding the proper API takes time, which means experimenting (even if it never comes into a commit).
  2. Though and care: every API decision has been carefully studied and considered. How exactly does this design impact the user? How elegant is this solution? Do you think this is obvious? What are the limitations of the design? Is the name clear? This took quite some time (and nerves), and I am thankful to @thehabbos007 for lending me an ear in these times and coming up with clear new perspectives.
  3. Documentation. Every API surface has been documented, like everything; it took some time, but it is done. petgraph-core now has over 155 examples, pretty much for every method, which should (hopefully) make things more straightforward to beginners.

Overall, I am happy with the result so far, and with some experimentation in another branch, I confirmed that it makes writing things like algorithms quite a bit easier.

A side product of this PR is the new petgraph-dino crate, first, simply an implementation of GraphStorage for testing purposes it has elevated itself (in my mind) to become the default implementation, hopefully.

There are also two new crates, one of which is unlikely to survive, petgraph-utils for utilities, such as a macro to create graphs conveniently, and petgraph-test-utils a crate that will be swallowed into petgraph-utils in the coming PRs.

The idea for petgraph-utils is to also provide a test harness through a macro (which is to be pioneered in petgraph-dino) to be able to automatically test large chunks of graph implementations with the same code.

I hope that PRs will come more often (and are less massive) after this change.

Like with any other PR, I will have this open until the end of the week (Friday or Sunday) and will wait for any suggestions or reviews; if none of those will happen, I will merge it by then to keep the train moving.

Be sure to check out the branch and create the documentation for petgraph-core or petgraph-dino; it is something I am pretty proud of.

@indietyp indietyp merged commit 1240c9c into next Oct 20, 2023
1 of 5 checks passed
@indietyp indietyp deleted the bm/graph-storage branch October 20, 2023 13:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant