New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add random_ordered_tree and forest_str #4294
Add random_ordered_tree and forest_str #4294
Conversation
63ca8f3
to
16ac464
Compare
I made a few changes. First a bugfix: I noticed that Second, in the case where Lastly, I changed the name of |
915bf21
to
384b416
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me.
3057a64
to
bbdf8fe
Compare
I was working with this module in other PRs, and there were a few changes that might make using this module easier. In networkx.generators I put random_ordered_tree in random_graphs, but perhaps it would be better in the generators.trees module. I also thought maybe it would be easier if the Should I did make one slight functionality change. Previously if you requested a directed graph, it would use dfs starting from an random node as the source. Now, I'm always using the node 0 as a source. It makes the function simpler, and the user can always relabel if they want random labels too. |
39fd7d5
to
0649ebd
Compare
0649ebd
to
e518a7d
Compare
#4326 was merged. I rebased on master, so this PR now has no dependencies. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a few big-picture comments from a quick review here. The PR has some really nice features in it. I had a couple questions/comments about API and platform support. LMK what you think!
networkx/generators/random_graphs.py
Outdated
|
||
rng = create_py_random_state(seed) | ||
# Create a random undirected tree | ||
create_using = nx.OrderedDiGraph if directed else nx.OrderedGraph |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You almost certainly don't need an OrderedGraph
class here. That would be needed for Python version pre-3.6. But with Python3.6+ the regular classes are ordered because the underlying dict is ordered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is likely being removed as it is redundant with the new create_using
arg in random_tree
.
While dictionaries did becomes ordered by default in 3.6, that guarantee is still not part of the Python language spec AFAIK. I would much prefer to be explicit about the orderedness of graphs. Furthermore, there are mathematical implications of having an ordered graph versus one that has an arbitrary order. The algorithms in the PRs that build on this actually depend on the graph having a defined order for correctness and runtime analysis. I know there are randomized algorithms that depend on random orderings; I'm not sure if any rely on arbitrary orderings.
I'm stressing this to make sure the maintainers know there is value in explicitly using and keeping Ordered
Graph/DiGraph/etc...
variants in the API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for that comment.
Also, CPython 3.7+ guarantees the order of dicts to be the order keys are added. I guess if you try to use Pypy that is not guaranteed, but they are still on version2.7 and our current code probably is starting to break on that version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW pypy has a 2.7 and a 3.6 version. I imagine order will be added to the Python dict spec in the future. (I now see you are saying that it was added in 3.7 into the language spec).
Regardless, my main concern is that networkx maintains the ability to distinguish between a graph that is ordered and a graph that has arbitrary order for algorithmic purposes (even if that's not the implementation under the hood).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are only testing on PyPy 3.6 now. I am waiting on this PR actions/setup-python#168 to be merged and I will switch us to PyPy 3.7. They have been working on getting the PyPy 3.7 support regularly for the last few weeks. So I expect this to happen soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have only used OrderedGraph
for testing in the past. I intended to propose deprecating it once we drop Python 3.6, which will happen very soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Be careful about claiming alignment between OrderedGraph and a (mathematical) ordered graph. I don't think it is accurate to use these classes to distinguish between "a graph that is ordered and a graph that has arbitrary order for algorithmic purposes". Certainly it is important to distinguish them mathematically. But both OrderedGraph and Graph have the same deterministic node and edge reporting order based on the order that nodes and edges were added to the object.
We should deprecate OrderedGraph, though it might be useful to show an example of subclassing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My main concern is that the algorithm I implemented in #4327 explicitly requires an OrderedDiGraph, and I do check for that. But this is just an input type check, and I suppose given that a DiGraph is ordered under the hood, the algorithm will still work. I'm worried that without this check a user might think that they are getting the maximum embedding of unordered graphs (which is an APX-hard problem, and the algo I wrote does not find that). I recognize that functionally they are equivalent. I just believe that the semantics are important for implementation elegance.
That's my main argument for keeping ordered graph variants, and if others don't think its strong enough to justify its existence, then I'll accept that.
Regardless, this function and all references to the OrderedDiGraph and OrderedGraph class no longer exist in this PR. So perhaps we can move forward with merging this and save the deprecation discussion for another issue.
I think this makes sense. Co-authored-by: Dan Schult <dschult@colgate.edu>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than removing the imports from the doctest, this LGTM. Thanks!
* Add random_ordered_tree and forest_str * Update networkx/generators/trees.py I think this makes sense. Co-authored-by: Dan Schult <dschult@colgate.edu> * Remove random_ordered_tree * Fix bug in random_tree when n=1 and create_using is specified * Add ascii_only mode to forest_str * fix test * wip * Fix comments * Typo in ascii glyph * Remove reference to OrderedDiGraph * Removed imports in doctests Co-authored-by: Dan Schult <dschult@colgate.edu>
The purpose of this PR is to simplify the review of #4327. It separates the new self-contained developed for that branch
random_ordered_tree
which like it sounds make a random ordered tree, andforest_str
, which makes a text-based representation of a directed or undirected forest.After this is finalized and accepted, I'll rebase #4327 on top of master/main. Reviewing these two features should be much easier and make tackling #4327 more feasible.
This PR
depends on #4326has no dependencies.This PR is depended on by #4327 and #4350