Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add random_ordered_tree and forest_str #4294

Merged

Conversation

Erotemic
Copy link
Contributor

@Erotemic Erotemic commented Oct 28, 2020

The purpose of this PR is to simplify the review of #4327. It separates the new self-contained developed for that branch random_ordered_tree which like it sounds make a random ordered tree, and forest_str, which makes a text-based representation of a directed or undirected forest.

After this is finalized and accepted, I'll rebase #4327 on top of master/main. Reviewing these two features should be much easier and make tackling #4327 more feasible.

This PR depends on #4326 has no dependencies.

This PR is depended on by #4327 and #4350

@Erotemic
Copy link
Contributor Author

Erotemic commented Nov 8, 2020

I made a few changes.

First a bugfix: I noticed that forest_str was printing ordered trees in the incorrect order. I fixed this by pushing children onto the stack in reverse order so they are popped in the correct order.

Second, in the case where forest_str is given an undirected graph, there needs to be logic to choose which nodes to use as "source" nodes. I was hesitant to use nx.connected_components because I wanted the function to be able to work an in "eager-evaluation" mode, which is stiffed if there is a linear complexity preprocessing step. However, I realized that determining the sources for the directed case and adding all sources to the stack is also linear complexity, so it really doesn't matter.

Lastly, I changed the name of use_labels to with_labels to better agree with nx.draw.

@Erotemic Erotemic force-pushed the dev/random_otree_and_forest_str branch 2 times, most recently from 915bf21 to 384b416 Compare November 8, 2020 18:18
Copy link
Member

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me.

networkx/readwrite/tests/test_text.py Show resolved Hide resolved
networkx/readwrite/tests/test_text.py Outdated Show resolved Hide resolved
@Erotemic Erotemic force-pushed the dev/random_otree_and_forest_str branch from 3057a64 to bbdf8fe Compare November 14, 2020 23:40
@Erotemic
Copy link
Contributor Author

I was working with this module in other PRs, and there were a few changes that might make using this module easier.

In networkx.generators I put random_ordered_tree in random_graphs, but perhaps it would be better in the generators.trees module.

I also thought maybe it would be easier if the nx.random_tree function simply took a create_using argument (similar to other places in nx). I made a commit that adds that to nx.random_tree.

Should random_ordered_tree even exist if nx.random_tree now has a create_using kwarg? I left it in for now, but its easy enough to remove.

I did make one slight functionality change. Previously if you requested a directed graph, it would use dfs starting from an random node as the source. Now, I'm always using the node 0 as a source. It makes the function simpler, and the user can always relabel if they want random labels too.

@Erotemic Erotemic force-pushed the dev/random_otree_and_forest_str branch from 39fd7d5 to 0649ebd Compare November 15, 2020 19:08
@Erotemic
Copy link
Contributor Author

After making changes to nx.random_tree to include the create_using argument, it looks like the doc build step fails because I added a doctest that used utf8 chars. This means that this PR now depends on PR #4326 where I add xetex support to docs.

This should be merged only after #4326 is merged.

@Erotemic Erotemic force-pushed the dev/random_otree_and_forest_str branch from 0649ebd to e518a7d Compare December 3, 2020 19:14
@Erotemic
Copy link
Contributor Author

Erotemic commented Dec 3, 2020

#4326 was merged. I rebased on master, so this PR now has no dependencies.

Copy link
Contributor

@rossbar rossbar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a few big-picture comments from a quick review here. The PR has some really nice features in it. I had a couple questions/comments about API and platform support. LMK what you think!

networkx/generators/random_graphs.py Outdated Show resolved Hide resolved
networkx/generators/trees.py Outdated Show resolved Hide resolved
networkx/readwrite/text.py Outdated Show resolved Hide resolved

rng = create_py_random_state(seed)
# Create a random undirected tree
create_using = nx.OrderedDiGraph if directed else nx.OrderedGraph
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You almost certainly don't need an OrderedGraph class here. That would be needed for Python version pre-3.6. But with Python3.6+ the regular classes are ordered because the underlying dict is ordered.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is likely being removed as it is redundant with the new create_using arg in random_tree.

While dictionaries did becomes ordered by default in 3.6, that guarantee is still not part of the Python language spec AFAIK. I would much prefer to be explicit about the orderedness of graphs. Furthermore, there are mathematical implications of having an ordered graph versus one that has an arbitrary order. The algorithms in the PRs that build on this actually depend on the graph having a defined order for correctness and runtime analysis. I know there are randomized algorithms that depend on random orderings; I'm not sure if any rely on arbitrary orderings.

I'm stressing this to make sure the maintainers know there is value in explicitly using and keeping Ordered Graph/DiGraph/etc... variants in the API.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for that comment.
Also, CPython 3.7+ guarantees the order of dicts to be the order keys are added. I guess if you try to use Pypy that is not guaranteed, but they are still on version2.7 and our current code probably is starting to break on that version.

Copy link
Contributor Author

@Erotemic Erotemic Dec 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW pypy has a 2.7 and a 3.6 version. I imagine order will be added to the Python dict spec in the future. (I now see you are saying that it was added in 3.7 into the language spec).

Regardless, my main concern is that networkx maintains the ability to distinguish between a graph that is ordered and a graph that has arbitrary order for algorithmic purposes (even if that's not the implementation under the hood).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are only testing on PyPy 3.6 now. I am waiting on this PR actions/setup-python#168 to be merged and I will switch us to PyPy 3.7. They have been working on getting the PyPy 3.7 support regularly for the last few weeks. So I expect this to happen soon.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have only used OrderedGraph for testing in the past. I intended to propose deprecating it once we drop Python 3.6, which will happen very soon.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be careful about claiming alignment between OrderedGraph and a (mathematical) ordered graph. I don't think it is accurate to use these classes to distinguish between "a graph that is ordered and a graph that has arbitrary order for algorithmic purposes". Certainly it is important to distinguish them mathematically. But both OrderedGraph and Graph have the same deterministic node and edge reporting order based on the order that nodes and edges were added to the object.

We should deprecate OrderedGraph, though it might be useful to show an example of subclassing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main concern is that the algorithm I implemented in #4327 explicitly requires an OrderedDiGraph, and I do check for that. But this is just an input type check, and I suppose given that a DiGraph is ordered under the hood, the algorithm will still work. I'm worried that without this check a user might think that they are getting the maximum embedding of unordered graphs (which is an APX-hard problem, and the algo I wrote does not find that). I recognize that functionally they are equivalent. I just believe that the semantics are important for implementation elegance.

That's my main argument for keeping ordered graph variants, and if others don't think its strong enough to justify its existence, then I'll accept that.

Regardless, this function and all references to the OrderedDiGraph and OrderedGraph class no longer exist in this PR. So perhaps we can move forward with merging this and save the deprecation discussion for another issue.

Copy link
Member

@jarrodmillman jarrodmillman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than removing the imports from the doctest, this LGTM. Thanks!

networkx/generators/trees.py Outdated Show resolved Hide resolved
networkx/generators/trees.py Outdated Show resolved Hide resolved
networkx/readwrite/text.py Outdated Show resolved Hide resolved
@jarrodmillman jarrodmillman merged commit dc834d3 into networkx:master Dec 18, 2020
@jarrodmillman jarrodmillman added this to the networkx-2.6 milestone Dec 18, 2020
@rossbar rossbar mentioned this pull request May 26, 2021
MridulS pushed a commit to MridulS/networkx that referenced this pull request Feb 4, 2023
* Add random_ordered_tree and forest_str

* Update networkx/generators/trees.py

I think this makes sense.

Co-authored-by: Dan Schult <dschult@colgate.edu>

* Remove random_ordered_tree

* Fix bug in random_tree when n=1 and create_using is specified

* Add ascii_only mode to forest_str

* fix test

* wip

* Fix comments

* Typo in ascii glyph

* Remove reference to OrderedDiGraph

* Removed imports in doctests

Co-authored-by: Dan Schult <dschult@colgate.edu>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

4 participants