Add random_ordered_tree and forest_str #4294

Merged
45 changes: 45 additions & 0 deletions networkx/generators/random_graphs.py
@@ -32,6 +32,7 @@
"random_powerlaw_tree",
"random_powerlaw_tree_sequence",
"random_kernel_graph",
"random_ordered_tree",
]


@@ -1293,3 +1294,47 @@ def my_function(b):
j = int(math.ceil(n * kernel_root(i / n, j / n, r)))
graph.add_edge(i - 1, j - 1)
return graph


@py_random_state(2)
def random_ordered_tree(n, seed=None, directed=False):
"""
Creates a random ordered tree

Parameters
----------
n : int
A positive integer representing the number of nodes in the tree.

seed : integer, random_state, or None (default)
Indicator of random number generation state.
See :ref:`Randomness<randomness>`.

directed : bool, optional (default=False)
If True, the edges are directed (one-way).

Returns
-------
networkx.OrderedDiGraph | networkx.OrderedGraph

Example
-------
>>> import networkx as nx
>>> assert len(nx.random_ordered_tree(n=1, seed=0).nodes) == 1
>>> assert len(nx.random_ordered_tree(n=2, seed=0).nodes) == 2
>>> assert len(nx.random_ordered_tree(n=3, seed=0).nodes) == 3
>>> otree = nx.random_ordered_tree(n=5, seed=3, directed=True)
>>> print(nx.forest_str(otree))
╙── 0
    └─╼ 1
        └─╼ 4
            ├─╼ 2
            └─╼ 3
"""
from networkx.utils import create_py_random_state

rng = create_py_random_state(seed)
# Create a random undirected tree
create_using = nx.OrderedDiGraph if directed else nx.OrderedGraph
Member

You almost certainly don't need an OrderedGraph class here. That would only be needed for Python versions before 3.6. With Python 3.6+, the regular classes are ordered because the underlying dict is ordered.

Contributor Author

This function is likely being removed as it is redundant with the new create_using arg in random_tree.

While dictionaries did become ordered by default in 3.6, that guarantee is still not part of the Python language spec AFAIK. I would much prefer to be explicit about the orderedness of graphs. Furthermore, there are mathematical implications of having an ordered graph versus one that has an arbitrary order. The algorithms in the PRs that build on this actually depend on the graph having a defined order for correctness and runtime analysis. I know there are randomized algorithms that depend on random orderings; I'm not sure if any rely on arbitrary orderings.

I'm stressing this to make sure the maintainers know there is value in explicitly using and keeping the Ordered Graph/DiGraph/etc. variants in the API.

Member

Thanks for that comment.
Also, CPython 3.7+ guarantees that dicts preserve the order in which keys are added. I guess that is not guaranteed if you use PyPy, but they are still on version 2.7, and our current code is probably starting to break on that version.
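
The ordering behavior under discussion can be checked directly with built-in dicts. This is a minimal sketch: the nested `adj` dict and the `add_edge` helper are illustrative stand-ins for a dict-of-dicts adjacency structure, not NetworkX's actual internals.

```python
# dict preserves insertion order (a language guarantee since Python 3.7).
adj = {}  # hypothetical dict-of-dicts adjacency, for illustration only


def add_edge(adj, u, v):
    """Record an undirected edge, creating node entries on first use."""
    adj.setdefault(u, {})[v] = True
    adj.setdefault(v, {})[u] = True


for u, v in [(3, 1), (1, 2), (0, 3)]:
    add_edge(adj, u, v)

node_order = list(adj)         # nodes in the order they were first seen
neighbors_of_3 = list(adj[3])  # neighbors in the order edges were added
print(node_order, neighbors_of_3)  # [3, 1, 2, 0] [1, 0]
```

Nothing in the sketch sorts or reorders anything; the reporting order falls out of the order of insertion alone.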

Contributor Author (@Erotemic, Dec 14, 2020)

FWIW, PyPy has both a 2.7 and a 3.6 version. I imagine order will be added to the Python dict spec in the future. (I now see you are saying that it was added to the language spec in 3.7.)

Regardless, my main concern is that networkx maintains the ability to distinguish between a graph that is ordered and a graph that has arbitrary order for algorithmic purposes (even if that's not the implementation under the hood).

Member

We are only testing on PyPy 3.6 now. I am waiting for actions/setup-python#168 to be merged, and then I will switch us to PyPy 3.7. They have been working steadily on PyPy 3.7 support for the last few weeks, so I expect this to happen soon.

Member

We have only used OrderedGraph for testing in the past. I intended to propose deprecating it once we drop Python 3.6, which will happen very soon.

Member

Be careful about claiming alignment between OrderedGraph and a (mathematical) ordered graph. I don't think it is accurate to use these classes to distinguish between "a graph that is ordered and a graph that has arbitrary order for algorithmic purposes". Certainly it is important to distinguish them mathematically. But both OrderedGraph and Graph have the same deterministic node and edge reporting order based on the order that nodes and edges were added to the object.

We should deprecate OrderedGraph, though it might be useful to show an example of subclassing.

Contributor Author

My main concern is that the algorithm I implemented in #4327 explicitly requires an OrderedDiGraph, and I do check for that. But this is just an input type check, and I suppose given that a DiGraph is ordered under the hood, the algorithm will still work. I'm worried that without this check a user might think that they are getting the maximum embedding of unordered graphs (which is an APX-hard problem, and the algo I wrote does not find that). I recognize that functionally they are equivalent. I just believe that the semantics are important for implementation elegance.

That's my main argument for keeping the ordered graph variants, and if others don't think it's strong enough to justify their existence, then I'll accept that.

Regardless, this function and all references to the OrderedDiGraph and OrderedGraph class no longer exist in this PR. So perhaps we can move forward with merging this and save the deprecation discussion for another issue.
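
The type-check argument in this comment can be illustrated with a toy example. In this sketch, `Graph`, `OrderedGraph`, and `requires_ordered` are hypothetical stand-ins (not the NetworkX classes, and not the code from #4327); it only shows how a marker subclass can carry intent even when its behavior is identical to the base class.

```python
class Graph:
    """Toy graph; node order follows the insertion order of the dict."""

    def __init__(self):
        self._adj = {}

    def add_edge(self, u, v):
        self._adj.setdefault(u, {})[v] = None
        self._adj.setdefault(v, {})[u] = None

    def nodes(self):
        return list(self._adj)


class OrderedGraph(Graph):
    """Behaves exactly like Graph; exists only to signal that order matters."""


def requires_ordered(graph):
    """Hypothetical algorithm entry point that insists on the marker type."""
    if not isinstance(graph, OrderedGraph):
        raise TypeError("expected an OrderedGraph: node order is part of the input")
    return graph.nodes()
```

Both classes report nodes in the same order, so the check adds no functionality. Its only effect is to make callers state that they mean an ordered graph, which is the semantic point being argued.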

otree = nx.random_tree(n, seed=rng, create_using=create_using)
return otree
66 changes: 62 additions & 4 deletions networkx/generators/trees.py
@@ -153,7 +153,7 @@ def _helper(paths, root, B):
# > method of generating uniformly distributed random labelled trees.
#
@py_random_state(1)
- def random_tree(n, seed=None):
+ def random_tree(n, seed=None, create_using=None):
"""Returns a uniformly random tree on `n` nodes.

Parameters
@@ -184,11 +184,69 @@ def random_tree(n, seed=None):
*n* nodes, the tree is chosen uniformly at random from the set of
all trees on *n* nodes.

Example
-------
>>> import networkx as nx
>>> tree = nx.random_tree(n=10, seed=0)
>>> print(nx.forest_str(tree, sources=[0]))
╙── 0
    ├── 3
    └── 4
        ├── 6
        │   ├── 1
        │   ├── 2
        │   └── 7
        │       └── 8
        │           └── 5
        └── 9

>>> import networkx as nx
>>> tree = nx.random_tree(n=10, seed=0, create_using=nx.OrderedDiGraph)
>>> print(nx.forest_str(tree))
╙── 0
    ├─╼ 3
    └─╼ 4
        ├─╼ 6
        │   ├─╼ 1
        │   ├─╼ 2
        │   └─╼ 7
        │       └─╼ 8
        │           └─╼ 5
        └─╼ 9
"""
if n == 0:
raise nx.NetworkXPointlessConcept("the null graph is not a tree")
# Cannot create a Prüfer sequence unless `n` is at least two.
if n == 1:
- return nx.empty_graph(1)
- sequence = [seed.choice(range(n)) for i in range(n - 2)]
- return nx.from_prufer_sequence(sequence)
+ utree = nx.empty_graph(1)
+ else:
+ sequence = [seed.choice(range(n)) for i in range(n - 2)]
+ utree = nx.from_prufer_sequence(sequence)

if create_using is None:
tree = utree
else:
# TODO: maybe a tree classmethod like
# Graph.new, Graph.fresh, or something like that
def new(cls_or_self):
if hasattr(cls_or_self, "_adj"):
# create_using is a NetworkX style Graph
cls_or_self.clear()
self = cls_or_self
else:
# try create_using as constructor
self = cls_or_self()
return self

tree = new(create_using)
if tree.is_directed():
# Use an arbitrary root node and DFS to define edge directions
edges = nx.dfs_edges(utree, source=0)
else:
edges = utree.edges

# Populate the specified graph type
tree.add_nodes_from(utree.nodes)
tree.add_edges_from(edges)

return tree
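
The two steps in this hunk (decode a random Prüfer sequence into an undirected tree, then orient the edges away from node 0) can be sketched without NetworkX. The names `tree_edges_from_prufer`, `orient_from_root`, and `random_tree_edges` are illustrative and are not NetworkX functions. The random draws also differ from `nx.random_tree`, so the trees produced will not match the docstring examples above.

```python
import heapq
import random


def tree_edges_from_prufer(sequence):
    """Decode a Prüfer sequence over nodes 0..n-1 into n-1 undirected edges."""
    n = len(sequence) + 2
    degree = [1] * n
    for x in sequence:
        degree[x] += 1
    # Leaves have degree 1; always attach the smallest available leaf.
    leaves = [v for v in range(n) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for x in sequence:
        leaf = heapq.heappop(leaves)
        edges.append((leaf, x))
        degree[x] -= 1
        if degree[x] == 1:
            heapq.heappush(leaves, x)
    # Exactly two leaves remain; they form the final edge.
    u = heapq.heappop(leaves)
    v = heapq.heappop(leaves)
    edges.append((u, v))
    return edges


def orient_from_root(edges, root=0):
    """Direct every edge away from `root` with a stack-based traversal.

    The edge order may differ from a recursive DFS; only the orientation
    (parent -> child) is the point here.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    directed, seen, stack = [], {root}, [root]
    while stack:
        parent = stack.pop()
        for child in adj.get(parent, []):
            if child not in seen:
                seen.add(child)
                directed.append((parent, child))
                stack.append(child)
    return directed


def random_tree_edges(n, seed=None):
    """Sample a labelled tree on n nodes via a random Prüfer sequence."""
    if n < 1:
        raise ValueError("the null graph is not a tree")
    if n == 1:
        return []  # a single node has no edges and no Prüfer sequence
    rng = random.Random(seed)
    sequence = [rng.randrange(n) for _ in range(n - 2)]
    return tree_edges_from_prufer(sequence)


undirected = random_tree_edges(10, seed=0)
directed = orient_from_root(undirected, root=0)
```

After orientation, every node except the root appears exactly once as a child. This is the property the directed branch in the hunk relies on when it feeds DFS edges into the target graph.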
1 change: 1 addition & 0 deletions networkx/readwrite/__init__.py
@@ -16,3 +16,4 @@
from networkx.readwrite.gexf import *
from networkx.readwrite.nx_shp import *
from networkx.readwrite.json_graph import *
from networkx.readwrite.text import *