Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Special chars Node naming making get_subgraph() access complicated #246

Open
florian0410 opened this issue Nov 25, 2020 · 4 comments
Open

Comments

@florian0410
Copy link

florian0410 commented Nov 25, 2020

SUMMARY
Special Nodes names quoted, making get method complicated to use

ISSUE TYPE
help_wanted, details-needed

COMPONENT NAME
quote_if_necessary()
get_subgraph()

ADDITIONAL INFORMATION

Hello and thank you for maintaining this project.

During my developments, I ran into an issue that bother me.

After creating a cluster using the Cluster class,

group = "SAP+"
main_graph.add_subgraph(pydot.Cluster(graph_name=group,
                                label=group))

The complex name attribute value (SAP+ here) is being quoted by quote_if_necessary() method.

My issue come after, at some point in my code, I try to access my created subgraph using get_subgraph() method. But when I'm using the base name I used to initialize it, I cannot find it back because it already have its quotes.

subg = main_graph.get_subgraph("cluster_"+group)

(Pdb) subg
[]
(Pdb) main_graph.get_subgraphs()[0].get_name()
'"cluster_SAP+"'

(Pdb) main_graph.get_subgraph('"cluster_SAP+"')
[<pydot.Subgraph object at 0x7fd6597e7700>]

To fix this, I have to format specifically the name and add some exceptions:

subg = main_graph.get_subgraph(group)
        # Case it's a special chars name
        if not subg:
            main_graph.get_subgraph('"{0}"'.format(group))
        # In case it's cluster (most of the cases ^^)
        if not subg:
            subg = main_graph.get_subgraph("cluster_"+group)
            # Second case when quoted name (SAP+ for ex...)
            if not subg:
                subg = main_graph.get_subgraph('"{0}{1}"'.format("cluster_", group))

I'm not sure we can say it's a bug but it's very inconvenient for accessing easily to the subgraphs.

Is there a better way to access this subgraph ? Should the method be enhanced/fixed to support this specific case ?

The real root cause looks like a mix-up of language to me.

Regards

@peternowee
Copy link
Member

peternowee commented Nov 28, 2020

Hi @florian0410, thanks for reporting. This is indeed very inconvenient. If with "mix-up of language" you mean the mixing of the name with the DOT quotes, I fully agree with you that that is the root cause. I am considering to separate the two by internally storing DOT IDs such as subgraph/cluster names without quotes and, if necessary, storing any quoting style information in a new pydot attribute, hopefully starting with pydot 2.0.0. See my comments in the last two paragraphs of #72 (comment). In that regard, your report can be considered a duplicate of #72, except that that one starts with the parser and this issue starts with pydot.Cluster(). Issue #180 can be considered another duplicate, even though it is about pydot.Edge() and pydot.Node(). They all come back to the problem that pydot internally stores IDs possibly wrapped in DOT quotes.

Do you also have an issue with the cluster_ prefix? I have not really given much thought to that yet. For example, would you expect Graph.get_subgraph() to have an option to automatically search for names prefixed by cluster_ as well? Or would a Graph.get_cluster() method help you? Or should we do the same here as I plan to do with quotes: Keep the cluster_ prefix out of the internally stored name, in a separate pydot attribute and only add the cluster_ prefix when writing out a DOT string for example? I am not sure about all this yet, just asking your opinion. The difference between the quotes and the cluster_ prefix is that the cluster_ prefix is really part of the DOT ID, so not immediately storing that as part of the name might result in a clash later if someone adds a subgraph with that prefix added manually.

Is there a better way to access this subgraph ?

I would not know immediately. If you are not using any of pydot's cluster-specific functionality (e.g. Cluster.set_pencolor()), maybe you could add the cluster_ prefix manually and use pydot.Subgraph() instead of pydot.Cluster(). But you will still have the quoting issue then.

Or maybe instead of main_graph.add_subgraph(pydot.Cluster(graph_name=group, label=group)), you could first assign the new subgraph or cluster to a variable (new_cluster = pydot.Cluster(graph_name=group, label=group)), then query that to see what name pydot gave it (final_cluster_name = new_cluster.get_name() returns '"cluster_SAP+"', so with prefix and quotes). Only then you add it to the graph (main_graph.add_subgraph(new_cluster)). Then later, when you need to lookup the subgraph/cluster by name, you search for it by the final name you stored (main_graph.get_subgraph(final_cluster_name) returns [<pydot.Subgraph object at 0x7f4ba11c3e80>]). But I can imagine it is also not very convenient to have to store all those final names somewhere in your data model again.

If we succeed to drop the quoting from the name fields in pydot 2.0.0, you should at least be able to drop the two checks for quoted alternatives from your code. I don't know about the separate check for cluster_ yet.

Should the method be enhanced/fixed to support this specific case ?

For the quoting issue, I don't think Graph.get_subgraph() needs any change if we can succeed to drop the quotes from the name field in pydot 2.0.0. For the cluster_ prefix, I don't know yet, I have to think about this some more. Hope to hear your thoughts as well.

@peternowee
Copy link
Member

peternowee commented Nov 29, 2020

@florian0410

Is there a better way to access this subgraph ?

For now (pydot 1.4.x), you might also be able to save checks for quoted and unquoted by calling pydot.quote_if_necessary() inline. For example:

subg = main_graph.get_subgraph(pydot.quote_if_necessary(group))
if not subg:
    subg = main_graph.get_subgraph(pydot.quote_if_necessary('cluster_{0}'.format(group)))

@florian0410
Copy link
Author

Hello Peter and thank you for your answers,

I read the issue and it definitely is a dupplicate of #72 as you said with the cluster case here. Didn't see that one before ^^

About the cluster_ prefix, yes I have some trouble using it for the reasons you mentionned, since it's a specific name to Graphviz, it's complex to resolve and require knowledge about the tool to debug it.

I think that a get_subgraph function with an option to search for clusters could help a bit but partly since it would add the prefix I suppose.

Your idea of a separate pydot attribute is the same idea that came to my mind when encountering the issue, writing at the end the prefix sound like the best solution to me. For the clash issue, I have the same concern than you, but since there is an already existing Subgraph class, if someone added manually this to the name before, he may use the Subgraph class instead, even today (not convinced yet, but sounds like reasonnable...) ?

Again thank you for your answers, I'll take a closer look to these ways, particularly on the quote_if_necessary method that should simplify things.

And yes for the post errors I mixed up some experiences, I juste edited the post to fix that.

Regards

@peternowee
Copy link
Member

Ok, I will keep this issue open now then for two reasons:

  1. To make sure that when we solving the quoting issue in general, we also check if it solved it for pydot.Cluster(). General comments on the storing of DOT quotes should normally go to Quotes included in the node's name #72, not here.
  2. Consider if we should change storing the cluster_ prefix as part of the name, how we can make it easier to find subgraphs that are clusters by their name with or without prefix, etc. More thoughts on that are welcome here.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants