Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: dot-graph (graphviz) generation, from a running Autofac container #788

Closed
ph-hs opened this issue Aug 19, 2016 · 54 comments
Assignees
Milestone

Comments

@ph-hs
Copy link

ph-hs commented Aug 19, 2016

(co creator of PicoContainer here)

Obviously as a optional plugin for Autofac, could there be a means to instrument injections. Only produce the dotfile output, as the user can run that through the executables to make pics.

Ref - http://picocontainer.com/Old_JIRA_Issues/PICO/285/ and https://github.com/picocontainer/PicoContainer2/blob/master/pico/gems/src/java/org/picocontainer/gems/monitors/DotDependencyGraphComponentMonitor.java

You would not run this in a production system, unless you were intending to make some diagram that that showed exhaustive injections for the system, and counts for each. More likely you'd have a Selenium test suite for the whole app, and extract the dot diagram before tearing down the server.

@tillig
Copy link
Member

tillig commented Aug 19, 2016

Instrumentation is definitely an area in which we need to improve. We've tried previously to add text based logs or other similarly simple diagnostics with some limited success. An actual graphic view would be pretty cool.

Looks like it may be time to revive some of the instrumentation and perf efforts now that the .NET Core madness is mostly over.

@ph-hs
Copy link
Author

ph-hs commented Aug 19, 2016

With PicoContainer we had a a 'ComponentMonitor' (default NullComponentMonitor) :- https://github.com/picocontainer/PicoContainer2/blob/master/pico/container/src/java/org/picocontainer/ComponentMonitor.java

Methods: instantiating, instantiated, instantiationFailed, invoking (for lifecycle: start/stop/dispose), invoked, invocationFailed, lifecycleInvocationFailed, noComponentFound, newInjector, newBehavior.

@tillig
Copy link
Member

tillig commented Feb 19, 2019

A lot of time has passed on this; I'm curious if we could make use of DiagnosticSource now to raise events that could be handled by instrumentation.

@tillig tillig mentioned this issue May 21, 2020
@alistairjevans alistairjevans added this to the v6.0 milestone May 27, 2020
@tillig
Copy link
Member

tillig commented Jun 25, 2020

I was thinking about picking this up now that v6 has some good instrumentation. I'm curious about how we would expect it to work. I figure there are two logical ways:

  1. The graph is built for the whole container. The graph generator keeps an abbreviated in-memory tree of what depends on what for the entire run of the container. The "root" of the tree is the container itself. The graph is rendered on dispose of the container or on manual execution of some sort of "render" method.
  2. One graph is built per resolve operation. Similar to the existing trace mechanism, where each trace scope is a single resolve through the complete dependency chain, that's what happens here, just with a dot-graph format.

I think the first option is interesting but not valuable. The amount of data in that graph would likely be overwhelming and confusing, plus it gets really complicated when you get into things like per-request and named lifetime scopes.

I think the second option is likely where we want to be, so that's where I'll start.

@tillig tillig self-assigned this Jun 25, 2020
@alistairjevans
Copy link
Member

Perhaps if we designed this so that the 'scope' of the graph can be attached to an general async context, we can do interesting things like graph an entire ASP.NET request by adding some support to the .net core integration?

Ideally, resolving a single controller would resolve the entire graph, but that's not always the case, and indeed those edge cases are harder to visualise/diagnose.

That would strike me as potentially valuable.

@tillig
Copy link
Member

tillig commented Jun 25, 2020

So, right now, if I adapt the DefaultDiagnosticTracer to something that writes GraphViz, it wouldn't catch the full resolve op? Hmmm. Not to be a pain, but can you elaborate on:

but that's not always the case

What exactly does that mean? (I'll likely try to turn that sort of edge case into a test so I can be sure to handle it.)

@alistairjevans
Copy link
Member

The default tracer will track the full resolve op, provided everything in the graph resolves from the associated IComponentContext.

However, any manual calls to ILifetimeScope.Resolve would trigger a new ResolveOperation, so would have separate trace data (that's the edge case).

You could overcome this with associating all resolves that happen in a given async context together perhaps.

@tillig
Copy link
Member

tillig commented Jun 25, 2020

I see, so things like Func<T> relationships would be lost because they're sorta technically service location [when it comes down to it]. That makes sense. Let me noodle on that.

@alistairjevans
Copy link
Member

Yes, possibly Owned<> too?

All the information will be provided to you, it's just associating it that's fun.

Since you can add a tracer at a per-lifetime-scope level (which would then inherit down to any Owned scopes) it would be possible to have a different tracer instance per-request scope and associate everything that way possibly? 🤷‍♂️

@tillig
Copy link
Member

tillig commented Jun 25, 2020

This basically means we'd need to figure out how to ensure this can be traced:

public class Consumer
{
  private Func<IDependency> _factory;
  public Consumer(Func<IDependency> factory)
  {
    this._factory = factory;
  }

  public void DoWork()
  {
    // Technically not part of the actual resolve, and conditional
    // based on whether this method even gets called.
    var dep = factory();
    dep.DoWork();
  }
}

public interface IDependency
{
  void DoWork();
}

public class Dependency : IDependency
{
  public void DoWork() { }
}

var builder = new ContainerBuilder();
builder.RegisterType<Dependency>().As<IDependency>();
builder.RegisterType<Consumer>();
var container = builder.Build();

// Track both the use of Consumer and the
// deferred resolve of IDependency
using var scope = container.BeginLifetimeScope();
var consumer = scope.Resolve<Consumer>();
consumer.DoWork();

One of the big problems is going to be knowing when the trace is done. In the above case, there could be any number of lines and code pathways through the program before calling DoWork()...

var consumer = scope.Resolve<Consumer>();
while(someCondition)
{
  // Lots of stuff, maybe some resolves of Owned<T> items,
  // maybe more service location, possibly some creation
  // and resolve of child lifetime scopes...
  using var unitOfWork = scope.BeginLifetimeScope();
  var wtf = unitOfWork.Resolve<Func<Owned<Consumer>>>();
  wtf().Value.DoWork();
}
consumer.DoWork();

As I think about it, it may be useful for the tracer to know the lifetime scope to which it's attached. The tracer could subscribe to the lifetime scope dispose event and that could be a signal to dump its trace. The only time the trace would be "done" in ASP.NET Core would be when the request lifetime scope is disposed.

Having a per-request sort of trace may not produce the results we think it would. There are a lot of things outside the controller that also get resolved during the request - framework stuff like controller activators and crap. That might generate a lot of noise and reduce the value of a graph like that.

I'm also wondering what happens if someone attaches this sort of tracer to the container - that would actually generate a graph of every resolve for the lifetime of the container. And maybe that's what folks want...?

So, now I'm thinking:

  • The tracer needs to know which lifetime scope it's attached to initially. Perhaps this means there's a callback during ILifetimeScope.AttachTracer to call, like, IResolvePipelineTracer.Attach(ILifetimeScope).
  • The tracer needs to know when the lifetime scope it's attached to is being disposed. Possibly that's a corresponding IResolvePipelineTracer.Detach(ILifetimeScope) call during lifetime scope disposal. I've seen similar attach/detach patterns in WCF for extensions.
  • The tracer starts a trace on attach, stops and dumps the trace on detach.
  • The tracer keeps some sort of in-memory map of what's calling/resolving what so it can build the tree as needed.

I'm not sure what happens if someone attaches/replaces the tracer in the middle of a trace. I guess it just breaks the trace chain at that point, which is probably OK.

using var scope1 = container.BeginLifetimeScope();
scope1.AttachTracer(new GraphVizTracer());
using var scope2 = scope1.BeginLifetimeScope();
scope2.AttachTracer(new DefaultDiagnosticTracer());

@alistairjevans
Copy link
Member

Hmmm...

Perhaps instead of registering a tracer at each lifetime scope, you register a TracerFactory.

When we create a lifetime scope, and there is a TracerFactory configured, we call something like the following on the factory:

ITracer? CreateTracer(Lifetime scope scope, ITracer? parentTracer)

The tracer is disposable, and we dispose of it when the scope ends.

The factory can serve to group things together, decide when to start tracing, etc etc.

To collect all resolves for a given parent scope, we can pass data from tracers in a nested scope up to a parent scope. Perhaps use tagged scopes to decide the grouping level?

The model is a little bit like Logger/LogProvider I guess.

@tillig
Copy link
Member

tillig commented Jun 25, 2020

Maybe we need to think of it more like distributed tracing. If the notion of trace mirrored the notion of lifetime scope, it'd be like:

  • Each lifetime scope is effectively a "span" in trace.
  • Each resolve operation is a "span."
  • Possibly each part of a resolve operation is a "span."
  • Information gets passed from parent span to child span in a "span context."

This is really similar to the factory idea, but the relationship is like:

  • Tracer creates spans
  • Spans create child spans
  • At the end of the top-level span (lifetime scope?) the trace gets reported

@tillig
Copy link
Member

tillig commented Jun 25, 2020

Still got the problem of knowing when to start and stop a "root span" though. If you add something like that to the container it's going to be just one big span. So we'd still need the attach/detach events to allow start/end of a "root span" that can be reported on lifetime scope disposal.

@alistairjevans
Copy link
Member

When you configure the thing that starts and stops spans, could you specify a callback that takes a lifetime scope and says, "this is the root scope for the graph"? Call that when we enter a scope, and then when we dispose of the scope to know when to finish.
Could provide extension methods that gives a "root trace scope" for every child on the root container, or each with a particular tag, or something.

@alistairjevans
Copy link
Member

Or actually, just invoke the callback on entry to a scope, mark the scope, then we know what to do on exit.

I feel like it's ok to add fields to the lifetime scope to indicate "is root of trace".

@tillig
Copy link
Member

tillig commented Jun 25, 2020

Still noodling on this. Trying to enumerate the challenges and what possible solutions might be.

  • Locating the root of the trace/tree: I keep thinking the root should be the initial scope to which the tracer is attached. Which is to say, the primary use case would not be to attach it to the container unless you want to actually graph the whole container over the course of the app execution.
  • Knowing when to finish the trace/tree: Again, thinking "on dispose of the originating scope" which would be symmetrical to having the root be with the originating scope.
  • Correlating Func<T>, Owned<T>, and other relationships with the trace: I think this will automatically happen since the tracer gets pushed from parent to child scope once it's attached.
  • Handling manual resolves from the lifetime scope: Again, I think this will be automatically handled, though I think it'll look odd in the graph. The caller of the resolve operation isn't really known here.

And, of course, I'm not 100% sure what a "good graph" might look like. The graph/trace we're thinking of is about things getting resolved and their dependencies, which is in the purview of the DI container.

The more I think about trying to get a full trace of, say, an ASP.NET Core app where the controller shows up as a node in the graph, which then has a controller action execute which may resolve something... or that controller action may take the Func<T> and pass it to something else which then does the resolve... I'm not sure if that's a thing we can really address.

It's more than just async state and trying to track it across threads. Last I checked (and this may have changed, but I'm pretty sure it's still this way), unless you do AddControllersAsServices then controllers don't get resolved from the container, instead it's a foreach over the constructor parameters which then individually get resolved. This is the same way middleware Invoke methods work - it's an individual resolve op for each parameter.

What that means is the graph is going to be more about scopes, things resolved from those scopes, and dependencies. And that makes sense in the context of other application types, too.

@tillig
Copy link
Member

tillig commented Jul 2, 2020

Still working on this. I got VS Code on Mac working to build this, which is nice, but it's a little tricky. I'm updating the CONTRIBUTING.md to explain it.

First, you must use the Mono from the "Visual Studio channel." I had one installed from the main stable channel and it really doesn't work with multi-target projects like this. Homebrew uses that one, too, so... yeah, no.

I added references to Microsoft.NETFramework.ReferenceAssemblies as dev dependencies, which gets the Mac around the .NET 4.6.1 problem. You can actually build .NET 4.6.1 stuff.

What doesn't work is sort of interesting: Autofac.Test multi-targets netcoreapp3.0;net461. It references Autofac.Test.Scenarios.ScannedAssembly which targets only netstandard2.0 - the lowest common denominator for those test targets. I get errors in VS Code saying Assembly 'Autofac' with identity 'Autofac, Version=0.0.1.0, Culture=neutral, PublicKeyToken=17863af14b0044da' uses 'netstandard, Version=2.1.0.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51' which has a higher version than referenced assembly 'netstandard' with identity 'netstandard, Version=2.0.0.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51' [Autofac.Test.Scenarios.ScannedAssembly]. However, if you actually run the build, these errors don't appear. There's a disconnect between how Omnisharp is executing analyzers and behind-the-scenes Roslyn stuff and how the actual dotnet build process is being run. I'm guessing it wants netstandard2.1 because the test assembly is building the netcoreapp3.0 target, ends up referencing the netstandard2.1 version of Autofac, and then wonders why that scanned assembly isn't high enough.

Unclear how to solve that. For the moment, it can be ignored.

Looks like .NET Test Explorer doesn't quite work, likely, again, the unit tests targeting net461. But it builds! ;)

@tillig
Copy link
Member

tillig commented Jul 7, 2020

Working on this and, among other things, getting more .editorconfig in place to help since I'm in VS Code and don't get as much VS magic.

What are folks' thoughts on using this. for local references? I'm personally a fan because when you're in GitHub or... VS Code... or places where it's unclear if the thing being referenced is, like, a static method, an instance method, or what, having the this. makes it obvious and easy to read. However, I know the .NET corefx repo has that turned off because they think it's redundant.

It doesn't have to be a build warning, it can be a "suggestion" at the info level, like a quick refactor light would show up next to it if it's missing.

On or off?

@alsami
Copy link
Member

alsami commented Jul 7, 2020

Working on this and, among other things, getting more .editorconfig in place to help since I'm in VS Code and don't get as much VS magic.

Have you tried Rider? Isn't there also a VS version for Mac by now? Supporting .NET Core?

What are folks' thoughts on using this. for local references?

I'd personally be in favor of it. Just curious: would you also stop using underscores for fields or would you still being using them in combination with this?

@tillig
Copy link
Member

tillig commented Jul 7, 2020

I do a lot of different stuff other than C# and VS Code works great for me, I'm not really interested in switching IDEs right now.

I'd probably leave the _ for fields to ensure they're disambiguated from properties and methods.

@alistairjevans
Copy link
Member

My personal feeling is that I don't like this. if it isn't necessary. It adds a lot of verbosity, and it's just another thing to read. The _ is there for someone to know if it's a local field. Is having to check whether a method is instance or static that much a burden?

That said, I use Visual Studio almost exclusively, which makes it super easy to know which is which, so I may be biased.

@tillig
Copy link
Member

tillig commented Jul 7, 2020

I liked this. before, when trying to navigate stuff on GitHub without having to locally clone it and load it all up in an IDE... and now that I'm mostly working on a Mac in VS Code (and especially when it's, like, a project that uses .NET desktop framework where IntelliSense won't work because it won't compile) I've grown slightly more attached to it. Like, "Is this a local variable I don't see declared? Is it an instance field incorrectly named? Static field? I can't tell! I know, let me scroll up and down through this (looking at old RegistrationExtensions) 1600+ line long file to figure it out!"

But I don't want to make folks do stuff they feel strongly against, I'm mostly just concerned with consistency.

@johneking
Copy link

johneking commented Jul 7, 2020

Personally I'm not a fan of this. but it's an interesting point about readability outside of an IDE, I can see the value there. However, if there's no guarantee that those rules aren't being adhered to when looking at code via GitHub (incorrectly named instance field etc.), could it be a slippery slope to making other concessions with what might be the optimal formatting inside an IDE?

Along similar lines, I've found that most places where disambiguation between static and instance method calls becomes important tend to be where there's something else wrong, e.g. an instance method accidentally calling a static method with side-effects. So I guess the question for me comes down to whether the rules are tailored for making the finalized production code cleaner or making incomplete changes more readable, and I'd probably go with the former.

@tillig
Copy link
Member

tillig commented Jul 7, 2020

However, if there's no guarantee that those rules aren't being adhered to when looking at code via GitHub (incorrectly named instance field etc.), could it be a slippery slope to making other concessions with what might be the optimal formatting inside an IDE?

Well, if we wanted to do hard enforcement, we can set up StyleCop to ensure it's consistent in the entire repo, guaranteed or there's a build warning.

e.g. an instance method accidentally calling a static method with side-effects

Yes, exactly that. It's been pretty hard to do stuff like review pull requests in an efficient fashion (e.g., try to review in the GitHub UI without checking it out and being in VS) and trying to disambiguate stuff like...

private static readonly ILogger Logger = new Logger();

// Skip down 400 lines and then...

public void DoWork()
{
  // static thing being worked on in an
  // instance method, but... totally unclear
  if(Logger == null)
  {
    Logger = new Logger();
  }
}

It gets better when it's like

public void DoWork()
{
  // static thing being worked on in an
  // instance method, but... totally unclear
  if(Logger == null)
  {
    Logger = Logger.Logger();
  }
}

Which is admittedly a pretty contrived example, but is also not far off from stuff I've seen here where everything is called "LifetimeScope" or "Container." And it's a real gotcha when it's like, "Is this considering shared state correctly? Is it locking right? What is that thing?" My scroll bar gets a workout, and then I have to remember it all.

When you're on Windows in a full VS IDE, agreed, you can just hover the cursor over and call it good. If you are not in that specific environment, it's a lot harder. This is, admittedly, something I was not fully aware of until switching to a Mac for most of my dev. Mac VS is not super; there are problems with Mono, .NET Core, and OmniSharp having weird conflicts such that (depending on your setup) you may or may not get Intellisense; and so on... it just sort of adds up. I'll also blame @alistairjevans for submitting gigantor PRs to overhaul internals. 😆

It could be that I'm optimizing for my workflow over others, which is why I raised the question. It seems that not putting this. is pretty common outside my workflow, so I can set things up just like corefx and suggest removal of this..

@alistairjevans
Copy link
Member

Is it too late to split out the discussion/changes about style from the dot-graph & DiagnosticSource changes? I imagine it is, but thought I'd check.

@tillig
Copy link
Member

tillig commented Jul 10, 2020

Yeah, done with that. Sorry for derailing.

@alistairjevans - question about DefaultDiagnosticTracer so I can make sure I understand it and sorta follow the pattern on the DOT generator...

I see in the OperationFailure handler that if it's not a top level operation, then basically the info gets appended to the builder... but that builder never gets referenced again. Basically nested operations get pitched and not traced.

I know I added the bit with the "remove that tracer from the dictionary when it's done" because it never got touched again, but I'm wondering if there's a subtlety here that I'm missing - in effect, no nested operations report anything.

Is that the intention, or should nested operations somehow report up to the parent operation (if there is one)?

Related: If there is a parent operation, there's no way to get from a child to the parent to figure out what it is. ResolveOperationBase knows if it's a "top level operation" but doesn't know, if it's not the top level, what its respective parent is.

If we don't care, that's fine; tracking parent/child resolve operations seems like potential for memory leaks if not handled carefully. I just didn't know if I was missing something.

@alistairjevans
Copy link
Member

The StringBuilder we write to in a nested operation is a reference to the one in the _operationBuilders dictionary added by the top-level resolve, so the nested operation data is added to that builder, and the entire thing is rendered at the end.

I actually think that the concept of a 'top-level operation vs nested operation' is no longer necessary. That was needed while Decorators required a completely fresh ResolveOperation when they resolved due to the circular dependency fun. Since introducing the SegmentedStack, we may not need that concept at all, because everything happens in the same ResolveOperation.

The only place that currently sets IsTopLevelOperation = false is never called.

I'm going to raise a PR to take the concept of IsTopLevelOperation out of v6.

@alistairjevans
Copy link
Member

Equally, the 'ITracingIdentifier' isn't relevant anymore; again that was only needed when you could have nested resolve operations. I'll take that out too (so the key in the _operationBuilders dictionary is just a ResolveOperation).

@tillig
Copy link
Member

tillig commented Jul 13, 2020

I have some graphs working. I'm interested in what folks think of the general format. I borrowed a lot from the DefaultDiagnosticTracer, like, the data displayed in a graph.

Here's a success graph resolving a decorated service.

Success - resolved decorator

  • Middleware is in a "stack" of connected objects where a request calls into the first one.
  • Middleware can spawn requests. It renders a little oddly, like it wants to be "below" the whole stack of middleware. This changes if you have multiple middleware requests getting spawned; the default just seems to be "render top-to-bottom." I fought a lot with "ranking" of things to try to force it to render differently but it never came out well.
  • Success information is shown in the request - service, component, instance, decorator target.

Here's a failure graph where there's an error in the lambda that produces the component.

Error - lambda generated an exception

  • The exception for the operation shows up at the operation level.
  • Request level exceptions show up at the request level.
  • The failure path is highlighted in bold - bold arrows point to the failure, and the stack of middleware will be bold up until the failure. (In this case, the innermost middleware is where the failure occurred, so the whole middleware stack is bold.)

I tried so hard to get the request level exception to be left aligned but it just wouldn't happen. The graph is rendered using HTML table markup to ensure service, component, and other fields get properly separated... but the table alignment gets ignored if there's a <br/> tag inside the cell.

I feel like the error graph is noisy. Like, the top-level operation has the same message in it that the request has; it'll just be stack trace upon stack trace getting up to the top level. Would it be better to have just the message of the exception in the resolve request and the full exception only shows up in the operation?

Also, I didn't use color due to color blindness in folks; I felt bold would be more accessible. However, I don't know if it's differentiated enough.

Suggestions?

@tillig
Copy link
Member

tillig commented Jul 13, 2020

I have the DOT tracer in a reasonable form, I think, with some tests around it and generating the above graphs. I'll hold on to see if folks want to change the graphs for a day or so and submit the PR if I don't hear anything. Either way it shouldn't be that hard to iterate over if we want to fix it later.

@alistairjevans
Copy link
Member

Looking at the graphs, I have a couple of thoughts (although not sure what could be done about them).

  • The output of these graphs appear to be diagnostic, at the same granularity as the existing default diagnostic tracer. Is that what we want from a generated graph, as opposed to just 'a graph of resolved components', showing all the dependencies in your component graph? Would someone use this graph to investigate problems, or are they just looking for an overview of the dependency relationships?

  • The 'stack' of graphing items does seem to imply that the entirety of the 'nested' graph executes before any of the activator behaviour, which is wrong, the Implementor activator executes before the decorator stage.

  • We don't need all that exeception data in the graph, they will still get that exception propagated to them through logs. I'd argue that the stack details are definitely not worth displaying at the request level, and possibly not even at the operation level?

I think my gut feeling is that this graph data may be too granular, and we need to agree on what data we are looking to be visualised in a graph, and what the graph is used for. I'm not sure that the graph should be another form of diagnostic output.

@tillig
Copy link
Member

tillig commented Jul 14, 2020

All good questions/points. I got the graph this far and was like... 🤔 I'm not sure this is quite what we're going for. But I admit I got caught up in getting anything to render reasonably in DOT format. It's far less friendly than PlantUML and much less flexible. But it's common, so... okay.

Stepping back, I think:

  • Seeing the dependency tree is likely a better goal than diagnostics.
    • It is interesting to know which component, if any, in the resolve graph fails.
    • It is not interesting to see the full stack trace/details. The value is in visualization of where in the graph it is, not in the diagnostic aspects.
    • It needs to be far less granular or large resolves will be unreadable. Seeing the middleware stack is probably too granular. Resolve requests is probably the right level.
  • These are simple cases; I should do testing with a more complex resolve scenario to get a more representative graph.
  • Better visualization of the types would be nice. For example, if there's a decorator wrapped around something, having it show as a "type inside a type" like an onion shape might be nice. However, there's no way I can tell to have a "node inside a node" like that in DOT language. You can have "subgraphs" but that's not the same. PlantUML is far more expressive in this respect.
  • I think representing lifetime somehow would be interesting but I'm not sure how easy or practical that would be. Depending on the representation, it could also make the graph hard to read. Might be good as a later enhancement rather than first run at the graph component.

Given that, let me craft something up by hand and propose a new graph format prior to rewriting the tracer.

@tillig
Copy link
Member

tillig commented Jul 14, 2020

This might be better.

A simple "success" graph might have code like this...

digraph G {
    label="Autofac.Specification.Test.Diagnostics.DotDiagnosticTracerTests+IService"
    labelloc=t
    n1 [shape=component,label=<
    <table border='0' cellborder='0' cellspacing='0'>
    <tr><td>Autofac.Specification.Test.Diagnostics.DotDiagnosticTracerTests+IService</td></tr>
    <tr><td><font point-size="10">
    Component: Autofac.Specification.Test.Diagnostics.DotDiagnosticTracerTests+Implementor<br/>
    Instance: Autofac.Specification.Test.Diagnostics.DotDiagnosticTracerTests+Decorator
    </font></td></tr>
    </table>
    >];
    n2 [shape=box3d,label=<
    <table border='0' cellborder='0' cellspacing='0'>
    <tr><td>Decorator (Autofac.Specification.Test.Diagnostics.DotDiagnosticTracerTests+IService)</td></tr>
    <tr><td><font point-size="10">
    Component: Autofac.Specification.Test.Diagnostics.DotDiagnosticTracerTests+Decorator<br/>
    Instance: Autofac.Specification.Test.Diagnostics.DotDiagnosticTracerTests+Decorator<br/>
    Target: Autofac.Specification.Test.Diagnostics.DotDiagnosticTracerTests+Implementor
    </font></td></tr>
    </table>
    >];
    n1 -> n2;
}

...and render like this:

Proposed success graph

In this case, there's no middleware displayed and the graph is less about looking like a "resolve pipeline" than it is about trying to visually represent the components. Standard (non decorator) services are shown as components; decorators are 3D boxes to indicate they can "contain stuff." I couldn't figure out how to get one nested in the other, and that's not really how the events get exposed to the diagnostics side of things anyway.

The title on the graph is based on the service being resolved. In a tiny graph like this it looks redundant, but once the graph gets bigger/more complex, having that title at the top will probably be helpful to know what exactly you're looking at.

A failure graph is similarly simplified.

digraph G {
    label="Autofac.Specification.Test.Diagnostics.DotDiagnosticTracerTests+IService"
    labelloc=t
    n1 [shape=component,label=<
    <table border='0' cellborder='0' cellspacing='0'>
    <tr><td>Autofac.Specification.Test.Diagnostics.DotDiagnosticTracerTests+IService</td></tr>
    <tr><td><font point-size="10">
    Component: Autofac.Specification.Test.Diagnostics.DotDiagnosticTracerTests+Implementor<br/>
    Instance: Autofac.Specification.Test.Diagnostics.DotDiagnosticTracerTests+Decorator
    </font></td></tr>
    </table>
    >];
    n2 [shape=box3d,penwidth=3,color=red;label=<
    <table border='0' cellborder='0' cellspacing='0'>
    <tr><td>Decorator (Autofac.Specification.Test.Diagnostics.DotDiagnosticTracerTests+IService)</td></tr>
    <tr><td><font point-size="10"><b>
    Exception: Operation is not valid due to the current state of the object.
    </b></font></td></tr>
    </table>
    >];
    n1 -> n2 [penwidth=3;color=red];
}

Which renders as:

Proposed failure graph

The arrows and components with errors get bold and red and only the exception message is shown, not just the stack trace.

I mean, I can't guarantee it'd be literally exactly like this, but it could be pretty close, and I think this is maybe more useful than the previous more granular graph. What do you think?

@alistairjevans
Copy link
Member

That's certainly closer to the level of detail that I'd expect from a graph of 'show me my dependencies'. It would be interesting to see how it looks on a more complex graph.

Dot looks like kind of a nasty output file format.

@tillig
Copy link
Member

tillig commented Jul 14, 2020

Let me see about overhauling the output to be closer to this level of detail and I'll generate a more complex graph. I'm certainly not writing it by hand. DOT is a pain. 😆

@tillig
Copy link
Member

tillig commented Jul 15, 2020

I updated the graph format and it's way better. Here's a graph of the resolution from the Autofac.Specification.Test.Resolution.Graph1 namespace, the types associated with the ComplexGraphTests.

Complex graph success path

DOT graph code for those interested
digraph G {
label=<Autofac.Specification.Test.Resolution.Graph1.E1>;
labelloc=t
n4164d78f7f29424fa134864898cf5fa0 [shape=component,label=<
<table border='0' cellborder='0' cellspacing='0'>
<tr><td>Autofac.Specification.Test.Resolution.Graph1.E1</td></tr>
<tr><td><font point-size="10">Component: Autofac.Specification.Test.Resolution.Graph1.E1</font></td></tr>
<tr><td><font point-size="10">Instance: Autofac.Specification.Test.Resolution.Graph1.E1</font></td></tr>
</table>
>];
ne93c439d58044d76b1e58fcda24e340e [shape=component,label=<
<table border='0' cellborder='0' cellspacing='0'>
<tr><td>Autofac.Specification.Test.Resolution.Graph1.B1</td></tr>
<tr><td><font point-size="10">Component: λ:Autofac.Specification.Test.Resolution.Graph1.B1</font></td></tr>
<tr><td><font point-size="10">Instance: Autofac.Specification.Test.Resolution.Graph1.B1</font></td></tr>
</table>
>];
n33538e4a88834bfa83aad1c808497e91 [shape=component,label=<
<table border='0' cellborder='0' cellspacing='0'>
<tr><td>Autofac.Specification.Test.Resolution.Graph1.A1</td></tr>
<tr><td><font point-size="10">Component: Autofac.Specification.Test.Resolution.Graph1.A1</font></td></tr>
<tr><td><font point-size="10">Instance: Autofac.Specification.Test.Resolution.Graph1.A1</font></td></tr>
</table>
>];
ne93c439d58044d76b1e58fcda24e340e -> n33538e4a88834bfa83aad1c808497e91
n4164d78f7f29424fa134864898cf5fa0 -> ne93c439d58044d76b1e58fcda24e340e
ne35649e90832411bb61de2e2dfbd494a [shape=component,label=<
<table border='0' cellborder='0' cellspacing='0'>
<tr><td>Autofac.Specification.Test.Resolution.Graph1.IC1</td></tr>
<tr><td><font point-size="10">Component: Autofac.Specification.Test.Resolution.Graph1.CD1</font></td></tr>
<tr><td><font point-size="10">Instance: Autofac.Specification.Test.Resolution.Graph1.CD1</font></td></tr>
</table>
>];
nd3b0667b363a47c09c0f2b2e2588dec7 [shape=component,label=<
<table border='0' cellborder='0' cellspacing='0'>
<tr><td>Autofac.Specification.Test.Resolution.Graph1.A1</td></tr>
<tr><td><font point-size="10">Component: Autofac.Specification.Test.Resolution.Graph1.A1</font></td></tr>
<tr><td><font point-size="10">Instance: Autofac.Specification.Test.Resolution.Graph1.A1</font></td></tr>
</table>
>];
ne35649e90832411bb61de2e2dfbd494a -> nd3b0667b363a47c09c0f2b2e2588dec7
nb097ab6a72b84a70a8c7f1afac8960c6 [shape=component,label=<
<table border='0' cellborder='0' cellspacing='0'>
<tr><td>Autofac.Specification.Test.Resolution.Graph1.B1</td></tr>
<tr><td><font point-size="10">Component: λ:Autofac.Specification.Test.Resolution.Graph1.B1</font></td></tr>
<tr><td><font point-size="10">Instance: Autofac.Specification.Test.Resolution.Graph1.B1</font></td></tr>
</table>
>];
n3ee82d2c9f744147972872a47bf4223a [shape=component,label=<
<table border='0' cellborder='0' cellspacing='0'>
<tr><td>Autofac.Specification.Test.Resolution.Graph1.A1</td></tr>
<tr><td><font point-size="10">Component: Autofac.Specification.Test.Resolution.Graph1.A1</font></td></tr>
<tr><td><font point-size="10">Instance: Autofac.Specification.Test.Resolution.Graph1.A1</font></td></tr>
</table>
>];
nb097ab6a72b84a70a8c7f1afac8960c6 -> n3ee82d2c9f744147972872a47bf4223a
ne35649e90832411bb61de2e2dfbd494a -> nb097ab6a72b84a70a8c7f1afac8960c6
n4164d78f7f29424fa134864898cf5fa0 -> ne35649e90832411bb61de2e2dfbd494a
}

A failure graph is similarly more interesting. Here I swapped the A1 registration for a throw new InvalidOperationException() lambda.

Complex graph failure path

More DOT code... ```dot digraph G { label=<Autofac.Specification.Test.Resolution.Graph1.E1>; labelloc=t n79692a7abaa14387813eddea083a3e6b [shape=component,penwidth=3,color=red,label=<
Autofac.Specification.Test.Resolution.Graph1.E1
Component: Autofac.Specification.Test.Resolution.Graph1.E1
>]; ncca50442edcd4584870ab6fdce523e92 [shape=component,penwidth=3,color=red,label=<
Autofac.Specification.Test.Resolution.Graph1.B1
Component: Autofac.Specification.Test.Resolution.Graph1.B1
>]; nb6801f03189b4950848ca1d226c08bba [shape=component,penwidth=3,color=red,label=<
Autofac.Specification.Test.Resolution.Graph1.A1
Component: λ:Autofac.Specification.Test.Resolution.Graph1.A1
System.InvalidOperationException:
Something bad happened.
>]; ncca50442edcd4584870ab6fdce523e92 -> nb6801f03189b4950848ca1d226c08bba [penwidth=3,color=red] n79692a7abaa14387813eddea083a3e6b -> ncca50442edcd4584870ab6fdce523e92 [penwidth=3,color=red] } ```

In doing this I realized that the actual operation is only interesting for the graph title. Really what the graph cares about at this point is the initiating request and the children below that. Means there are fewer events this particular diagnostics mechanism requires (no middleware events, for example).

What do we think about that? I like it a lot better.

@alistairjevans
Copy link
Member

Yeah, that looks a lot better!

One thought; each resolve of the A1 service will result in a separate box, which kind of makes sense in a 'everything is transient' world, but shared lifetime registrations would also have separate boxes.

Any way to have it so there's one box per registration rather than resolve, and multiple arrows into it? That would help visualise dependency chains and the like?

@tillig
Copy link
Member

tillig commented Jul 15, 2020

I was thinking about looking at the ResolveRequestContextBase.ActivationScope, storing the scope tag along with resolution/failure data, and building the graph that way, but it occurs to me it's less about lifetime scope and more about "is this instance the same object as that instance." I'll probably instead keep references to the resolved object instances to determine if they're the same object that got resolved or a different one. That shouldn't be a problem as long as the graph builder with the references gets released at the end of an operation, just something I'm thinking about.

That also, then, means the connections between nodes aren't just parent-child resolve requests, now it's gotta also figure out how to de-dupe the requests such that two requests resulting in the same response get merged to become the same node in the graph. It becomes less a tree of requests and more a general directed graph.

This is all very different than my more simple recurse-down-the-tree mechanism so I'll need to overhaul the way the graph is built. Not impossible, but not today. :)

I'm not sure how I'd represent something like this:

public class A
{
  public A(IService1 service, B b) { }
}
public class B
{
  public B(IService2 service) { }
}
public class C : IService1, IService2
{
}

var builder = new ContainerBuilder();
builder.RegisterType<C>().As<IService1, IService2>().SingleInstance();
builder.RegisterType<B>();
builder.RegisterType<A>();
var container = builder.Build();
builder.Resolve<A>();

In this case, since C represents multiple services but only has a single instance the graph would probably look like...

  A
 / \
C - B

However, in this case the C box needs to list two different services whilst still representing only one instance.

Would that be confusing? Knowing which service/interface my thing wanted that resulted in the shared instance seems like it'd be valuable to have in the graph, but if there are different ones it becomes less clear.

I could switch the graph so the label for the resolved service is on the connector line, that might disambiguate it a bit, but a quick test shows that to be pretty messy.

Graph with line labels

I could make it so the services are each listed in the shared component and then the arrows point from the source request to the service it was targeting. That also seems like it could be messy.

Graph with multiple services per component

Still confusing?

I haven't actually coded any of that up yet.

Note: I updated that second graph because I just noticed I hadn't actually rectified all the A1 references. It cleans up the lines but also makes it less obvious that the services being resolved are different based on where the arrow ends up. I Left the two B1 to show we could have multiple instances of the same thing. It's hand-hacked, it's not going to be exact... but it should give an idea.

@alistairjevans
Copy link
Member

I actually quite like the services being an attribute of the line, because that kind of represents the operation, like "I'm requesting this, and ended up here", as an edge in a graph, r With shorter class names (could we optionally omit namespaces?) it would look pretty good I reckon.

That last example graph looks really useful, because it shows a glace that everything is dependent on A1.

I'm not sure whether instances is more useful than registrations...is the aim to show object dependency, or registration dependency?

Object dependency might tell you you're allocating too many new instances of a transient dependency I suppose, but registration dependency shows you graph complexity more concisely.

As you say, this becomes more of a directed graph than a tree, but I can imagine a dictionary of Registrations as nodes, with a set of edges indicating requested services.

@tillig
Copy link
Member

tillig commented Jul 15, 2020

I think the instance dependency graph might actually be more interesting because when it comes to stuff like InstancePerLifetimeScope with nested lifetime scopes, the same registration may yield different instances but the graph would be slightly misleading in that it would appear everything depends on the same instance rather than the same registration. I think. (?)

I dunno, if I can get either one to work I'll be happy 😄

@tillig
Copy link
Member

tillig commented Jul 16, 2020

Here we go!

Complex resolution chain normalized by instances

I have each node representing an object instance (so you can see if you're allocating a bunch, or to differentiate between different registrations in different lifetime scopes, or whatever). If a specific component exposes more than one service, it'll show up in the graph attached to the same logical instance. For example, in this graph, you see E1 requires an IC1 and an ID1 and both dependencies get satisfied by the same component.

I kept the lines labeled, but it also points to the correct service in the node, so... both things.

Errors also come through, though the whole dependency chain isn't shown, just the errors.

Error graph

I found that long error messages screwed up the display pretty badly so I have it line wrap any error messages at 40 characters. It's arbitrary but keeps the graph readable.

I didn't post the DOT code because the use of HTML labels on things isn't simpatico with the Markdown parser for DOT code here. It doesn't matter; it's ugly and you don't want to see it anyway.

The line labels are long because it's the Service.Description which is really all I have to go on. The base Service class only has Description and equality comparison.

What do we think? Better? More suggestions?

@tillig
Copy link
Member

tillig commented Jul 16, 2020

Unrelated - I seem to have been riding the differences between netstandard2.0 and netstandard2.1 this whole process. I'm using a KeyedCollection<K,V> to store the list of graph request nodes and wanted to use KeyedCollection<K,V>.TryGetValue() but... that's only netstandard2.1. I have a string.Replace operation to fix up newlines in the graph and the analyzers want it to be string.Replace(toReplace, replacement, StringComparison)... but that overload with the StringComparison wasn't added until netstandard2.1. 😦

@alistairjevans
Copy link
Member

That looks great!

A couple of musings:

  • I don't think you need the services listed inside the node as well, it feels like it makes it a little more confusing without adding much extra value.

  • I think the error reporting is fine; if people need more granularity for errors, there's other ways to see that.

  • If you check if the service is an IServiceWithType (the overwhelming majority of cases), you can special-case it to get the type, and a shorter name.

  • Perhaps only show the Instance if the type is not equal to the registrations limit type? That means the instance adds value in some situations (like decorators), without complicating the default case. You also wouldn't need the 'Component' label in the simple case then, meaning the node only has to contain the limit type.

@tillig
Copy link
Member

tillig commented Jul 16, 2020

I'll see what I can do about IServiceWithType. Good call there, the shorter line labels will help. I think if we go with shorter line labels, though, the aggregate list of services in the node is helpful. "Oh, this instance is supporting these X things" very easily visible without having to back-trace all the lines and create a mental model of that. Plus... I've been in too many systems where there are 100 different IValidator interfaces that are only differentiated by namespace. Knowing which IValidator will be valuable.

I'll also see about skipping the instance if the instance type is the same as the service type and that instance is only being used to fulfill the one service. If it's fulfilling multiple services, disambiguating by specifying the instance type, even if it's the same as one of the services, seems like it'd be valuable to me.

@alistairjevans
Copy link
Member

True, duplicate type names are a thing.

I mean, this might be a case of "put it out there and seek feedback".

Just a thought, since we may want to iterate the graph renderer more rapidly than Autofac itself, is it worth spinning the DOT rendering code (and just that) into another package?

With the DiagnosticSource changes in core, we should be able to hang this sort of stuff outside, right?

@tillig
Copy link
Member

tillig commented Jul 16, 2020

Yeah, we could put it in a separate package. I'm not sure I want to expose the list of string IDs for the list of events as a public part of the API (they also don't expose that from ASP.NET Core) so we'd have a little duplicate code there with strings but I think that's OK. I'm not sure I want to forever be the DOT graph guy, though, so after this is done if folks want to iterate it's going to be pull requests, which... I'm guessing there's not much that'll happen there.

But I can see that perhaps other tracers aren't really "core features" so an external package makes sense.

Let me wrap up this last bit of feedback and then I'll look at making it a separate repo. If there were any core changes required to the diagnostics area to make this happen, I'll PR that portion of things here.

@alistairjevans
Copy link
Member

If people are keen on improving/adjusting the graph tooling, hopefully they'll contribute? 🤞

You can get your own back with a chunky PR for me to review! I definitely owe you a few reviews by this point...

@tillig
Copy link
Member

tillig commented Jul 16, 2020

I think I got it.

Reduced redundancy graph

  • Simple type names for the services on the line labels.
  • Only writing the instance type IF:
    • There are multiple services associated with a single instance OR
    • There's only one service used for that instance and the service type doesn't match the instance type.

In this case, the IC1 service is used and backed by a CD1 instance, so you see that written out, but the others don't get that info because the service they expose matches the implementation type.

I'm pretty happy with this. I think it balances the amount of information in the graph with readability - it has enough to disambiguate type names but not be too overwhelming. You can pretty easily follow the dependency chain, see where things are shared instances, and (via the label at the top) see what the thing was resolving in the event it's not already obvious. DOT/GraphViz does seem to do a reasonable job of sticking the "root" at the top automatically.

Unless there's something sticking out like a sore thumb here, I think I'll look at creating a separate repo for this thing and getting a PR ready to go. Then I can roll the tracer-specific stuff out and get a PR into this core repo for anything else.

What sounds like a good name? Autofac.Diagnostics.DotGraph? I don't want to use DotTrace anywhere because that's already a JetBrains product. 😄

@alistairjevans
Copy link
Member

I agree, I think you got it! That looks great.

Diagnostics.DotGraph sounds good to me. We can hang any other custom diagnostics under the same namespace.

So many new features! v6 is going to be great. 😀

@tillig
Copy link
Member

tillig commented Jul 17, 2020

I'll get started on that.

Should we put the other diagnostics things in an Autofac.Diagnostics namespace (instead of an Autofac.Core.Diagnostics namespace) to match? I can do that in this PR. I lean toward that for consistency.

@tillig
Copy link
Member

tillig commented Jul 17, 2020

Yeah, moving things to Autofac.Diagnostics. That'll be in the PR. There was one obsolete interface in Autofac.Core.Diagnostics and that's getting removed.

@alistairjevans
Copy link
Member

Yes, sorry for not responding before, makes sense to use the same namespace.

@tillig
Copy link
Member

tillig commented Aug 7, 2020

Initial commit of the Autofac.Diagnostics.DotGraph support is complete. I'm going to close this issue, since we have the feature running and it'll be released when v6 is released. There are a couple of additional enhancements noted as issues over there and that will be in the works, too.

@tillig
Copy link
Member

tillig commented Sep 24, 2020

This package is ready to go when v6 releases. I'm going to close this issue since there's no more to be done here.

@tillig tillig closed this as completed Sep 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants