Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add input information to fusion definitions for trace inspection and debugging #388

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

riccardofelluga
Copy link
Collaborator

Before submitting
  • Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
  • [x ] Did you read the contributor guideline, Pull Request section?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?

What does this PR do?

Fixes #387.
This PR adds information about the inputs for a fusion definition such that it can be retrieved by inspecting the trace. A tutorial on how to read this information will be published as part of #205.

Also this PR is in preparation for #205

Quickly, from a trace:

trace = thunder.last_traces(fn)[-1]
trace_ctx = trace.python_ctx()
print(trace_ctx['nvFusion0'].last_inputs())

will print something like:

inputs = [
    torch.randn((2048,), dtype=torch.float32, device='cuda:0').as_strided((1, 2048), (2048, 1)),
    torch.randn((4096,), dtype=torch.bfloat16, device='cuda:0').as_strided((1, 2048, 4096), (4096, 0, 1))
]

I'm open to change the str output to returning a list of tensors, however for debugging it's usually enough to have a string.

Copy link
Collaborator

@IvanYashchuk IvanYashchuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a really interesting idea and it would be a very useful addition to query the last used inputs for a specific fusion region. Since this an nvFuser executor specific change I don't think Thunder is the right place to put these queries.

How would you generalize it to all other execution functions and regions?

Thunder allows getting the nvfuser.FusionDefinition object with execution_trace.python_ctx()["nvFusion0"].last_used and maybe this particular code should be a method of nvfuser.FusionDefinition?


def __call__(self, *args):
fd = self.get_fd(to_descriptors(args))
self.last_used = fd
self.last_inputs_meta = [(i.size(), i.stride(), i.dtype, i.device) for i in args if isinstance(i, torch.Tensor)]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the average size of args in GPT models? How much overhead does this line add to the execution?

@@ -413,6 +415,31 @@ def __call__(self, *args):
def __repr__(self):
return f"FusionDefinitionWrapper({self.name})"

def last_inputs(self) -> str:
Copy link
Collaborator

@mruberry mruberry May 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plus @kevinstephano to consider if nvfuser itself would like to expose something like this. I think we discussed this briefly before, but maybe we could query nvFuser's Python API for something like "last reproduction" and that would include this information? Maybe the reproduction could be structured so it could be a Python string + contain structured information about the inputs?

For last_inputs here, I think we should consider just saving non-tensor inputs and only saving the metadata of tensor inputs. Non-tensor inputs can only be numbers or sequences of numbers, right? It'd be nice if last_inputs would return a valid set of inputs to the fusion, even creating the requested tensors based on the metadata. We could also rename this function to something like print_last_inputs.

I'm OK with this logic being in Thunder for now if it's helpful, even if we decide to expose it through nvfuser. When the nvfuser exposure is available we can remove this. Let's just add a warning when it's used that it's experimental and may change in the future.

How does that sound, @riccardofelluga? @IvanYashchuk? @kevinstephano?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add more nvFuser debug information
3 participants