Add input information to fusion definitions for trace inspection and debugging #388
Conversation
That's a really interesting idea, and it would be a very useful addition to query the last used inputs for a specific fusion region. Since this is an nvFuser-executor-specific change, I don't think Thunder is the right place to put these queries.
How would you generalize it to all other execution functions and regions?
Thunder allows getting the `nvfuser.FusionDefinition` object with `execution_trace.python_ctx()["nvFusion0"].last_used`, so maybe this particular code should be a method of `nvfuser.FusionDefinition`?
```python
def __call__(self, *args):
    fd = self.get_fd(to_descriptors(args))
    self.last_used = fd
    self.last_inputs_meta = [(i.size(), i.stride(), i.dtype, i.device) for i in args if isinstance(i, torch.Tensor)]
```
What's the average size of args in GPT models? How much overhead does this line add to the execution?
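To get a feel for the overhead question, here is a rough sketch (my own, not from the PR) that times the metadata-capturing comprehension with `timeit`; the tensor count and shapes are made up and only stand in for a typical fusion's argument list:

```python
import timeit
import torch

# Hypothetical argument list; real fusions may pass more or fewer tensors.
args = [torch.empty(2, 8, 64) for _ in range(16)]

def capture_meta(args):
    # Same comprehension as in FusionDefinitionWrapper.__call__ above.
    return [
        (t.size(), t.stride(), t.dtype, t.device)
        for t in args
        if isinstance(t, torch.Tensor)
    ]

per_call = timeit.timeit(lambda: capture_meta(args), number=10_000) / 10_000
print(f"~{per_call * 1e6:.1f} us per call for {len(args)} tensors")
```

The calls here only read cached tensor attributes, so the cost should scale linearly with the number of tensor arguments rather than with their sizes.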
```python
# @@ -413,6 +415,31 @@ def __call__(self, *args):
def __repr__(self):
    return f"FusionDefinitionWrapper({self.name})"

def last_inputs(self) -> str:
```
plus @kevinstephano to consider if nvfuser itself would like to expose something like this. I think we discussed this briefly before, but maybe we could query nvFuser's Python API for something like "last reproduction" and that would include this information? Maybe the reproduction could be structured so it could be a Python string + contain structured information about the inputs?
For `last_inputs` here, I think we should consider saving non-tensor inputs directly and only saving the metadata of tensor inputs. Non-tensor inputs can only be numbers or sequences of numbers, right? It'd be nice if `last_inputs` returned a valid set of inputs to the fusion, even creating the requested tensors based on the metadata. We could also rename this function to something like `print_last_inputs`.
I'm OK with this logic being in Thunder for now if it's helpful, even if we decide to expose it through nvfuser. When the nvfuser exposure is available we can remove this. Let's just add a warning when it's used that it's experimental and may change in the future.
How does that sound, @riccardofelluga? @IvanYashchuk? @kevinstephano?
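The suggestion above about recreating the requested tensors from saved metadata could be sketched like this; `inputs_from_meta` is a hypothetical helper name, assuming metadata tuples of `(size, stride, dtype, device)` as captured in this PR:

```python
import torch

def inputs_from_meta(meta):
    # empty_strided allocates uninitialized tensors whose shape, stride,
    # dtype, and device match the recorded metadata; the values are
    # arbitrary, which is often acceptable for re-running a fusion.
    return [
        torch.empty_strided(size, stride, dtype=dtype, device=device)
        for size, stride, dtype, device in meta
    ]

# Hypothetical metadata, as FusionDefinitionWrapper.__call__ would record it.
meta = [(torch.Size([4, 8]), (8, 1), torch.float32, torch.device("cpu"))]
tensors = inputs_from_meta(meta)
```

A real implementation might additionally fill the tensors with random values of the right dtype when exact contents matter (e.g. for debugging numerics rather than shapes).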
What does this PR do?
Fixes #387.
This PR adds information about the inputs of a fusion definition so that it can be retrieved by inspecting the trace. It is also preparation for #205, where a tutorial on how to read this information will be published.
Quickly, from a trace:
will print something like:
I'm open to changing the `str` output to returning a list of tensors; however, for debugging it's usually enough to have a string.