Add trace functionality to the function to_torchscript #4142

NumesSanguis · 2020-10-14T10:11:56Z

What does this PR do?

Adds the option to use TorchScript's trace() method in addition to the default script().

(method) Adds the parameters method and example_inputs to support both modes. See the issue for the rationale. The default values ensure that, with no arguments provided, the original behaviour is kept.
(docs) Rewrites function description to match the extended capability
(tests) Adds a test case for the trace() function
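
A framework-free sketch of the dispatch described above, assuming the two modes are selected by a string method argument. export_sketch, script_fn, and trace_fn are hypothetical names; the stand-in callables take the place of torch.jit.script and torch.jit.trace so the snippet runs without PyTorch, and the merged implementation may differ in its details.

```python
from typing import Any, Callable, Optional


def export_sketch(
    module: Any,
    script_fn: Callable[[Any], Any],
    trace_fn: Callable[[Any, Any], Any],
    method: str = "script",
    example_inputs: Optional[Any] = None,
) -> Any:
    """Pick an export mode; with no extra arguments the script path is used."""
    if method == "script":
        return script_fn(module)
    if method == "trace":
        if example_inputs is None:
            # Fall back to the module's own example input, as in the diff.
            example_inputs = getattr(module, "example_input_array", None)
        if example_inputs is None:
            raise ValueError("method='trace' needs example inputs")
        return trace_fn(module, example_inputs)
    raise ValueError(f"unsupported method: {method!r}")


class DummyModule:
    # Hypothetical module that only carries an example input.
    example_input_array = (1.0, 2.0)


# Stand-ins that only record which path was taken.
def fake_script(module):
    return ("scripted", None)


def fake_trace(module, inputs):
    return ("traced", inputs)


default_result = export_sketch(DummyModule(), fake_script, fake_trace)
traced_result = export_sketch(DummyModule(), fake_script, fake_trace, method="trace")
```

Keeping "script" as the default is what preserves the original behaviour for callers that pass no arguments.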

Before submitting

Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
Did you read the contributor guideline, Pull Request section?
Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you to create a separate PR for every change.
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?
Did you verify new and existing tests pass locally with your changes?
If you made a notable change (that affects users), did you update the CHANGELOG?

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

Did you have fun?

Yes :)

Notes

The documentation does not state how to run the tests locally. I did however test the new functionality in my own project.
The CHANGELOG did not have an entry yet for v1.1, so to prevent conflicts, I have not updated this yet.

pep8speaks · 2020-10-14T10:12:01Z

Hello @NumesSanguis! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-10-14 10:18:36 UTC

justusschock

Looks fine to me. Could you maybe also extend the example in docstrings?

codecov · 2020-10-14T12:48:48Z

Codecov Report

Merging #4142 into master will decrease coverage by 0%.
The diff coverage is 89%.

@@          Coverage Diff           @@
##           master   #4142   +/-   ##
======================================
- Coverage      92%     92%   -0%     
======================================
  Files         103     103           
  Lines        7792    7798    +6     
======================================
+ Hits         7147    7152    +5     
- Misses        645     646    +1

awaelchli · 2020-10-14T13:37:03Z

pytorch_lightning/core/lightning.py

+                if example_inputs is None:
+                    example_inputs = self.example_input_array
+                # automatically send example inputs to the right device and use trace
+                torchscript_module = torch.jit.trace(func=self.eval(), example_inputs=example_inputs.to(self.device),


Here you assume that example_input_array is a tensor, but this is not true.
If forward takes *args, example_input_array is a tuple, and if forward takes **kwargs, example_input_array must be a dict.

Given your comment on the issue that .trace() accepts either a tuple or a torch.Tensor (which is automatically converted to a tuple), does that mean the input should be example_input_array: Optional[Union[torch.Tensor, Tuple[torch.Tensor]]]?

However, when the forward function accepts **kwargs, self.example_input_array could be a dict, in which case .trace(example_inputs=example_inputs) will fail?

What would be the best way to approach this? Does this mean that .trace() cannot be used if forward expects a dict?

import torch
import torch.nn as nn


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 1, 3)

    def forward(self, x):
        return self.conv(x)


class Net2(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 1, 3)

    def forward(self, x, y):
        return self.conv(x)


class Net3(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 1, 3)

    def forward(self, x, y):
        return self.conv(x)


# SINGLE INPUT
net = Net()
ex_inp = torch.rand(1, 1, 3, 3)
torch.jit.trace(net, ex_inp)

# TWO INPUTS
net = Net2()
torch.jit.trace(net, (ex_inp, ex_inp))

# DICT (**kwargs)
# fails
# net = Net3()
# torch.jit.trace(net, dict(x=ex_inp, y=ex_inp))

Here is an example. Tracing supports a single input and a tuple, which gets unrolled into multiple positional args. In these two cases, you can use the Lightning self.example_input_array. However, a dict will not be passed as kwargs; it is passed as a single input instead. In Lightning, however, a dict would mean **kwargs.
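
The calling convention described here can be mimicked in plain Python. This is only an illustration of the behaviour as stated in this comment, not how torch.jit.trace is implemented; call_like_trace, forward_one, and forward_two are hypothetical names.

```python
from typing import Any, Callable


def call_like_trace(fn: Callable[..., Any], example_inputs: Any) -> Any:
    """Tuples are unpacked positionally; anything else, including a dict,
    is passed as one positional argument."""
    if not isinstance(example_inputs, tuple):
        example_inputs = (example_inputs,)
    return fn(*example_inputs)


def forward_one(x):
    return x


def forward_two(x, y):
    return x + y


single = call_like_trace(forward_one, 3)
pair = call_like_trace(forward_two, (3, 4))

# A dict is NOT expanded into keyword arguments, so a two-argument
# forward receives only one positional argument and the call fails.
try:
    call_like_trace(forward_two, {"x": 3, "y": 4})
    dict_failed = False
except TypeError:
    dict_failed = True
```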

I see several ways to handle it:

1. Leave as is; the user needs to know how self.example_input_array works.
2. Raise an error when self.example_input_array is a dict.
3. Do not use self.example_input_array at all, and require the user to pass inputs to the method directly.

Then there is a second issue. You should use pytorch_lightning.utilities.apply_func.move_data_to_device to move the example input to the device, since it could be a tuple and a tuple has no .to() method.
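
To show why a plain .to(device) call is not enough for a tuple, here is a simplified stand-in for a recursive device move. It is not Lightning's move_data_to_device; FakeTensor and move_nested are hypothetical names used so the sketch runs without PyTorch.

```python
from typing import Any


class FakeTensor:
    """Stand-in for a tensor: only tracks which device it lives on."""

    def __init__(self, device: str = "cpu") -> None:
        self.device = device

    def to(self, device: str) -> "FakeTensor":
        return FakeTensor(device)


def move_nested(batch: Any, device: str) -> Any:
    """Recursively move every item that has a `.to` method."""
    if isinstance(batch, tuple):
        return tuple(move_nested(item, device) for item in batch)
    if isinstance(batch, list):
        return [move_nested(item, device) for item in batch]
    if isinstance(batch, dict):
        return {key: move_nested(value, device) for key, value in batch.items()}
    if hasattr(batch, "to"):
        return batch.to(device)
    return batch  # leave plain Python values untouched


example = (FakeTensor(), FakeTensor())

# example.to("cuda:0") would raise AttributeError: tuples have no `.to`.
moved = move_nested(example, "cuda:0")
devices = [item.device for item in moved]
```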

cc @ananthsub

1 & 2 could be combined by raising a warning instead of an error. From PL's side throw a warning similar to:

self.example_input_array cannot be a dict. Please provide a sample Tensor/Tuple to example_inputs as argument, or set self.example_input_array to a Tensor/Tuple.

Then output the actual error produced by .trace().
If in the future .trace() would be updated to support a dict, there is no need for a change (except removing the warning) on PL's side.
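
A minimal sketch of this warn-then-delegate idea, assuming the warning is a plain UserWarning. trace_with_warning and strict_tracer are hypothetical names; the stand-in tracer only imitates a tracer that rejects dict inputs.

```python
import warnings
from typing import Any, Callable


def trace_with_warning(trace_fn: Callable[[Any], Any], example_inputs: Any) -> Any:
    """Warn when the inputs are a dict, then let the tracer raise its own error."""
    if isinstance(example_inputs, dict):
        warnings.warn(
            "self.example_input_array cannot be a dict. Please provide a sample "
            "Tensor/Tuple to example_inputs as argument, or set "
            "self.example_input_array to a Tensor/Tuple.",
            UserWarning,
        )
    # No early return: if the tracer later supports dicts, only the warning goes.
    return trace_fn(example_inputs)


def strict_tracer(inputs: Any) -> str:
    # Stand-in for a tracer that rejects dict inputs.
    if isinstance(inputs, dict):
        raise RuntimeError("dict inputs are not supported")
    return "traced"


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    try:
        trace_with_warning(strict_tracer, {"x": 1})
        error_message = None
    except RuntimeError as err:
        error_message = str(err)
```

Because the function never blocks the call itself, the user sees both the hint and the tracer's actual error.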

For me, PL is about removing boilerplate code. Since self.example_input_array already exists in PL, it's better to use it. Therefore, I would advise against option 3.
I haven't used self.example_input_array personally yet, but in how many projects would this be a dict?

yes, makes sense.
Would you like to follow up on this with a PR? I would greatly appreciate it. For me the main concern is to properly move the input to the device with the function I referenced. For the way inputs are passed in, I don't have a strong opinion.

IDK how much future support there will be for tracing vs scripting (scripting is strongly recommended). Rather than adding more trace support at the top level of the PL module, why not override to_torchscript in your Lightning module to determine how you want to export? Then you have way more flexibility with tracing.

@awaelchli Ok, I'll follow up with another pull request using the move_data_to_device function.

@ananthsub Edit: I moved my comment to the feature request, as it is a more relevant place for this discussion: #4140

@awaelchli I addressed your issues in a follow-up pull request (could not be added to this one due to it already being merged):
#4360

NumesSanguis · 2020-10-26T05:48:04Z

Follow-up pull request can be found here: #4360

NumesSanguis added 2 commits October 14, 2020 18:55

Add trace functionality to the function to_torchscript

fb595fe

used wrong parameter name in test

7a539d1

mergify bot requested a review from a team October 14, 2020 10:12

fix indentation to confirm to code style

641cc1b

justusschock approved these changes Oct 14, 2020

mergify bot requested a review from a team October 14, 2020 12:00

williamFalcon merged commit fa737a5 into Lightning-AI:master Oct 14, 2020

awaelchli reviewed Oct 14, 2020

mergify bot requested a review from a team October 14, 2020 13:37

NumesSanguis mentioned this pull request Oct 15, 2020

Expand to_torchscript to support also TorchScript's trace method #4140

Closed

Borda added this to the 1.0.x milestone Oct 20, 2020

NumesSanguis mentioned this pull request Oct 26, 2020

move example inputs to correct device when tracing module #4360

Merged

NumesSanguis mentioned this pull request Nov 6, 2020

update changelog after 1.0.5 #4505

Merged

NumesSanguis commented Oct 14, 2020 •

edited by Borda

pep8speaks commented Oct 14, 2020 •

edited

justusschock left a comment

codecov bot commented Oct 14, 2020

awaelchli Oct 14, 2020

NumesSanguis Oct 15, 2020 •

edited

awaelchli Oct 17, 2020 •

edited

awaelchli Oct 17, 2020

NumesSanguis Oct 19, 2020

awaelchli Oct 24, 2020

ananthsub Oct 24, 2020

NumesSanguis Oct 26, 2020 •

edited

NumesSanguis Oct 26, 2020

NumesSanguis commented Oct 26, 2020
