Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ONNX] Reland: Update training state logic to support ScriptedModule #86745

Closed

Conversation

justinchuby
Copy link
Collaborator

@justinchuby justinchuby commented Oct 11, 2022

In #86325, it was reported that ScriptedModule do not have a training attribute and will fail export because we don't expect them as input.

Also

  • Parameterized the test_util_funs test

Thanks @borisfom for the suggestion!

Fixes #86325

@pytorch-bot
Copy link

pytorch-bot bot commented Oct 11, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/86745

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 9b44054:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the release notes: onnx torch.onnx related changes that should show up in the release notes label Oct 11, 2022
@justinchuby justinchuby added the module: onnx Related to torch.onnx label Oct 11, 2022
@justinchuby justinchuby mentioned this pull request Oct 11, 2022
28 tasks
@justinchuby justinchuby added this to the 1.13.0 milestone Oct 11, 2022
@justinchuby justinchuby added the topic: bug fixes topic category label Oct 11, 2022
@samdow samdow added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Oct 12, 2022
Copy link
Collaborator

@titaiwangms titaiwangms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think a frozen test case should be added?

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Oct 13, 2022
Copy link
Collaborator

@BowenBao BowenBao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, @justinchuby please add test case as TiTai suggested.

For my understanding by looking at related code, is it that training is not possible for ScriptModules?

@justinchuby
Copy link
Collaborator Author

Thanks for reviewing! Tests added.

@justinchuby
Copy link
Collaborator Author

@pytorchbot merge -g

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks on your PR pass since you used the green (-g) flag (ETA: 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot
Copy link
Collaborator

Merge failed

Reason: The following mandatory check(s) failed (Rule ONNX exporter):

Dig deeper by viewing the failures on hud

Details for Dev Infra team Raised by workflow job

@justinchuby
Copy link
Collaborator Author

@pytorchbot merge -g

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks on your PR pass since you used the green (-g) flag (ETA: 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot
Copy link
Collaborator

Merge failed

Reason: The following mandatory check(s) failed (Rule ONNX exporter):

Dig deeper by viewing the failures on hud

Details for Dev Infra team Raised by workflow job

@justinchuby
Copy link
Collaborator Author

@pytorchbot merge -f "Unrelated functorch errors; onnx tests passed"

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@justinchuby justinchuby deleted the justinchu/scripted-module branch October 14, 2022 01:52
@justinchuby
Copy link
Collaborator Author

For my understanding by looking at related code, is it that training is not possible for ScriptModules?

That is my current understanding

@janeyx99
Copy link
Contributor

janeyx99 commented Oct 14, 2022

@pytorchbot revert -m " https://hud.pytorch.org/pytorch/pytorch/commit/960b98128e475b15b66119f325232039799852cd broke onnx tests on trunk " -c weird

I could not interpret the failing logs on my phone, but this PR definitely introduced these regressions.

@pytorchmergebot
Copy link
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

@pytorchmergebot
Copy link
Collaborator

@justinchuby your PR has been successfully reverted.

@justinchuby
Copy link
Collaborator Author

@janeyx99 thanks for catching this. The failures are legit. Somehow they were green when I merged

@justinchuby justinchuby reopened this Oct 14, 2022
justinchuby and others added 6 commits October 14, 2022 16:02
In pytorch#86325, it was reported that ScriptedModule do not have a training attribute and will fail export because we don't expect them as input. We should also relax the type constraints in a following PR.

Fixes pytorch#86325
@justinchuby
Copy link
Collaborator Author

Turns out it's because the branch was not up to date with master. Rebased.

@justinchuby
Copy link
Collaborator Author

@pytorchbot merge -g

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks on your PR pass since you used the green (-g) flag (ETA: 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@justinchuby justinchuby changed the title [ONNX] Update training state logic to support ScriptedModule [ONNX] Reland: Update training state logic to support ScriptedModule Oct 14, 2022
atalman pushed a commit to atalman/pytorch that referenced this pull request Oct 21, 2022
…ytorch#86745)

In pytorch#86325, it was reported that ScriptedModule do not have a training attribute and will fail export because we don't expect them as input.

Also

- Parameterized the test_util_funs test

Thanks @borisfom for the suggestion!

Fixes pytorch#86325

Pull Request resolved: pytorch#86745
Approved by: https://github.com/AllenTiTaiWang, https://github.com/BowenBao
malfet pushed a commit that referenced this pull request Oct 21, 2022
…86745) (#87457)

In #86325, it was reported that ScriptedModule do not have a training attribute and will fail export because we don't expect them as input.

Also

- Parameterized the test_util_funs test

Thanks @borisfom for the suggestion!

Fixes #86325

Pull Request resolved: #86745
Approved by: https://github.com/AllenTiTaiWang, https://github.com/BowenBao

Co-authored-by: Justin Chu <justinchu@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ciflow/trunk Trigger trunk jobs on your pull request Merged module: onnx Related to torch.onnx open source release notes: onnx torch.onnx related changes that should show up in the release notes Reverted topic: bug fixes topic category triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ONNX] onnx.export fails on frozen top-level ScriptedModule
7 participants