Add Swish operator #5964
base: main
Conversation
Would appreciate some guidance on adding tests to …
Codecov Report

Attention: Patch coverage is …

@@            Coverage Diff             @@
##             main    #5964      +/-   ##
==========================================
+ Coverage   56.95%   57.01%   +0.06%
==========================================
  Files         506      507       +1
  Lines       30467    30951     +484
  Branches     4592     4593       +1
==========================================
+ Hits        17353    17648     +295
- Misses      12285    12478     +193
+ Partials      829      825       -4
.SetDoc(SiLU_ver21_doc)
.Input(0, "X", "Input tensor", "T", OpSchema::Single, true, 1, OpSchema::Differentiable)
.Output(0, "Y", "Output tensor", "T", OpSchema::Single, true, 1, OpSchema::Differentiable)
.TypeConstraint(
beta is missing?
If we are using SiLU, I'm assuming beta is 1.0? I can add it, though.
I notice HardSwish uses alpha and beta in the form alpha*x + beta?
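For readers comparing the two activations, here is a minimal NumPy sketch (my own addition, not from the PR) of the formulas under discussion; the HardSwish defaults alpha = 1/6 and beta = 0.5 follow the ONNX HardSwish spec:

import numpy as np

def swish(x, beta=1.0):
    # Swish / SiLU: x * sigmoid(beta * x); beta = 1.0 recovers SiLU.
    return x / (1.0 + np.exp(-beta * x))

def hardswish(x, alpha=1.0 / 6.0, beta=0.5):
    # HardSwish: x * HardSigmoid(x) = x * clip(alpha * x + beta, 0, 1).
    return x * np.clip(alpha * x + beta, 0.0, 1.0)

x = np.linspace(-4.0, 4.0, 9)
print(swish(x))      # smooth logistic gating
print(hardswish(x))  # piecewise-linear approximation of the same shape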
Based on #5964 (comment) I will add beta and rename the operator to Swish. Thanks for reviewing this!
onnx/defs/math/defs.cc (outdated)
@@ -644,6 +644,25 @@ ONNX_OPERATOR_SET_SCHEMA(
        .SetContextDependentFunctionBodyBuilder(BuildContextDependentFunctionBodyGelu)
        .TypeAndShapeInferenceFunction(propagateShapeAndTypeFromFirstInput));

static const char* SiLU_ver21_doc = R"DOC(
Sigmoid Linear Unit (SiLU), also known as the Swish function, takes one input data (Tensor<T>)
Since we have HardSwish, would it be more uniform to call this function-op Swish? Just a minor matter, just wondering.
Good idea! I will update this to Swish and add the beta parameter.
Force-pushed from 46a4ce8 to e38a91a.
onnx/defs/math/defs.cc (outdated)
.FunctionBody(
    R"ONNX(
      {
        S_X = Sigmoid<beta = 1.0>(X)
Sigmoid doesn't have a beta attribute. We should be doing something like below:
Beta = Constant <value_float: float = @beta>()
BetaCast = CastLike (Beta, X)
BetaX = Mul (BetaCast, X)
SigmoidBetaX = Sigmoid(BetaX)
Y = Mul (X, SigmoidBetaX)
However, as commented above, it seems to me we should be calling this alpha instead of beta, based on the convention in HardSwish.
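As a sanity check on the suggested expansion, here is a small Python sketch (my own, not part of the PR) that builds the equivalent graph with onnx.helper and compares it against NumPy via the reference evaluator; the opset 22 import and the beta attribute name are assumptions carried over from the snippet above:

import numpy as np
from onnx import TensorProto, helper
from onnx.reference import ReferenceEvaluator

beta = 1.0
nodes = [
    # Mirrors the proposed function body: Y = X * Sigmoid(beta * X).
    helper.make_node("Constant", [], ["Beta"], value_float=beta),
    helper.make_node("CastLike", ["Beta", "X"], ["BetaCast"]),
    helper.make_node("Mul", ["BetaCast", "X"], ["BetaX"]),
    helper.make_node("Sigmoid", ["BetaX"], ["SigmoidBetaX"]),
    helper.make_node("Mul", ["X", "SigmoidBetaX"], ["Y"]),
]
graph = helper.make_graph(
    nodes,
    "swish_expansion",
    [helper.make_tensor_value_info("X", TensorProto.FLOAT, [None])],
    [helper.make_tensor_value_info("Y", TensorProto.FLOAT, [None])],
)
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 22)])

x = np.random.randn(8).astype(np.float32)
(y,) = ReferenceEvaluator(model).run(None, {"X": x})
np.testing.assert_allclose(y, x / (1.0 + np.exp(-beta * x)), rtol=1e-6)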
onnx/defs/math/defs.cc (outdated)
ONNX_OPERATOR_SET_SCHEMA(
    Swish,
    21,
This will now need to be 22
@@ -1198,6 +1198,9 @@ def test_Sigmoid(self) -> None:
    def test_Sign(self) -> None:
        self._test_op_upgrade("Sign", 9)

    def test_Swish(self) -> None:
        self._test_op_upgrade("Swish", 21)
Now 22
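Once the updated schema lands, the registered version can be checked directly; a hypothetical usage sketch (the "Swish" lookup only resolves after this PR is merged and built):

import onnx.defs

schema = onnx.defs.get_schema("Swish")
print(schema.since_version)  # expected: 22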
Force-pushed from f737693 to 2633a85.
Thanks for the review!! One follow-up question: when I run both commands after updating the operator to 22, there are ~20 …
Force-pushed from 75ae9df to b0ceb4d.
Force-pushed from a29a276 to 66233c4.
Signed-off-by: isdanni <leedanni@gmail.com>
Description
These changes have been made to support the SiLU/Swish operator as a function op.
Motivation and Context
Closes #5853
The SiLU function can be defined as $x \cdot \mathrm{Sigmoid}(\beta \cdot x)$, where $\beta = 1.0$.
The Swish op is used in the YOLO series and EfficientNet.
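For completeness, the derivative that justifies marking X and Y as Differentiable in the schema follows from the product rule (my own addition, writing $\sigma$ for the logistic sigmoid):

$$\frac{d}{dx}\left[x \cdot \sigma(\beta x)\right] = \sigma(\beta x) + \beta x \, \sigma(\beta x)\left(1 - \sigma(\beta x)\right)$$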