
Request for Swish Op #5853

Open
vera121 opened this issue Jan 11, 2024 · 7 comments · May be fixed by #5964
Labels: contributions welcome, operator (Issues related to ONNX operators)

Comments

@vera121

vera121 commented Jan 11, 2024

Swish/SiLU

Do you have any plans to implement the Swish Op in ONNX?

Describe the operator

Swish is a popular activation function. Its mathematical definition can be found at https://en.wikipedia.org/wiki/Swish_function

TensorFlow has https://www.tensorflow.org/api_docs/python/tf/nn/silu
Keras has https://keras.io/api/layers/activations/ (also in https://www.tensorflow.org/api_docs/python/tf/keras/activations/swish)

Pytorch has https://pytorch.org/docs/stable/generated/torch.nn.SiLU.html

Can this operator be constructed using existing onnx operators?

Yes, it can be implemented as a combination of the Mul and Sigmoid Ops:
x * Sigmoid(beta * x)
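
For illustration only (not part of the original request), here is a minimal sketch using onnx.helper that builds this composition out of existing ONNX nodes, with beta folded in as a Constant; the tensor names, beta value, and opset version are arbitrary assumptions.

```python
# Hypothetical sketch: Swish(x) = x * Sigmoid(beta * x) built from existing ONNX ops.
import onnx
from onnx import helper, TensorProto

beta = 1.0  # beta = 1.0 gives SiLU

nodes = [
    helper.make_node(
        "Constant", [], ["beta"],
        value=helper.make_tensor("beta_val", TensorProto.FLOAT, [], [beta]),
    ),
    helper.make_node("Mul", ["beta", "X"], ["bx"]),    # beta * x
    helper.make_node("Sigmoid", ["bx"], ["sig"]),      # Sigmoid(beta * x)
    helper.make_node("Mul", ["X", "sig"], ["Y"]),      # x * Sigmoid(beta * x)
]

graph = helper.make_graph(
    nodes, "swish",
    [helper.make_tensor_value_info("X", TensorProto.FLOAT, ["N"])],
    [helper.make_tensor_value_info("Y", TensorProto.FLOAT, ["N"])],
)
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 18)])
onnx.checker.check_model(model)
```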

Is this operator used by any model currently? Which one?

Yes. Modern YOLO-series models such as yolov5, yolov7, yolov8, and yolop, as well as EfficientNet, all contain Swish ops.

Yolov5: https://github.com/ultralytics/yolov5/blob/master/models/tf.py#L224

EfficientNet:
https://paperswithcode.com/method/efficientnet which has Swish in https://github.com/lukemelas/EfficientNet-PyTorch/blob/2eb7a7d264344ddf15d0a06ee99b0dca524c6a07/efficientnet_pytorch/model.py#L294

Are you willing to contribute it? (Y/N)

Possibly Yes.

Notes

@vera121 added the "operator" label (Issues related to ONNX operators) on Jan 11, 2024
@justinchuby
Contributor

Swish can be expressed as a combination of ONNX operators and can be easily fused by the backend. Was there a motivation for including it in the spec?

@vera121
Author

vera121 commented Jan 17, 2024

Hello @justinchuby
One big advantage of adding this op as a standalone function is that we can then treat it as an activation node directly. This is quite helpful for model structure analysis (e.g., we can view it in Netron), and it also helps with quantization design: if we have to quantize all the separate small ops, that usually causes extra rounding errors and accuracy loss.

I suppose the benefits of adding such a spec are similar to those of the existing activation functions in ONNX, like Gelu, HardSwish, HardSigmoid, Mish, etc.

@justinchuby
Contributor

justinchuby commented Jan 17, 2024

@gramalingam @xadupre for more thoughts. I suggest creating a pull request if the op is desired. Additionally, it is possible to implement it as a model local function. You may then be able to use it like you mentioned above.
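
As a purely illustrative sketch of the model-local-function route (assuming onnx.helper.make_function, with beta fixed to 1, i.e. SiLU, and an assumed local domain name "custom.local"), this could look roughly like:

```python
# Hedged sketch: register Swish as a model-local FunctionProto so tools see
# a single "Swish" node while backends can still inline or fuse it.
import onnx
from onnx import helper, TensorProto

swish_fn = helper.make_function(
    domain="custom.local",   # assumed local domain name
    fname="Swish",
    inputs=["X"],
    outputs=["Y"],
    nodes=[
        helper.make_node("Sigmoid", ["X"], ["sig"]),
        helper.make_node("Mul", ["X", "sig"], ["Y"]),  # beta fixed to 1 for brevity
    ],
    opset_imports=[helper.make_opsetid("", 18)],
)

graph = helper.make_graph(
    [helper.make_node("Swish", ["X"], ["Y"], domain="custom.local")],
    "uses_swish",
    [helper.make_tensor_value_info("X", TensorProto.FLOAT, ["N"])],
    [helper.make_tensor_value_info("Y", TensorProto.FLOAT, ["N"])],
)
model = helper.make_model(
    graph,
    functions=[swish_fn],
    opset_imports=[helper.make_opsetid("", 18),
                   helper.make_opsetid("custom.local", 1)],
)
onnx.checker.check_model(model)
```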

@justinchuby
Contributor

Additionally, if you would like to see it as a single unit in the exported PyTorch models by torch.onnx.dynamo_export, you are welcome to contribute to https://github.com/microsoft/onnxscript/blob/bec23adc815406e6103dff8463e3386a1be155e7/onnxscript/function_libs/torch_lib/ops/nn.py#L2014
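
As a non-authoritative sketch (the actual torch_lib code and its decorators may differ), a SiLU-style function in plain onnxscript could look like this:

```python
# Hypothetical onnxscript sketch of SiLU, i.e. Swish with beta = 1.
from onnxscript import FLOAT, script
from onnxscript import opset18 as op


@script()
def silu(X: FLOAT["N"]) -> FLOAT["N"]:
    # x * Sigmoid(x)
    return op.Mul(X, op.Sigmoid(X))
```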

@aernoudt

Swish can be expressed as a combination of ONNX operators and can be easily fused by the backend. Was there a motivation for including it in the spec?

@justinchuby, the same is true for other functions like Clip, HardSigmoid, HardSwish, etc.

What are the criteria for adding these as functions to the spec? According to AddNewOp.md, the operator needs to be implemented by at least one well-known framework, which is the case for Swish/SiLU. Is this not sufficient?

@justinchuby
Contributor

I understand ONNX is trying to keep the set of operators tight, although this should be less of a concern when an op can be expressed as a function, as in this case. I would bring this up in the operators SIG and let members of the SIG chime in.

@gramalingam
Contributor

gramalingam commented Jan 31, 2024

My personal opinion is that adding this as a function-op to the ONNX standard is reasonable: it can easily be supported by inline expansion (if no fused implementation is available) or by a specialized kernel. (We can discuss this in the operator SIG meeting.)
