Core generic tensor ops #3024
gramalingam started this conversation in Ideas
Background and Motivation:
One of the challenges faced in a standard like ONNX is the tradeoff between expressiveness and efficiency. This often manifests itself as a choice between generic (or low-level) ops and specialized (or high-level) ops. Generic ops (for example, Scan in ONNX) can be composed to express a far wider range of models, giving greater expressiveness. Specialized ops (for example, RNN, GRU, or LSTM), by contrast, can be implemented very efficiently by taking advantage of their specific structure.
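To make the generic style concrete, here is a minimal sketch of a running sum expressed with Scan, using the onnx Python helper API and the opset-9 form of Scan. The per-step computation travels as a body graph; swapping in a richer body is how Scan can express RNN-like recurrences:

```python
import onnx
from onnx import helper, TensorProto

# Per-step body: (state, x_t) -> (state + x_t), all scalars.
state_in = helper.make_tensor_value_info("state", TensorProto.FLOAT, [])
x_t = helper.make_tensor_value_info("x_t", TensorProto.FLOAT, [])
state_out = helper.make_tensor_value_info("new_state", TensorProto.FLOAT, [])
body = helper.make_graph(
    [helper.make_node("Add", ["state", "x_t"], ["new_state"])],
    "running_sum_body",
    [state_in, x_t],
    [state_out],
)

# Scan threads the state through X (shape [T]) one element at a time;
# "final" is the sum of all elements. A body graph computing gate
# updates instead of Add is how an RNN/GRU/LSTM cell would be expressed.
running_sum = helper.make_node(
    "Scan",
    inputs=["init", "X"],
    outputs=["final"],
    body=body,
    num_scan_inputs=1,
)
```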
ONNX addresses this by supporting both kinds of ops and linking them through the concept of functions, which define specialized ops in terms of other, lower-level or more generic, ops. A backend that has a very efficient implementation of a specialized op can exploit it directly, while other backends can rewrite the specialized op in terms of the generic ops and use their implementations as a fallback or default.
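As an illustration of the function mechanism, here is a minimal sketch using the onnx Python helper API. The domain name is hypothetical, and Softplus is chosen only as a familiar example of a specialized op whose default body can be built from generic primitives:

```python
from onnx import helper, TensorProto

# Softplus(X) = Log(Exp(X) + 1), expressed as an ONNX function.
# A backend with a fast native Softplus can match the function call;
# any other backend can inline this body as the default implementation.
softplus = helper.make_function(
    domain="custom.example",  # hypothetical domain, for illustration only
    fname="Softplus",
    inputs=["X"],
    outputs=["Y"],
    nodes=[
        helper.make_node(
            "Constant", [], ["one"],
            value=helper.make_tensor("one", TensorProto.FLOAT, [], [1.0]),
        ),
        helper.make_node("Exp", ["X"], ["exp_x"]),
        helper.make_node("Add", ["exp_x", "one"], ["exp_x_plus_one"]),
        helper.make_node("Log", ["exp_x_plus_one"], ["Y"]),
    ],
    opset_imports=[helper.make_opsetid("", 13)],
)
```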
The MLIR project takes a similar approach, organizing ops into different layers (or dialects) and using lowerings from one dialect to another.
Proposal
It would be beneficial to add a few generic tensor ops that allow us to achieve the above goals effectively. The suggested ops are:
These will be complemented by scalar ops that operate only on scalar values and return scalar values (such as Tanh, Log, addition, and subtraction). All of the above tensor ops take the form of higher-order ops that accept a scalar computation (a graph) as an attribute, as sketched below.
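For concreteness, here is a minimal sketch of this higher-order pattern, again using the onnx Python helper API. Both the op name ElementwiseMap and the domain are hypothetical stand-ins for one of the proposed generic ops; the point is that the scalar computation travels as a graph attribute, just as the body graphs of Scan and Loop do today:

```python
from onnx import helper, TensorProto

# The scalar computation is an ordinary ONNX subgraph mapping a scalar
# input to a scalar output.
x = helper.make_tensor_value_info("x", TensorProto.FLOAT, [])
y = helper.make_tensor_value_info("y", TensorProto.FLOAT, [])
scalar_tanh = helper.make_graph(
    [helper.make_node("Tanh", ["x"], ["y"])],
    "scalar_tanh",
    [x],
    [y],
)

# A hypothetical generic tensor op that applies scalar_tanh to every
# element of X. Passing the scalar graph as an attribute is what makes
# the tensor op higher-order.
node = helper.make_node(
    "ElementwiseMap",          # hypothetical op name
    inputs=["X"],
    outputs=["Y"],
    domain="custom.example",   # hypothetical domain
    body=scalar_tanh,
)
```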
In MLIR terminology, we can think of the above as two dialects: a scalar dialect (effectively at the LLVM level) and a tensor dialect. The tensor ops suggested above are slightly higher-level than the Generic op in MLIR's Linalg dialect, which is itself inspired by the common forms of loops encountered in this setting (such as parallel loops and reduction loops).
The idea behind these core generic tensor ops is to provide greater extensibility and expressiveness without compromising efficiency. These ops capture the key attributes that are exploited in an efficient implementation, while parameterizing over aspects that are less critical for such optimizations.