
Specifies scalar values cast to match input type. #678

Open
philloooo opened this issue May 6, 2024 · 6 comments · May be fixed by #647

Comments

@philloooo
Contributor

philloooo commented May 6, 2024

Some of the operations have scalar parameters, e.g. linear's and gemm's alpha and beta.
These are passed as float numbers. I think we should specify in the processing algorithm that they get cast to match the input type, so that if the input is float16, they also get downcast to float16.

Background is: in some cases, CoreML requires the parameter types to match. So we either downcast the scalar params to float16, or upcast the input operands to float32; the latter is less ideal because on CoreML only float16 gets executed on the NPU.

Any concerns with adding this step in the algorithm?
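A minimal JavaScript sketch (hypothetical, not WebNN API code) of what such a cast step implies for precision: developer-supplied scalars are float64 JS numbers, and `Math.fround` shows the value change from a cast to float32 (a float16 cast would round even more coarsely):

```javascript
// Developer-supplied scalar options arrive as JS numbers (float64).
// Math.fround rounds a double to the nearest float32, mirroring the
// effect of casting a scalar option to a float32 input's dataType.
const alpha = 0.2;                    // as passed by the developer
const alphaF32 = Math.fround(alpha);  // value after a float32 cast
console.log(alpha === alphaF32);      // false: 0.2 isn't exact in float32
```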

@inexorabletash
Contributor

Relevant:

@inexorabletash
Contributor

Here's one way this could be specified, using linear() as an example.

  1. ...
  2. Let alpha be options.alpha, cast to input's dataType.
  3. Let beta be options.beta, cast to input's dataType.
  4. ...
  1. ...
  2. Let operator be an operator for the linear operation, given alpha and beta.
  3. ...

And then add a definition of cast like:

To cast a number number to a given MLOperandDataType dataType, perform the following steps:

  1. Switch on dataType
    ↪ "float32"
    Return the result of converting a JavaScript value to an IDL unrestricted float value with number.
    ↪ "float16"
    TODO
    ↪ "int32"
    Return the result of converting a JavaScript value to an IDL long value with number.
    ↪ "uint32"
    Return the result of converting a JavaScript value to an IDL unsigned long value with number.
    ↪ "int64"
    Return the result of converting a JavaScript value to an IDL long long value with number.
    ↪ "uint64"
    Return the result of converting a JavaScript value to an IDL unsigned long long value with number.
    ↪ "int8"
    Return the result of converting a JavaScript value to an IDL byte value with number.
    ↪ "uint8"
    Return the result of converting a JavaScript value to an IDL octet value with number.

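The switch above could be sketched in JavaScript as follows. `castScalar` is a hypothetical helper name; the integer branches use the wrapping (modulo) behavior of the corresponding IDL conversions without [EnforceRange] or [Clamp], and float16 is omitted since the spec text marks it TODO:

```javascript
// Hypothetical sketch of the proposed "cast" definition. Integer cases
// use C-style wrapping, matching default IDL conversion semantics.
function castScalar(number, dataType) {
  switch (dataType) {
    case "float32": return Math.fround(number);  // IDL unrestricted float
    case "int32":   return number | 0;           // IDL long (ToInt32)
    case "uint32":  return number >>> 0;         // IDL unsigned long (ToUint32)
    // Note: numbers above 2**53 have already lost precision as doubles.
    case "int64":   return BigInt.asIntN(64, BigInt(Math.trunc(number)));
    case "uint64":  return BigInt.asUintN(64, BigInt(Math.trunc(number)));
    case "int8":    return (number << 24) >> 24; // IDL byte, sign-extended
    case "uint8":   return number & 0xff;        // IDL octet
    default: throw new TypeError(`unsupported dataType: ${dataType}`);
  }
}

castScalar(-1, "uint8");  // 255 (wraps modulo 256)
castScalar(200, "int8");  // -56 (wraps into [-128, 127])
```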
With caveats:

  • The input to this only makes sense if it's a (unrestricted) double, which is equivalent to the JavaScript number type. See Introduce MLNumber for specifying numeric inputs of any type #647 for discussions around BigInt support, which is necessary for full-precision int64 inputs.
  • IDL doesn't have a half/float16 type (yet?) - we can copy/paste/tweak, reference https://tc39.es/proposal-float16array, etc
  • This intentionally gives the conversion the same result as if we had explicit IDL for the types, for example:
dictionary MLLinearFloat32Options {
  float alpha = 1;
  float beta = 1;
};
dictionary MLLinearFloat16Options {
  half alpha = 1; // this type doesn't actually exist
  half beta = 1; // this type doesn't actually exist
};
partial interface MLGraphBuilder {
  MLOperand linearFloat32(MLOperand input, optional MLLinearFloat32Options options = {});
  MLOperand linearFloat16(MLOperand input, optional MLLinearFloat16Options options = {});
};

... which is probably more relevant for integer types, because IDL and JS have well-defined, if sometimes surprising, behavior here. See #489 for discussion about casting within backends, but at least at the IDL/interface level we should behave predictably.
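A quick illustration of that "well-defined if sometimes surprising" integer behavior: conversion to octet (uint8) wraps modulo 256, exactly as a typed-array store does.

```javascript
// Storing out-of-range values into a Uint8Array applies the same
// ToUint8 wrapping as the default IDL octet conversion.
const bytes = new Uint8Array(2);
bytes[0] = -1;   // wraps to 255
bytes[1] = 300;  // wraps to 44
console.log(bytes[0], bytes[1]); // 255 44
```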

@inexorabletash
Contributor

More notes:

inexorabletash added a commit to inexorabletash/webnn that referenced this issue May 8, 2024
@inexorabletash
Contributor

Also, what conversion options would we use for these? Specifically:

  • For casts to a floating point type, we can make this restricted (disallow +Infinity/-Infinity/NaN) or allow them
  • For casts to an integral type, we can make them act like [EnforceRange] (e.g. -1 cast to uint8 throws), [Clamp] (e.g. -1 cast to uint8 yields 0), or neither (C-style modulus e.g. -1 cast to uint8 yields 255)

This sounds like #489, but that issue is specifically about the casting operator, which might depend on the underlying platform. This issue is talking about the JS interface, which is going to be implemented by the user agent.
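The three candidate behaviors could be sketched in JavaScript like this, using a cast of -1 to uint8 as the example. The helper name and mode strings are hypothetical, not from the spec; the clamp branch approximates IDL [Clamp], which technically rounds ties to even:

```javascript
// Three possible conversion behaviors for casting a number to uint8.
function castUint8(value, mode) {
  if (mode === "enforce-range") {        // like [EnforceRange]: throw
    if (value < 0 || value > 255) throw new TypeError("out of range");
    return Math.trunc(value);
  }
  if (mode === "clamp") {                // like [Clamp]: saturate
    return Math.min(255, Math.max(0, Math.round(value)));
  }
  return value & 0xff;                   // default: C-style modulo wrapping
}

castUint8(-1, "clamp");   // 0
castUint8(-1, "modulo");  // 255
// castUint8(-1, "enforce-range") throws a TypeError
```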

inexorabletash added a commit to inexorabletash/webnn that referenced this issue May 8, 2024
- Introduce a "cast" definition that takes a number and a type, and returns the number cast to that type.

- Invoke cast during MLOperand and MLActivation creation.

TODO:

- Passing restrictions
  - Floating point - allow Infinities/NaNs or not?
  - Integer - throw or clamp if out of range?
- Simplify supported restrictions
- resample2d sizes option - is this part of the op data or not?
@inexorabletash
Contributor

And then a further note for consideration: if we're being explicit about casting everywhere, are there any places that are currently float (that we're considering making double, see #325) that could be MLNumber as well, so a developer could pass either 123456789123456789 or 123456789123456789n?

This is fairly moot, as all of the relevant ops accept only floating point inputs, so whatever the developer supplies will ultimately be cast to a float32 or float16; specifying the input as double rather than MLNumber doesn't lose detail, except that an exception will be thrown if 123456789123456789n is passed.
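A small JavaScript demonstration of why BigInt matters for full-precision int64 values: doubles can only represent integers exactly up to 2**53, so the example value above already loses precision as a number.

```javascript
// Doubles hold integers exactly only up to Number.MAX_SAFE_INTEGER (2**53 - 1).
const big = 123456789123456789n;   // the BigInt from the example above
const asDouble = Number(big);      // rounds to the nearest double
console.log(BigInt(asDouble) === big); // false: precision was lost
```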

@inexorabletash
Contributor

inexorabletash commented May 9, 2024

Unsorted notes from internal discussion w/ @a-sully and @philloooo:

For batchNorm's epsilon.

  • WebNN specifies that the input may be float16 or float32
  • in DML, epsilon is always a float32
  • in CoreML, epsilon can be float16 or float32
  • TFLite doesn't natively support batchNorm, so it would be polyfilled (and then we'd need to reason about the data types in more detail)

And:

  • DML is basically float32 everywhere (elu's alpha, gemm's alpha/beta, hardSigmoid, etc.); if alpha is float32 but the input's data type is float16, what ends up happening?
  • In iOS 15, CoreML required these parameters to have the same dtype as the input; only in iOS 17 was this restriction relaxed for many operators

So overall, the "cast parameters to the same data type as the input tensors" approach may be okay, and is likely the least surprising for developers even if the result gets upcast (e.g. fp16 to fp32) again, but there's a lot of nuance to investigate.

inexorabletash added a commit to inexorabletash/webnn that referenced this issue May 14, 2024