
Introduce MLNumber for specifying numeric inputs of any type #647

Draft: wants to merge 27 commits into main
Conversation

@inexorabletash (Contributor) commented Apr 18, 2024

For some MLGraphBuilder methods the type of a numeric input can vary - e.g. for constant() an explicit MLOperandDataType is provided; for clamp() and pad() the data type is implied by input operands. In these cases, specifying the numeric value as either a float/double or int64 type runs into accuracy or range issues - you can't accurately represent all int64 values as a double, and you can't represent the full range of floats as int64. (You also can't represent all int64 values as a long long either - over 2^53 things get weird. But that's a digression.)



Per discussion in whatwg/webidl#1388 this
change introduces a union between a bigint type and unrestricted
double called MLNumber. Callers can pass a JS number (1234, 1.1234e38)
or a JS bigint (9007199254740993n), and the implementation will treat
it properly based on the explicit or implicit MLOperandDataType. Usage
of this type should be limited to only those cases.
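
For illustration (not normative; the operand names below are assumptions, and minValue/maxValue are the MLClampOptions members this change touches), a caller could then write either form:

// MLNumber is the (bigint or unrestricted double) union described above.
builder.clamp(float32Input, {minValue: 0, maxValue: 1.1234e38});         // JS number, converted per float32
builder.clamp(int64Input, {minValue: 0n, maxValue: 9007199254740993n});  // JS bigint, kept exact for int64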

Fixes webmachinelearning#442

Note that webmachinelearning#492 proposes changes to the constant sequential filling
operation; this just adds IDL to match the current spec prose.

Some of the concerns raised in webmachinelearning#325 are addressed (e.g. clamp()'s
options). However, several other options are still specified as
"float", and should maybe be "double" - but MLNumber is likely not
appropriate for those, so they are not updated here.
@inexorabletash (Contributor Author)

FYI @fdwr @huningxin @zolkis - marked as "draft" but worth looking at.

  • We may want to settle #492 (Constant sequential filling operation needs output shape parameter) before landing this - and "unrestricted double" may not be appropriate (since we want to preclude Infinity for fill ops?)
  • Is there anywhere that we want to error on NaNs but allow Infinities? If so we need explicit prose steps.
  • bigint is not just int64 - it allows arbitrary-sized ints (e.g. 2n**100n). If we want to error on a bigint that isn't a valid int64, we need explicit prose steps (see the sketch after this list).
  • I put the MLNumber definition and prose in the least bad place I could think of; suggestions welcome!
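
A minimal sketch of the kind of explicit step the third bullet is asking for, assuming we reject out-of-range bigints rather than wrapping them (the helper name and bounds are illustrative, not spec text):

// int64 range is [-2^63, 2^63 - 1]; anything outside would throw.
function bigintToInt64(value) {
  if (value < -(2n ** 63n) || value > 2n ** 63n - 1n) {
    throw new TypeError('value is out of range for int64');
  }
  return value;
}

bigintToInt64(9007199254740993n);  // ok
bigintToInt64(2n ** 100n);         // throws TypeError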

Comment on lines +1654 to +1655
MLNumber minValue;
MLNumber maxValue;
Contributor

FWIW the Chromium implementation assumes that these values are finite (not unrestricted) floats. In the case of clamp it seems reasonable for infinite values to be passed. Barring any pushback, I'd like to propose we change the implementation to allow infinities.

That being said, should the clamp() function's algorithm throw on NaN?

Now that we're allowing unrestricted doubles there's a separate question of whether minValue should default to -Infinity and maxValue should default to Infinity... Probably best to leave that to a follow-up :)

Collaborator

I'd like to propose we change the implementation to allow infinities

Yes, very much so.

That being said, should the clamp() function's algorithm throw on NaN?

Well, NaN propagation is well defined, and there are helper functions specifically intended to check for NaNs during computation (1 2 3 4...). So NaN isn't really a foreign or illegal concept in ML (then again, maybe it's just because I'm so intimately familiar with dozens of floating-point formats :b). I'd also expect clamp to behave like other libraries, e.g.:

import tensorflow

x = tensorflow.constant([250], dtype=tensorflow.float32)
y = tensorflow.clip_by_value(x, 13, float('nan'))
print("value:", y)
print("shape:", y.shape)

# Prints:
# value: tf.Tensor([nan], shape=(1,), dtype=float32)
# shape: (1,)

Now that we're allowing unrestricted doubles there's a separate question of whether minValue should default to -Infinity and maxValue should default to Infinity... Probably best to leave that to a follow-up :)

Those are sensible defaults. We just have to be sure that the logic inside Chromium for integer data types like int32 maps +Infinity to INT32_MAX and -Infinity to INT32_MIN, respectively. Agreed that it's cleaner as a separate CR.
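
A quick sketch of that mapping, if the defaults do become ±Infinity (the helper and constants are illustrative, not the Chromium code):

// Saturate an unrestricted-double bound to the integer data type's range.
function resolveClampBound(value, dataType) {
  if (dataType === 'int32') {
    const INT32_MIN = -(2 ** 31), INT32_MAX = 2 ** 31 - 1;
    if (value === Infinity) return INT32_MAX;
    if (value === -Infinity) return INT32_MIN;
  }
  return value;
}

resolveClampBound(Infinity, 'int32');   // 2147483647
resolveClampBound(-Infinity, 'int32');  // -2147483648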

Contributor

I'd like to propose we change the implementation to allow infinities

+1. The previous IDL type float doesn't allow setting Infinity and NaN.

Now that we're allowing unrestricted doubles there's a separate question of whether minValue should default to -Infinity and maxValue should default to Infinity...

I suppose Infinity is only for floating-point values?

Contributor

Cool, I'll add support for infinities and NaN in clamp when I add support for bigint in the bindings code.

I'd also expect clamp to behave like other libraries. e.g.

Yup, SGTM. We should add WPT conformance tests asserting that we get the expected behavior for inf/NaN (FYI @BruceDai)

index.bs
@@ -1133,6 +1135,16 @@ The {{MLOperand}} objects are created by the methods of {{MLGraphBuilder}}, inte
To <dfn for="MLGraphBuilder">validate operand</dfn> given {{MLGraphBuilder}} |builder| and {{MLOperand}} |operand|, return true if |operand|.{{MLOperand/[[builder]]}} is |builder|, and false otherwise.
</p>

#### {{MLNumber}} #### {#api-mlnumber-typedef}

<dfn typedef>MLNumber</dfn> is used when specifying the type of a numeric option for an {{MLOperand}} which can be of any {{MLOperandDataType}}, including both 64-bit integer types ({{MLOperandDataType/"uint64"}} and {{MLOperandDataType/"int64"}}) and 32-bit floating point ({{MLOperandDataType/"float32"}}). Implementations must process the value according to the corresponding {{MLOperandDataType}}.
Contributor

Implementations must process the value according to the corresponding {{MLOperandDataType}}.

What if there is no corresponding MLOperandDataType? For example, clamp() can be built as an MLActivation without needing to specify any particular dtype. What is the dtype of the activation's associated operator?

The concept of an activation's operator is a bit hand-wavy in the spec, but it's very concrete in the Chromium implementation and the data type must be known when we pass the activation as an input to some other builder method anyways (we need to check that the types match, assuming we don't want to allow mixed precision #283 (comment))
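
To make the deferred case concrete, a hedged sketch (operand names are assumptions; clamp-as-activation and the conv2d activation option as in the current spec):

const relu6 = builder.clamp({minValue: 0, maxValue: 6});   // MLActivation: no data type known yet
const out = builder.conv2d(float16Input, filter, {activation: relu6});
// Only here is float16 known, so 0 and 6 would be converted at this point.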

Contributor Author

Presumably the data type of the eventual operator's input? i.e. for the impl we'd need to hold onto the union in the MLActivation until the operator is constructed?

We need to improve the spec text, but I want to understand the implementation first.

Contributor

Presumably the data type of the eventual operator's input?

Fused activations should use the operator's output as their own input, although for operators that support fusion the output's data type is usually the same as the input's.

i.e. for the impl we'd need to hold onto the union in the MLActivation until the operator is constructed?

I think so.

Contributor Author

Added a short note in 67b5a68 but we can probably improve it.

Contributor

i.e. for the impl we'd need to hold onto the union in the MLActivation until the operator is constructed?

I think so.

Does this make sense? What happens if an MLActivation is passed to multiple operators with different data types?

Contributor Author

minValue <= maxValue is particularly interesting depending on the conversions. If you pass {minValue: -1, maxValue: 128} this looks valid, but if the data type ends up being "uint8", does this turn into {minValue: 0xFF, maxValue: 0x80} ? Ooof.
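
As a quick illustration of that wrapping concern, modulo-style (ToUint8) conversion in JS already behaves this way:

// -1 wraps to 255 while 128 stays 128, so a seemingly valid pair becomes {0xFF, 0x80}.
new Uint8Array([-1, 128]);  // Uint8Array [255, 128]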

with the idea that an MLActivation has one operator slot

Agreed that "the whole "operator" concept is pretty hand-wavy anyways". An MLActivation's internal [[operator]] slot is not the specific operator instance of the MLOperand that the activation is fused with, and is probably more like an operator type than a specific instance. So I don't think there's a conflict, but we could definitely improve the text - and explicitly mention that MLActivations can be re-used when creating multiple MLOperands and even with different data types.

Contributor Author

To be explicit, here's where I'm envisioning the conversion/validation happens:

  • for constant() - immediate conversion based on explicitly passed MLOperandDataType
  • for MLOperand clamp() - immediate conversion based on input's data type
  • for MLActivation clamp() - when the activation is passed to an operand-vending method
  • for pad() - immediate conversion based on input's data type

See #649 (comment) for a framework where activation validation could be plugged in.
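
A compact sketch of those cases (variable names are assumptions; signatures abbreviated):

// constant(): converted immediately using the explicitly passed MLOperandDataType.
builder.clamp(float32Input, {maxValue: 6});                                 // immediate: input operand's data type
builder.pad(float32Input, beginningPadding, endingPadding, {value: 0.5});   // immediate: input operand's data type
const act = builder.clamp({maxValue: 6});                                   // MLActivation: no data type yet
builder.conv2d(float16Input, filter, {activation: act});                    // converted when the activation is consumed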

Contributor

Question: does this mean, for example for gemm(), that if the inputs a and b are passed as float16 and you pass {alpha: MLNumber(3.14123452435)}, it will automatically convert that to float16 as 3.141?

Contributor Author

I guess under the hood - yes (although it'd be just {alpha: 3.14123452435})

Presumably that's what the spec/implementation implicitly does today, even if we didn't introduce MLNumber?

Contributor

Under the hood it preserves the fp32 precision right now (as you are suggesting). But I'd prefer we explicitly set the expectation that it will be cast to the same precision as the input type. In the gemm example, it's emulated as alpha * matmul(a,b) + beta * c in CoreML, and CoreML requires all these binary ops to have the same type. So if a and b are fp16, it's better to cast alpha to fp16 before multiplying; otherwise we need to cast the input to fp32 (which doesn't get executed on the ANE).

So right now we just don't support fp16 inputs for gemm on CoreML until this is sorted out.
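
For reference, a hedged sketch of that emulation with the cast being proposed (alphaF16 and betaF16 stand for the scalars already cast to float16 constants; names are assumptions, not the Chromium code):

// gemm(a, b, {alpha, beta, c}) lowered as alpha * matmul(a, b) + beta * c,
// with the scalars cast to the inputs' float16 type so each binary op sees a single data type.
const scaled = builder.mul(builder.matmul(a, b), alphaF16);
const out = builder.add(scaled, builder.mul(betaF16, c));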

@inexorabletash (Contributor Author)

FYI, I updated this PR with changes for explicit casting (#678), and changing float inputs to double (#325)
