[WebNN EP] Move MLContext creation to a singleton #20600
base: main
Conversation
Thank you @Galli, a very good start for WebNN I/O binding support!
Generally LGTM % a nit; I will test it and let you know if there are any other issues. :)
The current implementation has a few problems:
In order to enable I/O binding with the upcoming MLBuffer changes to the WebNN specification, we need to share the same MLContext across multiple sessions. This is because MLBuffers are tied to the MLContext where they were created.
I have moved the context de-duplication to a singleton in C++.
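To make the de-duplication idea concrete, here is a minimal sketch of such a singleton. The names (ContextManager, ContextOptions) and the string stand-in for the context are illustrative only, assumptions made for this sketch; the actual PR keys on WebNNContextOptions and caches emscripten::val MLContext handles.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Illustrative stand-in: the real code keys on WebNNContextOptions and
// caches emscripten::val MLContext objects instead of strings.
struct ContextOptions {
  std::string device_type;  // e.g. "cpu" or "gpu"
  bool operator==(const ContextOptions& o) const { return device_type == o.device_type; }
};

struct ContextOptionsHash {
  size_t operator()(const ContextOptions& o) const {
    return std::hash<std::string>{}(o.device_type);
  }
};

// Meyers singleton holding one shared cache of contexts, de-duplicated by
// options, so every session created with the same options reuses one context.
class ContextManager {
 public:
  static ContextManager& Instance() {
    static ContextManager instance;
    return instance;
  }

  const std::string& GetContext(const ContextOptions& options) {
    auto it = contexts_.find(options);
    if (it == contexts_.end()) {
      // In the real EP this is where navigator.ml.createContext(options)
      // would be awaited and the resulting MLContext cached.
      it = contexts_.emplace(options, "context-for-" + options.device_type).first;
    }
    return it->second;
  }

 private:
  ContextManager() = default;
  std::unordered_map<ContextOptions, std::string, ContextOptionsHash> contexts_;
};
```

Sessions created with equal options then receive a reference to the same cached entry, which is the property MLBuffer sharing needs.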
}
};

namespace onnxruntime {
Duplicated namespace onnxruntime.
It is not duplicated; the onnxruntime namespace ends on line 47. This is because the template specialization for std::hash<::onnxruntime::WebNNContextOptions> must be declared inside the std namespace, and before it is used on line 107 by InlinedHashMap<WebNNContextOptions, emscripten::val> contexts_.
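To illustrate the ordering constraint being described, here is a self-contained example with a simplified Options struct standing in for WebNNContextOptions: the std::hash specialization must sit in namespace std and appear before the member that instantiates the hash-based container, which forces the enclosing namespace to be closed and re-opened.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <unordered_map>

namespace onnxruntime {
struct Options {
  std::optional<std::string> device_type;
  bool operator==(const Options& o) const { return device_type == o.device_type; }
};
}  // namespace onnxruntime

// The specialization must live in namespace std, and must appear before any
// code that instantiates a hash-based container keyed on Options.
namespace std {
template <>
struct hash<::onnxruntime::Options> {
  size_t operator()(const ::onnxruntime::Options& o) const {
    return hash<optional<string>>{}(o.device_type);
  }
};
}  // namespace std

namespace onnxruntime {
// Re-opening the namespace here is not a duplicate definition; it is just a
// second block of the same namespace, placed after the std specialization.
class Manager {
 public:
  std::unordered_map<Options, int> contexts_;  // instantiates std::hash<Options>
};
}  // namespace onnxruntime
```

Moving the Manager above the specialization would fail to compile, which is exactly why the namespace appears twice in the PR.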
  ORT_THROW("Failed to get ml from navigator.");
}

emscripten::val context = ml.call<emscripten::val>("createContext", options.AsVal()).await();
So how do you get the globally shared context from JS to create an MLBuffer?
That's a good point; this PR does not expose the context to JS. We'll need access to the MLContext from JS to enable ort-web to download and upload data to the MLBuffer.
@fs-eire, do you have any suggestion on how to solve this issue without adding a WebNN TS backend?
I am not familiar with the Embind API, but I think the idea is the same: use the EMSCRIPTEN_BINDINGS() macro to export C++ functions to the Module object. For example, a function called getCurrentMLContext (takes no parameters) or getMLContext (takes a session ID as a parameter) that returns a reference to the object. Then you can call Module['getCurrentMLContext']() in js_internal_api.js (because this file is included in the final JS glue using --pre-js in emcc).
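A sketch of what that export could look like. The function name getCurrentMLContext and its string return value are hypothetical; the real function would return the shared MLContext as an emscripten::val from the C++ context singleton. The __EMSCRIPTEN__ guard keeps the sketch compilable as plain C++ as well.

```cpp
#include <string>

#ifdef __EMSCRIPTEN__
#include <emscripten/bind.h>
#endif

// Hypothetical accessor: the real implementation would hand back the shared
// MLContext (as an emscripten::val) from the context singleton.
std::string GetCurrentMLContext() {
  static const std::string handle = "shared-ml-context";
  return handle;
}

#ifdef __EMSCRIPTEN__
// Exports the function onto the Module object, so JS glue code (such as
// js_internal_api.js, pulled in via --pre-js) can call
// Module['getCurrentMLContext']().
EMSCRIPTEN_BINDINGS(webnn_context_bindings) {
  emscripten::function("getCurrentMLContext", &GetCurrentMLContext);
}
#endif
```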
Please see also:
Considering the actual usage of I/O binding, users may need to access the MLContext object via the ORT JavaScript API. It could be something like:
const mySession = await ort.InferenceSession.create('...', { executionProviders: [{
name: 'webnn',
...
}] });
const myMLContext = ort.env.webnn.getContext(mySession);
Considering this, there needs to be a [SessionObject(JS)] to [MLContext(JS)] mapping in JavaScript. We can get a [SessionObject(JS)] to [SessionID(JS&C++)] mapping and a [SessionID(JS&C++)] to [MLContext(JS)] mapping, so this is possible to implement.
Unless I'm missing something, there is no public API to get the WebNNExecutionProvider* from the [SessionID(JS&C++)] (i.e. OrtSession*), and Module.jsepSessionState.sessionHandle is only valid during run()/Compile().
I see your point. After checking the code, I realize that using the session ID is not a good idea. There is no existing place in the code where we can associate a session handle with a WebNN EP instance.
In C++ we can just expose WebNNContextManager::GetContext() to JavaScript.
And in the JavaScript API, we have 2 options:
- Let the user pass the options object directly:

  const myWebNNOptions = { name: 'webnn', ... };
  const mySession = await ort.InferenceSession.create('...', { executionProviders: [myWebNNOptions] });
  const myMLContext = ort.env.webnn.getContext(myWebNNOptions);

  No need to add extra JS code.
- Let the user pass the session object:

  const mySession = await ort.InferenceSession.create('...', { executionProviders: [{ name: 'webnn', ... }] });
  const myMLContext = ort.env.webnn.getContext(mySession);

  May be a little easier for users to use, but requires maintaining a session-to-webnnOptions map in JS.

A few changes to init.ts and backend-webnn.ts may have to be added back, as we need to allow assigning the ort.env.webnn object to a property of Module so that embind can access the object from C++ and add a property (the getContext() function) to it.
std::optional<std::string> device_type;
std::optional<int> num_threads;
std::optional<std::string> power_preference;
- Is there a reason for using strings instead of enum types?
- What about default values? For example, according to the spec, if deviceType is not specified, it is treated as 'cpu'. However, an instance of WebNNContextOptions with deviceType === "cpu" is considered a different option from one that does not set deviceType, although they should be the same.
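One way to address the unset-vs-explicit-"cpu" mismatch would be to normalize options to their spec defaults before using them as a cache key. This is a hedged sketch, not the PR's code; ContextOptions and Normalized are illustrative names.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Simplified stand-in for WebNNContextOptions.
struct ContextOptions {
  std::optional<std::string> device_type;
  bool operator==(const ContextOptions& o) const { return device_type == o.device_type; }
};

// Apply the spec defaults before comparing or hashing, so an unset
// deviceType and an explicit "cpu" map to the same cache key and
// therefore reuse the same MLContext.
ContextOptions Normalized(ContextOptions o) {
  if (!o.device_type.has_value()) o.device_type = "cpu";  // spec default per WebNN
  return o;
}
```

With this, the context cache would key on Normalized(options) instead of the raw options, collapsing the two equivalent spellings into one entry.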
Adding a few comments here: there is a new issue (#20729) that reveals a clearer picture of what the actual requirement would be. Users may want to manipulate the MLContext directly. Considering the latest spec (https://www.w3.org/TR/webnn/#api-ml-createcontext), there will be WebNN-WebGPU interop.
Description
This PR moves the MLContext creation to a singleton. This enables us to share the same MLContext across multiple InferenceSessions.

Motivation and Context
In order to enable I/O binding with the upcoming MLBuffer API in the WebNN specification, we need to share the same MLContext across multiple sessions. This is because MLBuffers are restricted to the MLContext where they were created.