
[WebNN EP] Move MLContext creation to a singleton #20600

Open · egalli wants to merge 1 commit into main

Conversation


@egalli egalli commented May 7, 2024

Description

This PR moves MLContext creation to a singleton. This enables us to share the same MLContext across multiple InferenceSessions.

Motivation and Context

In order to enable I/O Binding with the upcoming MLBuffer API in the WebNN specification, we need to share the same MLContext across multiple sessions. This is because MLBuffers are restricted to the MLContext where they were created.

@Honry (Contributor) left a comment


Thank you @egalli, a very good start for WebNN I/O binding support!

@Honry (Contributor) commented May 8, 2024

@fs-eire, @guschmue, @fdwr, PTAL, thanks!

@Honry (Contributor) left a comment

Generally LGTM modulo a nit. I will test it and let you know if there are any other issues. :)

@fs-eire (Contributor) commented May 9, 2024

The current implementation has a few problems:

  • If the user specifies multiple execution providers (which is legitimate in ORT), such as ['webgpu', { name: 'webnn', powerPreference: 'high-performance' }], it is hard to tell from the JavaScript code which EP is currently in use. Even after WebNN is initialized, JavaScript has no way to know which EP will actually run the current model, and reading the execution provider config is not sufficient to tell.

  • The current implementation splits the initialization process across multiple places: the C++ code does a few things and the JavaScript does others. The [webnn session options] to [context ID] mapping is 1:1, and the [session ID] to [context ID] mapping is many-to-one. In my understanding, a singleton map in C++ would be a better way to implement this requirement: it is much easier to read and understand, and keeping related code together reduces the chance of introducing bugs in future changes. Please let me know if I have understood this part wrong.

  • Modifying the user input (session options) is a concern. This is usually not expected behavior, yet the current implementation depends on adding a property to the user-specified session options.

In order to enable I/O Binding with the upcoming MLBuffer changes to
the WebNN specification, we need to share the same MLContext across
multiple sessions. This is because MLBuffers are tied to the MLContext
where they were created.
@egalli changed the title from "[WebNN EP] Move MLContext creation to TypeScript" to "[WebNN EP] Move MLContext creation to a singleton" on May 10, 2024
@egalli (Author) commented May 10, 2024

I have moved the context de-duplication to a singleton in C++.
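
A minimal sketch of what such a context-deduplicating singleton might look like (the class and method names are assumptions, std::unordered_map stands in for ORT's InlinedHashMap, and the options struct follows the diff quoted later in this thread):

#include <emscripten/val.h>
#include <functional>
#include <optional>
#include <string>
#include <unordered_map>

namespace onnxruntime {

struct WebNNContextOptions {
  std::optional<std::string> device_type;
  std::optional<int> num_threads;
  std::optional<std::string> power_preference;

  // Converts the options to a JS object for navigator.ml.createContext().
  emscripten::val AsVal() const {
    emscripten::val o = emscripten::val::object();
    if (device_type) o.set("deviceType", *device_type);
    if (num_threads) o.set("numThreads", *num_threads);
    if (power_preference) o.set("powerPreference", *power_preference);
    return o;
  }

  bool operator==(const WebNNContextOptions& o) const {
    return device_type == o.device_type && num_threads == o.num_threads &&
           power_preference == o.power_preference;
  }
};

}  // namespace onnxruntime

// The hash specialization must be visible before the map is instantiated.
namespace std {
template <>
struct hash<::onnxruntime::WebNNContextOptions> {
  size_t operator()(const ::onnxruntime::WebNNContextOptions& o) const {
    size_t h = hash<string>()(o.device_type.value_or(""));
    h = h * 31 + hash<int>()(o.num_threads.value_or(0));
    h = h * 31 + hash<string>()(o.power_preference.value_or(""));
    return h;
  }
};
}  // namespace std

namespace onnxruntime {

// Sketch: one MLContext per distinct option set, shared process-wide.
class WebNNContextManager {
 public:
  static WebNNContextManager& GetInstance() {
    static WebNNContextManager instance;  // Meyers singleton
    return instance;
  }

  emscripten::val GetContext(const WebNNContextOptions& options) {
    auto it = contexts_.find(options);
    if (it != contexts_.end()) {
      return it->second;  // reuse the context created for identical options
    }
    emscripten::val ml = emscripten::val::global("navigator")["ml"];
    emscripten::val context =
        ml.call<emscripten::val>("createContext", options.AsVal()).await();
    contexts_.emplace(options, context);
    return context;
  }

 private:
  std::unordered_map<WebNNContextOptions, emscripten::val> contexts_;
};

}  // namespace onnxruntime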

}
};

namespace onnxruntime {
A contributor commented:

Duplicated namespace onnxruntime.

@egalli (Author) replied:

It is not duplicated; the onnxruntime namespace ends on line 47. This is because the template specialization for std::hash<::onnxruntime::WebNNContextOptions> must be declared inside the std namespace and before it is used on line 107 by InlinedHashMap<WebNNContextOptions, emscripten::val> contexts_;.
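
Schematically, the layout being described is the following (a minimal compilable skeleton; the placeholder hash body stands in for the real member-combining hash shown in the sketch earlier in this thread):

namespace onnxruntime {
struct WebNNContextOptions { /* members */ };
}  // namespace onnxruntime -- first close of the namespace (line 47)

namespace std {
template <>
struct hash<::onnxruntime::WebNNContextOptions> {
  size_t operator()(const ::onnxruntime::WebNNContextOptions&) const {
    return 0;  // placeholder; the real code combines the member hashes
  }
};
}  // namespace std

namespace onnxruntime {
// The namespace reopens for the code that instantiates the map, e.g.:
// InlinedHashMap<WebNNContextOptions, emscripten::val> contexts_;
}  // namespace onnxruntime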

ORT_THROW("Failed to get ml from navigator.");
}

emscripten::val context = ml.call<emscripten::val>("createContext", options.AsVal()).await();
A contributor commented:

So how do you get the global shared context from JS to create an ML buffer?

@egalli (Author) replied:

That's a good point; this PR does not expose the context to JS. We'll need access to the MLContext from JS so that ort-web can download and upload data to the MLBuffer.

@fs-eire do you have any suggestion on how to solve this issue without adding a WebNN TS backend?

A contributor replied:

I am not familiar with the Embind API, but I think the idea is the same:

Use the EMSCRIPTEN_BINDINGS() macro to export C++ functions to the Module object. For example, a function called getCurrentMLContext (takes no parameters) or getMLContext (takes a session ID as a parameter) that returns a reference to the object. Then you can call Module['getCurrentMLContext']() in js_internal_api.js (because this file is included in the final JS glue using --pre-js in emcc).
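
A rough sketch of that suggestion (the function name and the placeholder state are assumptions; how the "current" context would be tracked is exactly the open question here):

#include <emscripten/bind.h>
#include <emscripten/val.h>

emscripten::val GetCurrentMLContext() {
  // Placeholder: in real code this would return the MLContext tracked for
  // the session currently in use.
  static emscripten::val current = emscripten::val::undefined();
  return current;
}

EMSCRIPTEN_BINDINGS(webnn_context) {
  // Exports the function onto the wasm Module object, so the JS glue
  // (e.g. js_internal_api.js, included via --pre-js) can call
  // Module['getCurrentMLContext']().
  emscripten::function("getCurrentMLContext", &GetCurrentMLContext);
}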


A contributor commented:

Considering the actual usage of IO binding, users may need to access the MLContext object via the ORT JavaScript API. It could be something like:

const mySession = await ort.InferenceSession.create('...', { executionProviders: [{
  name: 'webnn',
  ...
}] });

const myMLContext = ort.env.webnn.getContext(mySession);

Considering this, there needs to be a [SessionObject(JS)] to [MLContext(JS)] mapping in JavaScript. We can get a [SessionObject(JS)] to [SessionID(JS&C++)] mapping and a [SessionID(JS&C++)] to [MLContext(JS)] mapping, so this is possible to implement.

@egalli (Author) commented May 16, 2024

Unless I'm missing something, there is no public API to get the WebNNExecutionProvider* from the [SessionID(JS&C++)] (i.e. OrtSession*), and Module.jsepSessionState.sessionHandle is only valid during run()/Compile().

A contributor replied:

I see your point. After checking the code, I realize that using the session ID is not a good idea. There is no existing place in the code where we can associate a session handle with a WebNN EP instance.

In C++ we can just expose the WebNNContextManager::GetContext() to JavaScript.

And in JavaScript API, we have 2 options:

  • Let user pass options directly:

    const myWebNNOptions = {
      name: 'webnn',
      ...
    };
    const mySession = await ort.InferenceSession.create('...', {
      executionProviders: [myWebNNOptions]
    });
    
    const myMLContext = ort.env.webnn.getContext(myWebNNOptions);
    • No need to add extra JS code
  • Let user pass session object:

    const mySession = await ort.InferenceSession.create('...', { executionProviders: [{
      name: 'webnn',
      ...
    }] });
    
    const myMLContext = ort.env.webnn.getContext(mySession);
    • May be a little easier for users, but requires maintaining a session-to-webnnOptions map in JS

A few of the changes to init.ts and backend-webnn.ts may have to be added back, as we need to allow assigning the ort.env.webnn object to a property of Module, so that in C++ embind can access the object and add a property (the getContext() function) to it. A sketch of that follows below.
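
A sketch of that last idea, assuming the JS side has already assigned ort.env.webnn to a Module property before this C++ runs (the property name "webnnEnv", the function name, and the registration hook are all assumptions):

#include <emscripten/val.h>

// Sketch: attach getContext() to the ort.env.webnn object exposed on Module.
void RegisterWebNNGetContext() {
  emscripten::val env = emscripten::val::module_property("webnnEnv");
  if (!env.isUndefined()) {
    // After this, JS code can call ort.env.webnn.getContext(...), reusing
    // the getCurrentMLContext function exported via EMSCRIPTEN_BINDINGS.
    env.set("getContext", emscripten::val::module_property("getCurrentMLContext"));
  }
}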

Comment on lines +23 to +25
std::optional<std::string> device_type;
std::optional<int> num_threads;
std::optional<std::string> power_preference;
A contributor commented:

  1. Is there a reason for using strings instead of enum types?

  2. What about default values? For example, according to the spec, if deviceType is not specified it is treated as 'cpu'. However, an instance of WebNNContextOptions with deviceType === 'cpu' is treated as a different option set from one that leaves deviceType unset, although they should be the same (see the sketch below).
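
One way to address point 2 would be to normalize the options (as defined in the diff above) to their spec defaults before they are hashed or compared, so that an unset deviceType and an explicit 'cpu' select the same context. A sketch, not what the PR currently does; the helper name is illustrative:

#include <optional>
#include <string>

// Sketch: fill in WebNN spec defaults so logically identical options
// compare equal when used as a map key.
WebNNContextOptions Normalize(WebNNContextOptions options) {
  if (!options.device_type.has_value()) {
    options.device_type = "cpu";  // spec default for deviceType
  }
  if (!options.power_preference.has_value()) {
    options.power_preference = "default";  // spec default for powerPreference
  }
  return options;
}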

@fs-eire (Contributor) commented May 20, 2024

Adding a few comments here:

A new issue (#20729) reveals a clearer picture of what an actual requirement looks like. Users may want to manipulate the MLContext with more flexibility. I am currently thinking it may be a good idea to let users create the MLContext themselves and just pass it to ORT via session options.

Considering the latest spec: https://www.w3.org/TR/webnn/#api-ml-createcontext

There will be WebNN-WebGPU interop, and createContext() may accept a WebGPU GPUDevice object. This would be even more difficult to implement inside ORT, so it is better to let users do that part themselves.
