requestResponseMessage handlers optimization #651

Open
wants to merge 2 commits into base: main

Conversation

@Tsury commented Dec 22, 2023

I initially opened this under #650 but then closed it after seeing @defunctzombie's #649, which appeared an hour earlier (what are the odds?!). I then tested #649 and I fear that it does not improve performance, at least not in my tests.

My test (which is based on a real scenario I'm handling) sends approximately 125k messages in a very short period. When I run it with either vanilla comlink or #649, it starts by handling some 1-2k messages and then grinds to a crawl, practically never finishing. My PR processes everything in ~25 seconds.

When lowering to ~72k messages, vanilla takes ~55 seconds to complete, #649 takes ~70 seconds, and mine takes ~15 seconds.

-- ORIGINAL PR MESSAGE --
As per #647, I followed @josephrocca's advice.
There is now one central function that dispatches to handlers stored in a map.
I wasn't 100% sure where to add/remove the event listeners, open to suggestions.
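Roughly, the shape of the change is a single module-level map from request id to resolve callback, plus one shared message handler, instead of a new closure and addEventListener/removeEventListener pair per requestResponseMessage call. A simplified sketch of the pattern (not the exact diff; messageResolvers is the map used below):

// One shared map: request id -> resolve callback.
const messageResolvers = new Map();

// Attached once per endpoint (exactly where to add/remove it is the open question above).
function onMessage(ev) {
  if (!ev.data || !ev.data.id) return;
  const resolver = messageResolvers.get(ev.data.id);
  if (resolver) {
    messageResolvers.delete(ev.data.id);
    resolver(ev.data);
  }
}

function requestResponseMessage(ep, msg, transfers) {
  return new Promise((resolve) => {
    const id = generateUUID(); // comlink's existing id helper
    messageResolvers.set(id, resolve);
    ep.postMessage({ id, ...msg }, transfers);
  });
}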

My tests showed a ~20% performance increase in a mass-messages scenario, YMMV.

Instead of creating a new function per call:

const resolver = messageResolvers.get(data.id);
if (resolver) {
  resolver(data);
}

I think this would benefit from a try/finally to ensure that you always delete the processed ID.
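Something along these lines (just a sketch):

const resolver = messageResolvers.get(data.id);
if (resolver) {
  try {
    resolver(data);
  } finally {
    // Always drop the entry, even if the resolver throws.
    messageResolvers.delete(data.id);
  }
}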

@defunctzombie

@Tsury 👏 What is different between this PR and my PR? The only difference I notice is that you use a global map instance, whereas I scoped the map instance to the Proxy created by wrap.

One other approach I explored was only making one map per endpoint instance: foxglove@be58259
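In rough terms (a sketch of the idea, not the exact code in that commit), it keeps one resolver map per endpoint and attaches a single shared listener the first time the endpoint is used:

// One Map of pending resolvers per endpoint, created lazily.
const perEndpointResolvers = new WeakMap();

function resolverMapFor(ep) {
  let resolvers = perEndpointResolvers.get(ep);
  if (!resolvers) {
    resolvers = new Map();
    perEndpointResolvers.set(ep, resolvers);
    // Single listener per endpoint, dispatching by request id.
    ep.addEventListener("message", (ev) => {
      const resolver = ev.data && resolvers.get(ev.data.id);
      if (resolver) {
        resolvers.delete(ev.data.id);
        resolver(ev.data);
      }
    });
  }
  return resolvers;
}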

I wonder if your usage pattern is such that you are making lots of createProxy calls?

It would be good to understand the nuances and also whether it seems "safe" to have one global map. One gotcha that comes to mind with the global map (and possibly even the local map) is holding onto resolve functions after an endpoint is torn down (potentially via .terminate() calls). The current behavior with addEventListener tears down all of the handlers when the endpoint itself is gone, since that's managed by the underlying system, but with manual listener management more care needs to be taken, or at least we need to think through whether it is possible to fail in such a way.

@Tsury (Author) commented Dec 22, 2023

@Tsury 👏 What is different between this PR and my PR? The only difference I notice is that you use a global map instance, whereas I scoped the map instance to the Proxy created by wrap.

👋 I'm actually not entirely sure. Maybe the scoped usage just moves the optimization issue from the function creation to the map creation?

EDIT: Maybe it's the fact that you create a different function for every created proxy, and I reuse the same one? That's probably one of the upsides of the global map usage - it doesn't require additional context in the form of an anonymous function.

One other approach I explored was only making one map per endpoint instance: foxglove@be58259

So this exact commit is a working, different approach? I can test it once I get back home to see how it performs.

EDIT: I tested foxglove@be58259 and it seems to be on par with my code performance-wise, so it looks like we have two approaches that fix this performance issue. I guess the local option should be the safer one to use.

I wonder if your usage pattern is such that you are making lots of createProxy calls?

Not at all, there are just 3 createProxy calls in my test.

It would be good to understand the nuances and also whether it seems "safe" to have one global map. One gotcha that comes to mind with the global map (and possibly even the local map) is holding onto resolve functions after an endpoint is torn down (potentially via .terminate() calls). The current behavior with addEventListener tears down all of the handlers when the endpoint itself is gone, since that's managed by the underlying system, but with manual listener management more care needs to be taken, or at least we need to think through whether it is possible to fail in such a way.

I agree. We need to ensure that the managed approach does not introduce any issues due to dangling/missing listeners. I'll try to think of nuanced scenarios to test; I'd appreciate any help with this - the more people tackling it, the better.
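One concrete scenario I want to try (hypothetical test code, someSlowFunction is just an illustrative name): terminate a worker while a call is still pending and check whether the resolver entry lingers:

const worker = new Worker("worker.js");
const obj = Comlink.wrap(worker);
const pending = obj.someSlowFunction(); // this call will never get a response
worker.terminate();
// With a module-level map, nothing ever deletes the resolve callback for this
// request id, so it stays in the map and the promise stays pending; worth
// checking whether that also keeps the worker or message data alive.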

@lvivski commented Jan 16, 2024

Thank you @defunctzombie and @Tsury for looking into this. Our project hits Comlink bottlenecks from event listeners being added and removed all the time.

@surma @benjamind is it something that you might be interested in optimizing?

@defunctzombie

I have not forgotten about this and we at Foxglove are looking at benchmarking this and working up an implementation that will meet the desired criteria.

@achim-k commented Jan 18, 2024

I have compared the three different versions (this PR, #649, and foxglove@be58259) against the currently released comlink version, and I can confirm that performance is better overall.

foxglove@be58259d8 performs best, mainly due to the simplification of how the random request id is generated. Without that, it is basically the same as this PR. For all three versions, the speedup is especially noticeable when increasing the number of parallel requests (i.e. more pending listeners). If there is mostly just one request at a time, the performance improvement is only marginal. The table below shows the median time in ms for repeatedly (10,000 times) calling a void function exposed by the worker:

| Parallel requests | Current comlink | This PR | foxglove@be58259d8 | #649 |
| --- | --- | --- | --- | --- |
| 1 | 842 | 788 (6% faster) | 770 (8% faster) | 823 (2% faster) |
| 2 | 1074 | 929 (13% faster) | 827 (23% faster) | 1016 (5% faster) |
| 5 | 2085 | 1507 (28% faster) | 1393 (33% faster) | 1647 (21% faster) |
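For reference, the released comlink builds each request id from several random hex segments, roughly as sketched below; the cheaper alternative shown after it is only an illustration of the kind of simplification involved, not the exact change in the foxglove commit:

// Released comlink, roughly: four random hex segments per request.
const generateUUID = () =>
  new Array(4)
    .fill(0)
    .map(() => Math.floor(Math.random() * Number.MAX_SAFE_INTEGER).toString(16))
    .join("-");

// Illustrative cheaper alternative: ids only need to be unique among pending
// requests, so an incrementing counter would do.
let nextId = 0;
const nextRequestId = () => ++nextId;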

Note that for #649 it makes a huge difference whether the function proxy is cached, as shown below:

const worker = new Worker("worker.js");
const obj = Comlink.wrap(worker);
// Cache the function proxy once instead of recreating it on every call.
const someFunctionProxy = obj.someFunction;
const numParallelRequest = 5;
for (let i = 0; i < 10_000; i++) {
  const promises = new Array(numParallelRequest).fill().map(() => someFunctionProxy());
  await Promise.all(promises);
}

If that is not done, a new proxy is created every time by calling await obj.someFunction(), which causes #649 to perform significantly worse than the current released version.

To move forward, I would suggest to


Update: I noticed that all three PRs have memory issues in that they keep a reference to temporarily created workers, e.g. when doing the following in an onClick event callback:

const worker = new Worker("worker.js");
const obj = Comlink.wrap<WorkerInterface>(worker);
obj.foo(); // request is still in flight when the worker is terminated
worker.terminate(); // all three PRs still hold a reference to this worker afterwards

I will submit a PR which fixes this.
