Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-WebSocket transport alternative? #840

Open
jonathanperret opened this issue Feb 2, 2024 · 7 comments
Open

Non-WebSocket transport alternative? #840

jonathanperret opened this issue Feb 2, 2024 · 7 comments

Comments

@jonathanperret
Copy link
Contributor

Hello,

We'be been trying to onboard potential Grist users that are behind an HTTP proxy which doesn't allow WebSocket traffic (either HTTP/1.1 WebSockets or the new HTTP/2 variant). The exclusion of WebSocket traffic is deliberate on the part of the network administrator, so while we are trying to change their mind, we hold little hope of a change in policy in the short/medium term.

How feasible do you think it would be to support an alternative to WebSockets such as HTTP long polling? I'm willing to look into the implementation but I thought I'd share the idea first.

@paulfitz
Copy link
Member

paulfitz commented Feb 2, 2024

Hi @jonathanperret, I think it would be very feasible to have a long polling fallback. We kind of expected we'd need one at some point, but the years went by and problems were too few and far between to justify it from our perspective. If you're interested in implementing, I think we could point you in the right direction.

@jonathanperret
Copy link
Contributor Author

Hi @paulfitz , this is great to hear. Not promising anything but I'd be interested if you have pointers beyond what a quick scan of the code seems to show, i.e. the only files that seem to manipulate WebSockets directly (outside of tests) are app/server/lib/Comm.ts and app/server/lib/Client.ts for the server, and app/client/components/Comm.ts and app/client/components/GristWSConnection.ts for the client.

If this is correct, the area of impact for an alternative transport seems relatively small. An important choice that remains open is the implementation for an alternative to raw WebSockets: a WebSocket emulation layer like https://github.com/sockjs? A full-fledged library like Socket.IO? Or a hand-written long polling implementation? I would love to have your opinion on this.

@paulfitz
Copy link
Member

paulfitz commented Feb 5, 2024

@jonathanperret for the code, exactly, you've found the main files involved. It is a very limited part of Grist. I will admit up front that I've found it a somewhat awkward part of Grist to work with, since it is some of the oldest code and has suffered through a lot of evolution. You'll need some patience. On the other hand, I'm excited about the possibility of some clean-up there!

For implementation, we're open to tasteful use of well tested libraries that are sufficiently slim on the client side.

@jonathanperret
Copy link
Contributor Author

Hello @paulfitz, to give a quick update: I've started looking into this more seriously.

Library selection

I looked at the available implementations of HTTP long polling — of which there aren’t many still being developed, since most projects seem to take native WebSocket availability for granted these days. In fact the only recently maintained libraries I could find are the well-known Socket.IO and SockJS. SockJS seemed interesting at first as it is transport-focused (basically emulating WebSocket over other transports), compared to Socket.IO which embeds higher-level functionality such as broadcasting, multiplexing, buffering, acknowledgement, and the abilty to use multiple backends for session management, which I was afraid would be overkill for Grist’s needs.
However SockJS seems to be unfunded currently, and while there have been commits relatively recently to the repositories, the sockjs.org domain is now cybersquatted, and no release has occurred since 2022 for the client and 2021 for the server.
By comparison Socket.IO seems quite alive, with multiple active contributors and a frequently updated website.

As to the worries I had about the advanced features of Socket.IO overlapping with existing code in Grist, I think they can be assuaged by uisng only the low-level part of Socket.IO which is Engine.IO. This contains only the « transport » part of Socket.IO, i.e. a thin wrapper around WebSocket (for the browser) and ws (for the server), plus a long-polling implementation with minimal dependencies.

The Engine.IO client would add about 24 kilobytes before gzipping (the size of engine.io.esm.min.js in https://unpkg.com/browse/engine.io-client@6.5.3/dist/) to the main client bundle (which currently sits at around 1.31 MB).

Of note is that the Engine.IO protocol adds some framing even when using native WebSockets, so client and server would have to migrate together even if they end up talking over WebSockets as they currently do.

Implementation progress

I first started by isolating references to the WebSocket API and the ws API into new classes (tentatively) called GristClientSocket, GristSocketServer and GristServerSocket. I then reduced the API surface of those classes (e.g. replacing generic .on(event, handler) methods with specific properties exposing only the events used by the calling code), in preparation for an alternative, Engine.IO-based implementation. I made sure to keep the whole test suite passing while doing this refactoring.

I then added Engine.IO-using implementations of these classes, together with a switching mechanism based on a GRIST_USE_ENGINE_IO environment variable.

The current status is that editing a Grist document seems to work when setting GRIST_USE_ENGINE_IO=yes, even with WebSockets disabled (tested by harcoding transports = [‘polling’] in the Engine.IO setup on the client and server).

However, I am still struggling a bit to get the tests to pass. The Comm tests in particular are quite dependent on a specific sequence of events when transport errors occur, and I have yet to figure out how to reproduce that sequence when switching to Engine.IO which has different semantics from WebSocket/ws around these cases. But I’m optimistic that with more traces I’ll work it out soon.

If you want to follow the work in progress, or give early feedback, I’ve been working on a branch here: https://github.com/jonathanperret/grist-core/tree/engine.io (comparison with current main ). Should I open a draft PR now, or wait until I have something more ready for review?

@paulfitz
Copy link
Member

@jonathanperret great work, that is good research, and your conclusion to use Engine.IO makes sense to me. Thanks for flagging the issue about upgrading client and server in lockstep - that is fine. Glad you are making progress with implementation. Understandable that the tests could be giving trouble. My colleague @dsagal may be able to give some guidance if you need it. Use your own judgment on when to open a draft PR. Doing it earlier rather than later may make it a little easier for Dmitry to chip it. Thanks a lot for working on this!

@dsagal
Copy link
Member

dsagal commented Feb 17, 2024

Thank you @jonathanperret for digging into this!

Regarding Comm tests, I recall some particular headaches:

  1. One has to do with when the server detects a connection closed by the client. A "send" on a closed connection may fail before or after the close event happens. The difference caused a rare bug, but it was hard to reproduce this in a test case. The particular approach to testing this may be overly complicated; maybe there is a better way.
  2. There used to be a bug, at least with one browser/platform we support, where after making a websocket connection, the first message sent on it was sometimes not received. After much effort to figure out what's wrong in our code, I decided it really was an underlying bug, not in our code. We deal with it by sending the first "clientConnect" message twice. This means the client has to ignore the duplicate, and tests have to be aware of this too.

I understand your branch isn't ready to review, but I browsed through it, and noticed an issue in app/server/lib/GristServerSocket.ts which might affect tests: in lines 43 and 48, the handler added to this._eventHandlers isn't exactly the same handler as got passed into this._socket.on, so removeAllListeners cannot remove it.

@jonathanperret
Copy link
Contributor Author

Thank you both for the valuable feedback.

@dsagal, good catch on removeAllListeners. I missed that when introducing a wrapper for the close handler, made necessary by the fact that Engine.IO’s send callbacks are only called upon success — but every failure ends in close being emitted so I figured it would be OK to invoke callbacks for the messages still in flight by then. I will probably end up wrapping all three events anyway since it looks like more work is needed to « impedance match » the WebSocket and Engine.IO APIs.

The added context for the Comm tests is also welcome. As mentioned above, I have already had to tweak the emission of close events and callbacks; with that added information I hope to finally get the sequence right.

Unfortunately I have found a subtle bug in Engine.IO while working on this, which causes send callbacks not to be emitted for messages sent in a specific window of time after a connection is established — in Grist this manifests as a session occasionally getting « stuck » shortly after being established. I reported it here and hopefully a resolution can be found soon. In the meantime I have set the ping interval to once per second which gets the session unstuck but is obviously unsatisfactory.

I’ve gone ahead and opened a draft PR : #859 . I’ll keep it updated as I work, and I suggest we move further discussion of the implementation there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants