Error: connect ETIMEDOUT issue #1403

Open
MattSmithTR opened this issue Feb 27, 2024 · 1 comment

Comments

@MattSmithTR

Hello! I apologize if this question has been covered before. I have done my best to look up anything that could help, and I have had no luck. I have integrated this Java-WebSocket library into my application.

The application is a Spring Boot Java web application in the backend and uses Angular in the frontend. I can't go into too many specifics about what the application does, as it is a corporate web app. Essentially, I use WebSockets to keep track of users working on a document at the same time (similar to the way you can see fellow users when you are working on a Google Document).

We have multiple environments and multiple applications that are involved in keeping track of this info. Here is what a user might do and how we keep track of it:

  1. User 1 loads application A in the TEST environment and accesses document 123
  2. User 2 loads application B in the TEST environment and accesses document 123
  3. Since they are in the same document, they both need to be aware of each other's presence: User 1 gets a notification that User 2 is working on the same document (in a different application), and User 2 is notified as well.
  4. We do this by storing user info, document info, and app info in our DB and using the WebSockets to push real-time data back to the users when they access the same document.
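
The bookkeeping behind steps 1 to 4 can be sketched as a small presence registry. This is only an illustration under stated assumptions: the post stores this data in a database, whereas the sketch keeps it in memory, and the class name `PresenceRegistry`, its methods, and the `user@app` key format are all hypothetical.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the DB-backed presence data described above.
final class PresenceRegistry {
    // documentId -> viewers, each viewer encoded as "userId@appId" (assumed format).
    private final Map<String, Set<String>> viewersByDocument = new ConcurrentHashMap<>();

    static String viewerKey(String userId, String appId) {
        return userId + "@" + appId;
    }

    // Registers a viewer and returns everyone who was already in the document,
    // i.e. the viewers that should be notified about the newcomer.
    Set<String> join(String documentId, String userId, String appId) {
        String viewer = viewerKey(userId, appId);
        Set<String> others = new HashSet<>();
        // compute() runs atomically per key, so the snapshot and the add cannot interleave.
        viewersByDocument.compute(documentId, (id, current) -> {
            Set<String> viewers = (current == null) ? new HashSet<>() : current;
            others.addAll(viewers);
            others.remove(viewer);
            viewers.add(viewer);
            return viewers;
        });
        return others;
    }

    // Removes a viewer; the document entry is dropped once nobody is left.
    void leave(String documentId, String userId, String appId) {
        viewersByDocument.computeIfPresent(documentId, (id, viewers) -> {
            viewers.remove(viewerKey(userId, appId));
            return viewers.isEmpty() ? null : viewers;
        });
    }
}
```

In this sketch, the set returned by `join` is the list of peers a server would push a presence message to over their open WebSocket connections, and the same set would be sent back to the newcomer.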

We have 4 lower environments and 1 production environment. QA and PROD run on 6 Red Hat Linux servers; DEV, TEST, and UAT run on 2 Red Hat Linux servers.

We released our WebSocket code a year ago, and through testing in all lower environments everything appeared to work as expected. We then released to Production (keep in mind that during work hours about 200 users actively use the application) and things went very wrong. There were many inconsistencies in users being able to see when others were accessing the same document. Upon investigation, the error came down to this: "Error: connect ETIMEDOUT Could not connect to ws://". This error occurs when a client tries to establish its first connection with the WebSocket server. I also want to point out that not every connection fails; it seems to be intermittent.

There is very little guidance online on how to approach this issue. The only commonality I can find is that, for us, this only starts happening when more than 100 users are actively using our applications at once. With fewer than that, the WebSockets have little to no trouble establishing a connection to the server. The only mitigation we have added is some logic that retries against different WebSocket servers multiple times when a connection fails. This has helped but does not completely fix the issue.
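
For readers who want a picture of the retry mitigation mentioned above, here is a minimal sketch of failover across several servers. It is not the poster's actual code: `FailoverConnector`, its parameters, and the exponential pause between rounds are assumptions, and the `connect` function stands in for whatever client call the application really makes.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical helper: tries each server in turn and repeats the whole list a few times.
final class FailoverConnector {
    // connect: returns a connection on success or Optional.empty() on failure/timeout.
    // Pauses baseDelayMillis * 2^round between rounds so a struggling server is not hammered.
    static <C> Optional<C> connectWithFailover(List<String> endpoints, int maxRounds,
            long baseDelayMillis, Function<String, Optional<C>> connect) {
        for (int round = 0; round < maxRounds; round++) {
            for (String endpoint : endpoints) {
                Optional<C> connection = connect.apply(endpoint);
                if (connection.isPresent()) {
                    return connection;
                }
            }
            if (round < maxRounds - 1) {
                try {
                    Thread.sleep(baseDelayMillis << round);
                } catch (InterruptedException e) {
                    // Preserve the interrupt and give up instead of retrying further.
                    Thread.currentThread().interrupt();
                    return Optional.empty();
                }
            }
        }
        return Optional.empty();
    }
}
```

Retrying like this hides intermittent failures rather than explaining them, which matches the observation that it helped without fixing the problem.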

My questions are as follows: what can we do going forward, and is there a way to fix a timeout issue like this? What are the common causes of this, and is there something deeper that my team and I are missing? Is there a limit to how many connections a WebSocket server can allow? From googling, I remember seeing that a WebSocket server should allow something like 65,000 connections, so 200 should be easy.

Any and all help would be greatly appreciated!

@PhilipRoman
Collaborator

Hi, 200 clients certainly should not be a problem from a WebSocket point of view. From the given info, I can suggest the following steps:

  • Make sure you are not doing long-running operations in WebSocket callbacks (especially onConnect).
  • Run a bare-minimum WebSocket server that does nothing except count connections, and connect a few hundred clients to it. You should not see any dropped connections as long as your network capacity is sufficient.
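
To illustrate the first suggestion, here is a library-independent sketch of a connection-open callback that only counts the connection and hands slow work to a worker pool. Everything in it is hypothetical: `NonBlockingOpenHandler` and its method names merely resemble typical WebSocket callbacks and are not the API of this or any other library.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical handler: callbacks return immediately, slow work runs on a separate pool.
final class NonBlockingOpenHandler {
    private final ExecutorService workers;
    private final Consumer<String> slowSetup; // e.g. DB lookups; assumed to be slow
    private final AtomicInteger openConnections = new AtomicInteger();
    private final AtomicInteger completedSetups = new AtomicInteger();

    NonBlockingOpenHandler(int workerThreads, Consumer<String> slowSetup) {
        this.workers = Executors.newFixedThreadPool(workerThreads);
        this.slowSetup = slowSetup;
    }

    // Meant to be called from the server's I/O thread: count, enqueue, return.
    void onOpen(String connectionId) {
        openConnections.incrementAndGet();
        workers.submit(() -> {
            slowSetup.accept(connectionId);
            completedSetups.incrementAndGet();
        });
    }

    void onClose(String connectionId) {
        openConnections.decrementAndGet();
    }

    int openConnections() {
        return openConnections.get();
    }

    int completedSetups() {
        return completedSetups.get();
    }

    // Stops accepting work and waits for queued setups; returns false on timeout or interrupt.
    boolean shutdownAndAwait(long timeoutMillis) {
        workers.shutdown();
        try {
            return workers.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The idea is that if the callback itself blocks, the thread that accepts new connections may be held up with it, so new clients could wait long enough to time out. The `openConnections` counter doubles as the connection count from the second suggestion.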
