Sweep mechanism can stop working abruptly #1627
Comments
CometD 5 is at End of Community Support (#1179). Issue #960 was inexplicable, as apparently an exception slipped out of a Please take a JVM thread dump if you can reproduce the issue. I am not aware of any reason why the async sweep would stop running. In case of any exception the sweep is re-scheduled, provided that Let us know if you have more details.
Thanks for the reply. Is there anything from the heap dump that would help understand what could have happened with the sweeper?
Please post the thread dump. Also, it would be useful if you can take a Are you using the HTTP transport or WebSocket?
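For readers who cannot attach an external tool such as `jstack` to the process, a thread dump can also be captured from inside the JVM. The following is a minimal sketch using only the standard `java.lang.management` API; the class name `ThreadDumper` is hypothetical and this is not part of CometD.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, for illustration only; not part of CometD.
class ThreadDumper {
    // Returns a textual dump of all live threads with their full stack traces.
    static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Only request lock details when the JVM supports reporting them.
        boolean monitors = threads.isObjectMonitorUsageSupported();
        boolean synchronizers = threads.isSynchronizerUsageSupported();
        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : threads.dumpAllThreads(monitors, synchronizers)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append(System.lineSeparator());
            // Frames are printed manually because ThreadInfo.toString()
            // truncates long stack traces.
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append(System.lineSeparator());
            }
            out.append(System.lineSeparator());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

In a dump like this, the thread that runs the scheduler is the one to look for: if it is idle and no sweep task is queued, the sweep was never rescheduled; if it is blocked, the stack trace shows where.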
We are on WebSockets. The server has been restarted since and doesn't exhibit the same behavior. I will try to get the BayeuxServer dump if it happens again. Below is a sample thread dump
If it happens again, please perform the Also, consider doing what was done in #960:
Let us know how it goes.
@nagarjun-reddy we have just fixed #1716, which this issue likely duplicates. Please upgrade to the latest CometD version, and report back whether the issue has been fixed.
Thank you @sbordet. It would take some time for us to upgrade. Can this fix be cherry-picked on top of 5.0.14, or does it have other dependencies? Also, I think the symptoms described in #1132 are related to this fix. Would it resolve that issue as well, without needing to configure the maxProcessing parameter?
CometD version(s)
5.0.14
Java version & vendor
openjdk version "11.0.21"
Description
We have encountered a scenario where the sweep stopped working and did not remove any sessions until the application was restarted. There are no logs suggesting what could have happened to the sweep or why it was not rescheduled. This looks similar to #960, a bug filed earlier when the sweep was not yet asynchronous. We are on 5.0.14 and wanted to check whether the async sweep needs similar exception handling, or whether there are any scenarios in which the sweep can exit without rescheduling.
Will add more details as we find.
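To make the question concrete, here is a minimal sketch of the pattern being asked about: a periodic task that reschedules itself in a `finally` block, so that an exception in one run cannot silently end the cycle. This is not CometD's actual sweeper code; the class `ResilientSweeper` and all of its members are hypothetical names built on the standard `java.util.concurrent` API.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; not CometD's sweeper implementation.
class ResilientSweeper {
    private final ScheduledExecutorService scheduler;
    private final Runnable sweepWork;
    private final long periodMillis;
    private final AtomicInteger runs = new AtomicInteger();
    private final AtomicInteger failures = new AtomicInteger();
    private volatile boolean running;

    ResilientSweeper(ScheduledExecutorService scheduler, Runnable sweepWork, long periodMillis) {
        this.scheduler = scheduler;
        this.sweepWork = sweepWork;
        this.periodMillis = periodMillis;
    }

    void start() {
        running = true;
        schedule();
    }

    void stop() {
        running = false;
    }

    int runs() { return runs.get(); }

    int failures() { return failures.get(); }

    // Schedules exactly one future run; each run schedules the next one.
    private void schedule() {
        if (running) {
            scheduler.schedule(this::sweep, periodMillis, TimeUnit.MILLISECONDS);
        }
    }

    private void sweep() {
        try {
            runs.incrementAndGet();
            sweepWork.run();
        } catch (Throwable x) {
            // Record and log the failure instead of letting it escape the task.
            failures.incrementAndGet();
            System.err.println("Sweep failed: " + x);
        } finally {
            // Reschedule whether or not this run failed.
            schedule();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        ResilientSweeper sweeper = new ResilientSweeper(
                scheduler, () -> { throw new IllegalStateException("boom"); }, 50);
        sweeper.start();
        Thread.sleep(500);
        sweeper.stop();
        scheduler.shutdownNow();
        System.out.println("runs=" + sweeper.runs() + ", failures=" + sweeper.failures());
    }
}
```

With this shape, the cycle can still end if `schedule()` itself throws (for example, because the scheduler was shut down) or if the `running` flag is cleared, so those are the conditions worth checking when a sweep stops without any logged exception.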