feat(k8s): add error handling tests in K8s #3736
Conversation
Latency summary
Current PR yields: (breakdown backed by latency-tracking; further commits will update this comment)
Codecov Report
@@ Coverage Diff @@
## master #3736 +/- ##
==========================================
+ Coverage 86.03% 90.01% +3.98%
==========================================
Files 156 156
Lines 11984 11989 +5
==========================================
+ Hits 10310 10792 +482
+ Misses 1674 1197 -477
logger.debug(f'stop sending new requests after {i} requests')
# allow some requests to complete
await asyncio.sleep(10.0)
os.kill(os.getpid(), signal.SIGKILL)
why is this needed?
Because client.post() will block forever waiting for responses that never come (those messages are lost, which is the problem the test is showcasing). Just killing the request process is the easiest way to stop the test here.
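A minimal sketch of the hang being described, assuming a hypothetical `post_request` stand-in for `client.post()`: the call awaits a response that never arrives, so without an external timeout (or killing the process, as the test does) the await never returns.

```python
import asyncio

async def post_request(response_future):
    # stand-in for client.post(): waits for a response that,
    # when the message is lost, never arrives
    return await response_future

async def main():
    loop = asyncio.get_event_loop()
    lost_response = loop.create_future()  # never resolved -> request hangs
    try:
        # without an external timeout (or killing the process),
        # this await would block forever
        await asyncio.wait_for(post_request(lost_response), timeout=0.1)
        return 'responded'
    except asyncio.TimeoutError:
        return 'hung'

outcome = asyncio.run(main())
print(outcome)
```

The timeout here is only so the sketch terminates; the real test has no such escape hatch, which is why the process gets killed instead.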
I will try to get rid of this, though; it seems to break the tests occasionally.
I moved the kill to the parent process. That's still not the most elegant solution, but I think it should be safe now.
@@ -55,13 +56,16 @@ def start_runtime(args, handle_mock, cancel_event):
@pytest.mark.slow
@pytest.mark.timeout(10)
@pytest.mark.parametrize('close_method', ['TERMINATE', 'CANCEL'])
def test_grpc_data_runtime_graceful_shutdown(close_method):
@pytest.mark.asyncio
@pytest.mark.skip('Graceful shutdown is not working at the moment')
Same here. Why is this not working now?
This is the unit test which fails for the same reason as the K8s one, just without K8s. I am canceling and joining the runtime here and demonstrating that not all messages are received.
Before this change, the test was not sufficient to catch this case correctly. The improved version of the test now fails as expected.
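The failure mode the test demonstrates can be sketched in isolation, with a hypothetical `handle` coroutine standing in for the runtime's request handling: cancel in-flight work mid-stream and count how many requests actually completed before the shutdown.

```python
import asyncio

async def handle(i, completed):
    await asyncio.sleep(0.01 * i)  # simulated per-request latency
    completed.append(i)

async def main():
    completed = []
    tasks = [asyncio.ensure_future(handle(i, completed)) for i in range(10)]
    await asyncio.sleep(0.035)  # let only the fastest requests finish
    for t in tasks:             # abrupt (non-graceful) shutdown
        t.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return len(completed)

n_completed = asyncio.run(main())
print(f'{n_completed}/10 requests completed')
```

A graceful shutdown would instead drain the pending tasks before tearing down, so all 10 would complete; the improved unit test asserts exactly that and therefore fails while the bug is present.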
This reverts commit 3c85e12.
Force-pushed from 9379d8f to 5b8e9ed.
This PR mostly adds tests around graceful termination of Runtimes and K8s Pods. It also fixes a bug in the ConnectionPool implementation related to removing connections.
In detail, the following things are done in this PR:
scaling
What is not added here, but should be done in the future:
Closes #3604
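To illustrate the kind of bug that can hide in connection removal, here is a hypothetical round-robin pool (not the actual Jina ConnectionPool code): removing a connection must also adjust the round-robin cursor, or subsequent calls can skip a connection or index past the end of the shrunken list.

```python
from typing import List

class RoundRobinPool:
    """Illustrative round-robin connection pool; addresses are plain strings."""

    def __init__(self):
        self._connections: List[str] = []
        self._rr_index = 0

    def add(self, address: str):
        if address not in self._connections:
            self._connections.append(address)

    def remove(self, address: str):
        if address not in self._connections:
            return
        idx = self._connections.index(address)
        self._connections.remove(address)
        # without this adjustment, the cursor can skip the next
        # connection or fall outside the shortened list
        if idx < self._rr_index:
            self._rr_index -= 1
        self._rr_index = self._rr_index % len(self._connections) if self._connections else 0

    def next(self) -> str:
        conn = self._connections[self._rr_index]
        self._rr_index = (self._rr_index + 1) % len(self._connections)
        return conn
```

Usage: after `add('a')`, `add('b')`, `add('c')` and one `next()` call returning `'a'`, removing `'a'` leaves the cursor pointing at `'b'`, so rotation continues without skipping.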