
fix: portforward connection cleanup #973

Merged
merged 6 commits into kube-rs:master from tiagolobocastro:portforward-fix on Sep 7, 2022

Conversation


@tiagolobocastro tiagolobocastro commented Aug 3, 2022


Motivation

In some situations, connections were observed lingering in a CLOSE_WAIT or even ESTABLISHED state.
In others, an existing connection would be reused and then fail at the time of use.

Solution

When the reader half is closed, signal the forwarder_loop task.
When a WebSocket close message is received, signal the forwarder_loop task so it can shut down all write halves.
Close the untaken streams when joining so the task can terminate.

When a message is received indicating a closed socket, shut down all write halves.
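
A minimal sketch of the signalling described above (the message enum, the mpsc wiring, and all names here are illustrative, not the actual kube-rs code), assuming tokio duplex streams back the per-port channels:

```rust
use tokio::io::{AsyncWriteExt, DuplexStream, WriteHalf};
use tokio::sync::mpsc;

// Illustrative only: message type and wiring are not the actual kube-rs code.
enum ForwarderMessage {
    /// A local reader half was closed; that channel is finished.
    ReaderClosed { channel: usize },
    /// The server sent a WebSocket close frame.
    SocketClose,
}

// Sketch of a forwarder loop reacting to those signals by shutting down the
// remaining write halves instead of leaving sockets half-open.
async fn forwarder_loop(
    mut rx: mpsc::Receiver<ForwarderMessage>,
    mut writers: Vec<Option<WriteHalf<DuplexStream>>>,
) {
    while let Some(msg) = rx.recv().await {
        match msg {
            ForwarderMessage::ReaderClosed { channel } => {
                // Drop this channel's writer; once every channel is done the
                // loop can send its own close frame (omitted here) and stop.
                writers[channel] = None;
                if writers.iter().all(Option::is_none) {
                    break;
                }
            }
            ForwarderMessage::SocketClose => {
                // Shut down every remaining write half so local peers see EOF
                // rather than a connection stuck in CLOSE_WAIT/ESTABLISHED.
                for mut writer in writers.iter_mut().filter_map(Option::take) {
                    let _ = writer.shutdown().await;
                }
                break;
            }
        }
    }
}
```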


codecov-commenter commented Aug 3, 2022

Codecov Report

Merging #973 (3be996b) into master (57d97ce) will decrease coverage by 0.46%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##           master     #973      +/-   ##
==========================================
- Coverage   72.58%   72.12%   -0.47%     
==========================================
  Files          64       64              
  Lines        4505     4534      +29     
==========================================
  Hits         3270     3270              
- Misses       1235     1264      +29     
| Impacted Files | Coverage Δ |
| --- | --- |
| kube-client/src/api/portforward.rs | 0.00% <0.00%> (ø) |
| kube-core/src/resource.rs | 44.18% <0.00%> (-2.33%) ⬇️ |
| kube-runtime/src/utils/mod.rs | 58.62% <0.00%> (ø) |
| kube/src/lib.rs | 87.59% <0.00%> (+0.18%) ⬆️ |


kazk commented Aug 3, 2022

I think we should stop forwarder_loop:

  1. When receiving close WebSocket message from server
    • from_pod_loop should notify forwarder_loop and stop
    • forwarder_loop can stop (break) when this happens
  2. When all of to_pod_loop are done
    • each to_pod_loop should notify forwarder_loop and stop
    • forwarder_loop should keep track of the active ones. When all of them are done, send close WebSocket message to the server, then stop

I don't think it's necessary to call shutdown, but I'm not confident. I thought dropping them was enough.
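
A rough sketch of the bookkeeping this proposal implies (purely illustrative, not kube-rs code): the forwarder counts the active to_pod_loops and only sends the final close message once every one of them has reported in.

```rust
// Illustrative tracker for the proposal above (not the actual kube-rs code):
// count active `to_pod_loop`s and decide when the forwarder should send the
// final WebSocket close message and stop.
struct ActiveChannels {
    remaining: usize,
}

impl ActiveChannels {
    fn new(ports: usize) -> Self {
        Self { remaining: ports }
    }

    /// Called when one `to_pod_loop` reports that it is done; returns true
    /// once all of them have finished, i.e. when the forwarder may send
    /// `close` to the server and break out of its loop.
    fn channel_done(&mut self) -> bool {
        self.remaining = self.remaining.saturating_sub(1);
        self.remaining == 0
    }
}
```

Inside the forwarder loop this would be used roughly as `if active.channel_done() { /* send the WebSocket close message, then break */ }`.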

@tiagolobocastro tiagolobocastro force-pushed the portforward-fix branch 4 times, most recently from 04818a3 to 67d7821 on August 4, 2022 10:55
@tiagolobocastro tiagolobocastro marked this pull request as ready for review August 4, 2022 10:56
@tiagolobocastro tiagolobocastro force-pushed the portforward-fix branch 2 times, most recently from 70c8b8c to 491bbf2 on August 4, 2022 11:32

tiagolobocastro commented Aug 4, 2022

> I think we should stop forwarder_loop:
>
> 1. When receiving `close` WebSocket message from server
>    * `from_pod_loop` should notify `forwarder_loop` and stop
>    * `forwarder_loop` can stop (`break`) when this happens
> 2. When all of `to_pod_loop` are done
>    * each `to_pod_loop` should notify `forwarder_loop` and stop
>    * `forwarder_loop` should keep track of the active ones. When all of them are done, send `close` WebSocket message to the server, then stop
>
> I don't think it's necessary to call shutdown, but I'm not confident. I thought dropping them was enough.

Ok I think I'm there now! Would you mind taking another look?

In some situations it was observed that connections lingered in
a CLOSE_WAIT state or even ESTABLISHED state.
In others, an existing connection would be reused and would fail at the time of use.

When the reader half is closed, signal the forwarder_loop task.
When a WebSocket close message is received, signal the forwarder_loop task so it can shut down all write halves.
Close the untaken streams when joining so the task can terminate.

Signed-off-by: Tiago Castro <tiagolobocastro@gmail.com>

nightkr commented Aug 4, 2022

Just a minor note (especially since this PR is fairly small), but please don't force-push to a PR that is already being reviewed. It makes it much harder to keep track of review progress.

bors bot pushed a commit to openebs/mayastor-control-plane that referenced this pull request Aug 8, 2022
288: feat(k8s): add proxy library for http and tcp r=tiagolobocastro a=tiagolobocastro

feat(k8s): add proxy library with portforward

Add new k8s proxy library which is capable of setting up port forwarding to a port.
A server is bound to a local port and each connection creates a new port forward to the
configured pod target.
Based on the kube-rs port bind example :+1:

Found a few bugs with the port-forward and raised a PR to attempt fixing them: kube-rs/kube#973

Signed-off-by: Tiago Castro <tiagolobocastro@gmail.com>



Co-authored-by: Tiago Castro <tiagolobocastro@gmail.com>
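
For context, a condensed sketch of that bind-and-forward pattern, close in spirit to the kube-rs port-forward examples; the bind address, pod name, port, and the use of anyhow are assumptions here, not the mayastor-control-plane code:

```rust
use k8s_openapi::api::core::v1::Pod;
use kube::api::Api;
use tokio::net::TcpListener;

// Bind a local port; each accepted connection gets its own port-forward
// to the configured pod target.
async fn serve(pods: Api<Pod>, pod_name: &str, pod_port: u16) -> anyhow::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:8080").await?;
    loop {
        let (mut local_conn, _) = listener.accept().await?;
        let pods = pods.clone();
        let pod_name = pod_name.to_owned();
        tokio::spawn(async move {
            // Open a fresh port-forward for this connection.
            let mut pf = pods.portforward(&pod_name, &[pod_port]).await?;
            let mut upstream = pf
                .take_stream(pod_port)
                .ok_or_else(|| anyhow::anyhow!("stream already taken"))?;
            // Proxy bytes in both directions until either side closes.
            tokio::io::copy_bidirectional(&mut local_conn, &mut upstream).await?;
            drop(upstream);
            // Wait for the forwarder task to finish cleanly.
            pf.join().await?;
            anyhow::Ok(())
        });
    }
}
```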
@tiagolobocastro
Contributor Author

Hey @kazk and @teozkr, did you have any other comments on this? It's fine if you're busy, just in case you forgot :D Thank you


nightkr commented Aug 17, 2022

Hey, sorry. I'm honestly not quite comfortable reviewing this without a better way to test what's going on. I was working on that but did, indeed, end up forgetting about the PR. Sorry... :/

@tiagolobocastro
Contributor Author

No problem at all! Was just checking :)
I started changing the examples to test this. Strangely, one of the issues I see with my application doesn't show up with the nginx deployment from the examples: the connection doesn't get terminated after 6s, so the second request after 10s, which fails with my application, doesn't fail with nginx. Odd...

Anyway, I can reproduce the half-closed connections:
https://gist.github.com/tiagolobocastro/4890040c9fc1f5bb526e5de5a6d0cdea
(which took me a long time because I was testing it on my branch where it's fixed... :)
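
For readers following along, a hypothetical sketch of the kind of reuse scenario being described (the actual repro is the gist above; the pod name, port, raw HTTP request, and anyhow usage are assumptions):

```rust
use k8s_openapi::api::core::v1::Pod;
use kube::api::Api;
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::time::{sleep, Duration};

// Hypothetical reuse scenario: the backend closes its side after an idle
// timeout, but the forwarded stream is reused for a second request anyway.
async fn reuse_after_idle(pods: Api<Pod>) -> anyhow::Result<()> {
    let mut pf = pods.portforward("example-pod", &[80]).await?;
    let mut stream = pf.take_stream(80).expect("stream available");

    let req = b"GET / HTTP/1.1\r\nHost: 127.0.0.1\r\nConnection: keep-alive\r\n\r\n";
    stream.write_all(req).await?;
    let mut buf = vec![0u8; 4096];
    let n = stream.read(&mut buf).await?;
    println!("first response: {n} bytes");

    // Wait past the backend's idle timeout; without proper cleanup the local
    // half can stay open even though the remote side has already gone away.
    sleep(Duration::from_secs(10)).await;

    stream.write_all(req).await?;
    let n = stream.read(&mut buf).await?;
    println!("second response: {n} bytes (0 would mean the peer closed)");

    drop(stream);
    pf.join().await?;
    Ok(())
}
```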

Member

@nightkr nightkr left a comment

Ok, finally got back to this (sorry!), and can confirm both the issue and that the PR fixes the issue.

A few nits, but happy to merge after that!

Two review threads on kube-client/src/api/portforward.rs (outdated, resolved)

nightkr commented Aug 25, 2022

I uploaded my repro to https://gitlab.com/teozkr/repros/-/tree/kube/portfwd-halfclose
@tiagolobocastro
Contributor Author

> I uploaded my repro to https://gitlab.com/teozkr/repros/-/tree/kube/portfwd-halfclose

Awesome thank you!
(I see a shell.nix, good to see other nix users :D )


nightkr commented Aug 26, 2022

There are dozens of us!

Use a ChannelState with initialized and shutdown flags rather than 2 vecs.
Drop the errors when joining the portforward task.
Break on SocketClose to make it clearer.

Signed-off-by: Tiago Castro <tiagolobocastro@gmail.com>
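
A minimal sketch of what such a per-channel state could look like (field and method names are guesses for illustration, not the actual code):

```rust
/// Illustrative per-channel bookkeeping replacing two parallel Vecs:
/// one flag for "stream taken/initialized", one for "already shut down".
#[derive(Default, Clone, Copy)]
struct ChannelState {
    initialized: bool,
    shutdown: bool,
}

impl ChannelState {
    /// Only channels that were initialized and not yet shut down need an
    /// explicit shutdown of their write half.
    fn needs_shutdown(&self) -> bool {
        self.initialized && !self.shutdown
    }
}
```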
@tiagolobocastro
Contributor Author

I've addressed the comments, please take a look. Thank you.

Use the channel index to reference the shutdown port index, keeping it consistent with the initialized check.
This is because even channel indices are for data and odd indices are for errors.

Signed-off-by: Tiago Castro <tiagolobocastro@gmail.com>
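
For reference, the port-forward protocol multiplexes two channels per requested port, which is the even/odd split this commit refers to. A tiny illustrative sketch of the index math (helper names are made up):

```rust
/// Channel layout in the port-forward protocol: for the i-th requested port,
/// channel 2*i carries data and channel 2*i + 1 carries its errors.
fn port_index(channel: usize) -> usize {
    channel / 2
}

fn is_error_channel(channel: usize) -> bool {
    channel % 2 == 1
}
```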
Use a single channel reference to avoid indexing into the vector every time.

Signed-off-by: Tiago Castro <tiagolobocastro@gmail.com>
Member

@nightkr nightkr left a comment

LGTM, waiting for @kazk before merging since they also seemed to have opinions

Member

@kazk kazk left a comment

Looks good to me. Thanks!

@nightkr nightkr merged commit 95397a5 into kube-rs:master Sep 7, 2022

nightkr commented Sep 7, 2022

Thanks!

@clux clux added this to the 0.75.0 milestone Sep 8, 2022
@clux clux added the changelog-fix changelog fix category for prs label Sep 8, 2022
@clux clux changed the title fix: portforward connections fix: portforward connection cleanup Sep 8, 2022