kms: Add semaphore to limit concurrency #3693
generateCipher is memory heavy, so to avoid OOM situations, a semaphore is added to limit concurrency here.

Signed-off-by: Lennart Jern <lennart.jern@est.tech>
I think this is good to have; it adds to the stability of ceph-csi when multiple volumes are created at the same time. Volume creation is not a critical part of a workflow when running applications, so a slight delay while generating the cipher should be acceptable in exchange for increased stability.
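For discussion, the trade-off described above (a short wait in exchange for bounded memory use) can be sketched as follows. This is a minimal sketch, not the code from this PR: it uses a buffered channel as a counting semaphore so that it stays within the standard library, whereas the go.mod change suggests the PR pulls in golang.org/x/sync. The names limitedGenerate, cipherSlots, maxConcurrentCiphers and peakConcurrency are hypothetical, and the sleep is only a stand-in for the memory-heavy work.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// maxConcurrentCiphers is a hypothetical limit chosen for illustration;
// it is not a value taken from the PR.
const maxConcurrentCiphers = 2

// cipherSlots is a counting semaphore built from a buffered channel.
var cipherSlots = make(chan struct{}, maxConcurrentCiphers)

// limitedGenerate runs a memory-heavy function while holding one slot,
// so at most maxConcurrentCiphers calls execute at the same time.
func limitedGenerate(heavy func() string) string {
	cipherSlots <- struct{}{}        // acquire: blocks while all slots are taken
	defer func() { <-cipherSlots }() // release the slot when done
	return heavy()
}

// peakConcurrency starts n concurrent callers and returns the highest
// number of heavy calls that were observed running simultaneously.
func peakConcurrency(n int) int64 {
	var running, peak int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			limitedGenerate(func() string {
				cur := atomic.AddInt64(&running, 1)
				// Record a new peak if this call raised it.
				for {
					old := atomic.LoadInt64(&peak)
					if cur <= old || atomic.CompareAndSwapInt64(&peak, old, cur) {
						break
					}
				}
				time.Sleep(10 * time.Millisecond) // stand-in for the expensive work
				atomic.AddInt64(&running, -1)
				return "cipher"
			})
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&peak)
}

func main() {
	fmt.Println("peak concurrent calls:", peakConcurrency(8))
}
```

Callers beyond the limit simply block until a slot frees up, which is the "slight delay" mentioned above; the observed peak never exceeds the configured limit.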
@@ -47,6 +47,8 @@ require (
 	sigs.k8s.io/controller-runtime v0.14.4
 )

+require golang.org/x/sync v0.0.0-20220722155255-886fb9371eb4
nit: this could be part of the above require block
commitlint fails because
@lentzi90 Is there any estimate of how "heavy" the [...] Because, going by comment [1] on the related issue, the oom-killer was invoked by [...]

It seems to me that a solution could be to limit the number of concurrent NodeStageVolume calls. Looking at the code, I suppose the place is this loop [2], where we spawn all NodeStageVolume calls concurrently [3] and then just wait for them to complete. It could instead run them in configurable chunks, e.g. no more than 10 calls simultaneously.

[1] #3472 (comment)
[2] ceph-csi/internal/rbd/rbd_healer.go Line 179 in 991c21f
[3] ceph-csi/internal/rbd/rbd_healer.go Line 219 in 991c21f
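The chunking idea from this comment could look roughly like the sketch below. It is an assumption-laden illustration, not the healer code in rbd_healer.go: stageInChunks, the stage callback and the volume IDs are all hypothetical, and the real NodeStageVolume call takes a context and a request rather than a plain string.

```go
package main

import (
	"fmt"
	"sync"
)

// stageInChunks calls stage for every volume ID, running at most chunkSize
// calls at a time and waiting for each chunk to finish before starting the
// next one. The returned slice holds the error for each ID at the same index.
func stageInChunks(volumeIDs []string, chunkSize int, stage func(id string) error) []error {
	if chunkSize < 1 {
		chunkSize = 1 // guard against a zero or negative setting
	}
	errs := make([]error, len(volumeIDs))
	for start := 0; start < len(volumeIDs); start += chunkSize {
		end := start + chunkSize
		if end > len(volumeIDs) {
			end = len(volumeIDs)
		}
		var wg sync.WaitGroup
		for i := start; i < end; i++ {
			wg.Add(1)
			go func(i int) {
				defer wg.Done()
				// Each goroutine writes only its own slot, so no lock is needed.
				errs[i] = stage(volumeIDs[i])
			}(i)
		}
		wg.Wait() // the whole chunk must complete before the next one starts
	}
	return errs
}

func main() {
	ids := []string{"vol-a", "vol-b", "vol-c", "vol-d", "vol-e"}
	errs := stageInChunks(ids, 2, func(id string) error {
		if id == "vol-c" {
			return fmt.Errorf("staging %s failed", id)
		}
		return nil
	})
	for i, err := range errs {
		fmt.Println(ids[i], "error:", err)
	}
}
```

One design note: strict chunks let a single slow call hold up the rest of its chunk, so a semaphore-style limit may keep the workers busier; chunking is shown here only because it is what the comment proposes.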
@trociny like I mentioned on the issue already, I think this is a different issue. I just pushed the code that I already had for the original issue (which seems to be fixed already).
This pull request now has conflicts with the target branch. Could you please resolve conflicts and force push the corrected changes? 🙏
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.
This pull request has been automatically closed due to inactivity. Please re-open if these changes are still required.
generateCipher is memory heavy, so to avoid OOM situations, a semaphore is added to limit concurrency here.
Fixes: #3472
NOTE: I think the original issue is already resolved (see discussion in the issue). I pushed this anyway because I had already done the work, and I hope it can be useful in other situations or just for discussion. Feel free to close!
Show available bot commands
These commands are normally not required, but in case of issues, leave any of the following bot commands in an otherwise empty comment in this PR:
/retest ci/centos/<job-name>: retest the <job-name> after an unrelated failure (please report the failure too!)
/retest all: run this in case the CentOS CI failed to start/report any test progress or results