Optimization Potential: Live Calculation of Available Storage for allocation of Pods #873
@jakobmoellerdev, sorry for the late reply. I don't know if the method you are suggesting will completely solve the problem. There may be a delay in reflecting the capacity in the Node object, but capacity can also be reflected late if another request arrives between the creation of the PV and the logicalvolume. Also, in the past our team has encountered scalability issues with the lvm command. So instead of querying lvmd, how about monitoring the ioctl interface with epoll or similar, or using uevents?
In our practical experience, the most common issue we have noticed is the capacity annotation not being updated fast enough. I'm not sure what you mean by a delay due to a request between PV creation and LV creation. Could you elaborate on that scenario a bit more?
I would like to understand and potentially address these scalability problems. LVM is written in a way where there should not be any issues with the command if it is used properly. Also, while TopoLVM doesn't use this mode, LVM can be run continuously in one process to avoid fork overhead.
In addition to delays in updating Node annotations, there are several other situations where problems can occur. One of them is the PV/logical-volume delay I mentioned above. If we adopt your suggestion, TopoLVM will behave as follows:
Processes 1-3 may operate concurrently. If multiple requests (A, B) come in at the same time, request B will likely execute step 1 before request A executes step 3. The greater the number of requests, the more likely this is to occur. Therefore, while the method you suggest may have some effect, it may not be a fundamental solution. On the point of the scalability of LVM commands: we have had problems in the past with lockups when issuing a large number of commands in a short period of time. This may have been resolved by now, but it is safer not to increase the frequency of command execution too much.
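The race described above can be illustrated with a small, self-contained sketch. The variable names and toy numbers here are hypothetical (the real scheduler reads the Node capacity annotation, not an in-memory value), but the interleaving is the same: request B checks capacity against a value that request A has not yet refreshed.

```go
package main

import "fmt"

func main() {
	// Toy numbers: a VG with 100 free units and two 60-unit requests.
	actual := int64(100) // real free space in the volume group
	reported := actual   // stale copy, like the Node capacity annotation

	fits := func(req int64) bool { return req <= reported }

	// Request A: step 1 (capacity check against the annotation) passes,
	// then step 2 (LV creation) consumes real space.
	okA := fits(60)
	actual -= 60

	// Request B runs its step 1 before A's step 3 (annotation update)
	// lands, so it still sees the stale value of 100 and is admitted too.
	okB := fits(60)

	// Step 3 for A: the annotation finally catches up.
	reported = actual

	fmt.Println(okA, okB)    // true true — both requests were admitted
	fmt.Println(actual - 60) // -20 — B's creation would overcommit the VG
}
```

Updating the annotation immediately after each LV creation shrinks the window between steps 2 and 3 but, as noted, does not close it entirely.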
I do not see an issue with running this command concurrently, because the Node server is locked with a mutex and thus will only ever handle one request at a time on a node. I may be missing something here, though.
This is the first time I have heard of this happening. I think it is more likely a side effect of the aforementioned mutex and of not using lvm in interactive mode, but without a detailed trace it's hard to determine. By default, though, I would argue that lvm probably isn't the problem.
Sorry, but my reply will be delayed by about a week. I don't believe this is an urgent matter, but if it is, please let me know.
The current implementation seems to annotate the VG capacity on the Node only once every 10 minutes, so calling topolvm-node every time an LV is provisioned and updating the annotations on the Node might be effective enough. What do you think?
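One possible shape for that suggestion, as a hedged sketch (the interface and type names below are hypothetical, not TopoLVM's actual internals): push a fresh annotation value as part of each provisioning call instead of waiting for the periodic resync.

```go
package main

import "fmt"

// Hypothetical interfaces standing in for TopoLVM internals.
type vgStats interface{ FreeBytes() (int64, error) }
type nodeAnnotator interface{ SetCapacity(bytes int64) error }

// provisionLV creates the LV, then immediately refreshes the Node
// capacity annotation rather than waiting for the periodic resync.
func provisionLV(create func() error, vg vgStats, node nodeAnnotator) error {
	if err := create(); err != nil {
		return err
	}
	free, err := vg.FreeBytes()
	if err != nil {
		return err
	}
	return node.SetCapacity(free)
}

// Stub implementations so the sketch runs standalone.
type fakeVG struct{ free int64 }

func (f fakeVG) FreeBytes() (int64, error) { return f.free, nil }

type fakeNode struct{ last int64 }

func (n *fakeNode) SetCapacity(b int64) error { n.last = b; return nil }

func main() {
	node := &fakeNode{}
	err := provisionLV(func() error { return nil }, fakeVG{free: 40}, node)
	fmt.Println(err == nil, node.last) // true 40
}
```

The design choice here is to make the annotation refresh part of the provisioning path itself, so the stale window is bounded by one LV creation rather than the 10-minute tick.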
I think that is sufficient for most purposes, but I'm not sure how you would want to implement that. Could you give me a rough outline so I understand better?
Sorry for the late reply. I need to check a few things, so please wait a bit longer for a reply. |
What should the feature do:
Currently TopoLVM uses the annotation it sets on every Node to report available capacity. While this is a great feature, TopoLVM also uses this capacity annotation to determine how much storage is available on a Node before deciding whether a Volume can be allocated.
I would like to propose adding a new RPC call that can be used to fetch each node's capacity from the topolvm daemonset instead of relying on the Node object in Kubernetes.
What is use case behind this feature:
The available capacity annotation is not always up to date, so it might not reflect the actual capacity on the node if the underlying volume group has changed within the annotation update interval. This method would make sure that TopoLVM always uses the latest available capacity instead of the last value written to the annotation, allowing faster reconciliation of PVCs in environments where the underlying volume group capacity changes.