
Optimization Potential: Live Calculation of Available Storage for allocation of Pods #873

Open
jakobmoellerdev opened this issue Apr 4, 2024 · 8 comments

@jakobmoellerdev
Contributor

What should the feature do:

Currently, TopoLVM sets an annotation on every Node it manages to report the available capacity. While this is a great feature, TopoLVM also reads this capacity annotation to determine how much storage is available on a Node before deciding whether a Volume can be allocated.

I would like to propose adding a new RPC call that can be used to fetch each node's capacity from the topolvm DaemonSet directly instead of relying on the Node object in Kubernetes.
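To sketch the idea, a caller on the node could query lvmd's gRPC API directly instead of reading the annotation. The snippet below is only an illustration: the socket path, the import path, and the exact request/response fields of `VGService.GetFreeBytes` are assumptions and may differ from the actual proto.

```go
package main

import (
	"context"
	"fmt"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"

	// Import path is an assumption; it varies between TopoLVM versions.
	"github.com/topolvm/topolvm/lvmd/proto"
)

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
	defer cancel()

	// lvmd serves gRPC on a Unix socket; the path is deployment-specific.
	conn, err := grpc.DialContext(ctx, "unix:///run/topolvm/lvmd.sock",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// Ask lvmd for the free bytes of a device class right now, instead
	// of trusting the last value written to the Node annotation.
	client := proto.NewVGServiceClient(conn)
	resp, err := client.GetFreeBytes(ctx, &proto.GetFreeBytesRequest{DeviceClass: "ssd"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("live free bytes on this node: %d\n", resp.GetFreeBytes())
}
```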

What is the use case behind this feature:

The available capacity annotation is only updated periodically, so it may not reflect the actual capacity on the node if the underlying volume group has changed between updates. Querying directly would ensure that TopoLVM always uses the latest available capacity instead of the last value written to the annotation. This would allow faster reconciliation of PVCs in environments where the underlying volume group capacity changes.

@pluser
Contributor

pluser commented Apr 19, 2024

@jakobmoellerdev, sorry for the late reply.

I'm not sure the method you are suggesting will completely solve the problem. There may be a delay in reflecting the capacity in the Node object, but a delay can also occur if another request arrives between the creation of the PV and the creation of the LogicalVolume.

Also, our team has encountered scalability issues with the lvm command in the past. So instead of querying lvmd, how about monitoring the ioctl interface with epoll or similar, or using uevents?
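As a very rough illustration of the uevent idea (an assumption-laden outline, not a concrete proposal), a Go process could subscribe to kernel uevents over netlink and react to device-mapper block device events:

```go
package main

import (
	"fmt"
	"os"
	"strings"

	"golang.org/x/sys/unix"
)

func main() {
	// Subscribe to kernel uevents (netlink multicast group 1).
	fd, err := unix.Socket(unix.AF_NETLINK, unix.SOCK_RAW, unix.NETLINK_KOBJECT_UEVENT)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer unix.Close(fd)

	if err := unix.Bind(fd, &unix.SockaddrNetlink{Family: unix.AF_NETLINK, Groups: 1}); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	buf := make([]byte, 64*1024)
	for {
		n, _, err := unix.Recvfrom(fd, buf, 0)
		if err != nil {
			continue
		}
		// A uevent is a NUL-separated list of KEY=VALUE pairs, with an
		// "ACTION@devpath" summary as the first field.
		fields := strings.Split(string(buf[:n]), "\x00")
		for _, f := range fields {
			// dm-* block devices cover LVM logical volumes; an event on
			// one of them hints that VG/LV state may have changed.
			if strings.HasPrefix(f, "DEVNAME=dm-") {
				fmt.Println("device-mapper event:", fields[0], f)
			}
		}
	}
}
```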

@jakobmoellerdev
Contributor Author

There may be a delay in reflecting the capacity in the Node object, but a delay can also occur if another request arrives between the creation of the PV and the creation of the LogicalVolume.

In our practical application, we have noticed that the most common issue is the capacity annotation not being updated fast enough. I'm not sure what you mean by a delay due to a request between PV creation and LogicalVolume creation. Could you elaborate on the scenario a bit more?

Also, our team has encountered scalability issues with the lvm command in the past. So instead of querying lvmd, how about monitoring the ioctl interface with epoll or similar, or using uevents?

I would like to understand and potentially address these scalability problems. LVM is written in a way that should not cause any issues with the command if it is used properly. Also, although TopoLVM doesn't currently use it, LVM can be run continuously in one process (its interactive shell mode) to avoid fork overhead.
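For illustration, driving one long-lived lvm shell from Go could look roughly like this (assuming the lvm binary is on PATH and accepts commands on stdin; error handling trimmed):

```go
package main

import (
	"bufio"
	"fmt"
	"os/exec"
)

func main() {
	// Start one lvm process in its interactive shell mode instead of
	// forking a fresh process per command.
	cmd := exec.Command("lvm")
	stdin, _ := cmd.StdinPipe()
	stdout, _ := cmd.StdoutPipe()
	if err := cmd.Start(); err != nil {
		panic(err)
	}

	// Several commands flow through the same process.
	fmt.Fprintln(stdin, "vgs --units b --nosuffix")
	fmt.Fprintln(stdin, "lvs --units b --nosuffix")
	stdin.Close() // EOF ends the shell

	sc := bufio.NewScanner(stdout)
	for sc.Scan() {
		fmt.Println(sc.Text())
	}
	_ = cmd.Wait()
}
```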

@pluser
Contributor

pluser commented Apr 22, 2024

In addition to delays in updating Node annotations, there are several other situations where problems can occur. One of them is the PV/LogicalVolume delay I mentioned.

If I understand your suggestion correctly, TopoLVM would behave as follows:

  1. lvmd creates the LVM logical volume
  2. a PersistentVolume resource corresponding to that logical volume is created
  3. the annotation representing the node's capacity is updated for each provisioning

Steps 1-3 may run concurrently. If multiple requests (A and B) come in at the same time, request B may execute step 1 before request A executes step 3. The greater the number of requests, the more likely this is to occur, as the toy model below illustrates. Therefore, while the method you suggest may have some effect, it may not be a fundamental solution.
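A toy model of that race (hypothetical code, not TopoLVM's): two concurrent requests allocate from the VG and then publish the remaining capacity, and depending on the interleaving the published value may already be stale the moment it is written.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

func main() {
	var free, published atomic.Int64 // free bytes in the VG / value on the annotation
	free.Store(100)
	published.Store(100)

	var wg sync.WaitGroup
	for _, size := range []int64{30, 50} { // requests A and B
		wg.Add(1)
		go func(size int64) {
			defer wg.Done()
			f := free.Add(-size) // step 1: lvmd allocates the LV
			published.Store(f)   // step 3: publish remaining capacity
		}(size)
	}
	wg.Wait()

	// Depending on the interleaving, "published" may still read 70 even
	// though only 20 bytes are actually free.
	fmt.Println("actual free:", free.Load(), "published:", published.Load())
}
```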

More to the point, regarding the scalability of LVM commands, we have had problems in the past with lockups when issuing a large number of commands in a short period of time. This phenomenon may have been resolved by now, but it is safer not to increase the frequency of command execution too much.

@jakobmoellerdev
Contributor Author

Steps 1-3 may run concurrently. If multiple requests (A and B) come in at the same time, request B may execute step 1 before request A executes step 3.

I do not see an issue with running this concurrently, because the Node server is locked with a mutex and will therefore only ever handle one request at a time on a node (see the sketch below). I may be missing something here, though.
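What I mean by the serialization, as a minimal sketch (hypothetical names, not the actual TopoLVM types):

```go
package main

import "sync"

// nodeServer is a hypothetical stand-in for the per-node gRPC server;
// the real TopoLVM types differ.
type nodeServer struct {
	mu sync.Mutex
}

// CreateVolume serializes the whole allocate-then-publish sequence, so
// requests A and B from the example above cannot interleave between
// steps 1 and 3 on the same node.
func (s *nodeServer) CreateVolume(sizeBytes int64) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	// ... allocate the LV and publish the remaining capacity here ...
	return nil
}

func main() {
	s := &nodeServer{}
	_ = s.CreateVolume(1 << 30)
}
```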

More to the point, regarding the scalability of LVM commands, we have had problems in the past with lockups when issuing a large number of commands in a short period of time.

This is the first time I have heard of this happening. I think it is more likely a side effect of the aforementioned mutex and of not using lvm in interactive mode, but without a detailed trace it's hard to determine. By default, though, I would argue that lvm probably isn't the problem.

@pluser
Contributor

pluser commented Apr 30, 2024

Sorry, but my reply will be delayed by about a week. I don't believe this is an urgent matter, but if it is, please let me know.

@pluser
Contributor

pluser commented May 10, 2024

The current implementation seems to write the VG capacity annotation to the Node only once every 10 minutes, so simply having topolvm-node update the annotation on the Node every time an LV is provisioned might be effective enough. What do you think?
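Roughly, I imagine something like the sketch below: topolvm-node patches the capacity annotation immediately after each provisioning instead of waiting for the periodic resync. The helper name and the annotation key are assumptions for illustration only.

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// refreshCapacity patches only the capacity annotation on the Node right
// after an LV is provisioned. The annotation key mirrors TopoLVM's
// capacity annotation convention; the exact key may differ by version.
func refreshCapacity(ctx context.Context, cs kubernetes.Interface, node string, freeBytes int64) error {
	patch := []byte(fmt.Sprintf(
		`{"metadata":{"annotations":{"capacity.topolvm.io/ssd":"%d"}}}`, freeBytes))
	_, err := cs.CoreV1().Nodes().Patch(ctx, node, types.StrategicMergePatchType, patch, metav1.PatchOptions{})
	return err
}

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	cs := kubernetes.NewForConfigOrDie(cfg)
	if err := refreshCapacity(context.Background(), cs, "worker-1", 1<<30); err != nil {
		panic(err)
	}
}
```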

@jakobmoellerdev
Contributor Author

I think that is sufficient for most purposes, but beyond a rough sketch like the one above I'm not sure how you would want to implement it. Could you give me an outline so I understand better?

@pluser
Contributor

pluser commented May 27, 2024

Sorry for the late reply. I need to check a few things, so please wait a bit longer for a reply.
