Fully non-blocking distributed bucket #326
-
Hello, @vladimir-bukhtoyarov! I like the library a lot, but I can't figure out whether a few scenarios can be achieved with the current features. If not, I would be grateful for any guidance on how to implement them on top of the existing source code. I have been battling these for weeks, and I really tried my best before posting this.

Q1: I have a distributed system, and I have chosen Ignite as my cache cluster (probably not relevant, it could be any other). There are no sticky sessions, and the load isn't always evenly distributed. This is what I want the workflow to be like:
I have found Optimization.delaying, which does almost exactly that. However, if I understand it correctly, there is one huge difference: it returns a CompletableFuture, and if the bucket sees that it is time to synchronize, the future won't complete until the synchronization finishes. I cannot afford these kinds of delays. I want the actual cluster update to be issued behind the scenes (with the local bucket updated as it completes), while the local value is always instantly returned from tryConsume(). I wanted to try something like this:
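Something along these lines (a rough sketch: the slow remote call and the local answer are stand-ins, only CompletableFuture.completeOnTimeout is real JDK 9+ API):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class LocalFirstConsume {
    public static void main(String[] args) throws Exception {
        // stand-in for AsyncBucketProxy#tryConsume hitting the slow cache cluster
        CompletableFuture<Boolean> remote = CompletableFuture.supplyAsync(() -> {
            try { Thread.sleep(500); } catch (InterruptedException ignored) { }
            return true; // the cluster-wide decision, arriving too late for the caller
        });

        // answer from local state if the remote decision is not in yet
        boolean localAnswer = true; // stand-in for a local bucket's tryConsume(1)
        CompletableFuture<Boolean> answer =
                remote.completeOnTimeout(localAnswer, 10, TimeUnit.MILLISECONDS);

        System.out.println(answer.get()); // the local answer wins the race
    }
}
```

The catch: completeOnTimeout completes the very same future, so when the real remote result shows up 500 ms later it is silently dropped.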
However, I believe a CompletableFuture cannot be completed multiple times, so if the future actually completes on timeout, the synchronization result that arrives later may be processed incorrectly. Moreover, this solution implies that I always return true if the cache cluster is unreachable, which is far worse than continuing to serve locally. Note that if the bucket issues such a behind-the-scenes update at point T1, and it completes at T2, then all local changes between T1 and T2 should be included in the update. I really want these strong timing guarantees, and I would appreciate any advice.

Q2: Suppose I have 3 nodes with the delaying optimization and a distributed bucket with capacity 10. All 3 get 5 requests in quick succession, successfully serve them locally (tryConsume(1), for example), and then begin to sync. Will the final token value be 0 or -5 (negative 5)? The latter is sometimes desirable for me; it would require every tryConsume-style request to be sent for synchronization as a force-consume. Is this doable with a simple source-code modification? Alternatively, maybe I'm solving the wrong problem. What really concerns me is the following scenario:
A workaround that looks decent is to refill such buckets rarely and instantly to full capacity. That way, if the sync works reasonably fast, the attacker wouldn't be able to spend all 1000 tokens before the sync arrives and cuts him off. I believe the solution I described at the beginning would be perfect, and I would appreciate either advice on implementing it or an alternative that helps with the underlying problem.

Q3: The documentation states that AsyncBucketProxy is not a cheap object when created with optimizations (quite obviously), so I cache one for each remote bucket. However, sometimes I will have to update the configuration on my server nodes, not necessarily at exactly the same time. What happens if two nodes hold AsyncBucketProxy instances with different configurations and try to synchronize them via the cache cluster? I know there are merging strategies for distributed caches, but maybe the library already takes care of this nicely and I don't have to overthink it.
Replies: 1 comment 6 replies
-
@muldrik hello,

A1:
Currently there are no optimizations that allow operating fully asynchronously. All the currently implemented optimizations try to reduce the number of remote requests, but when conditions dictate that a sync needs to be done, the sync is always performed in the scope of the user's thread.
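A fully asynchronous variant would have to hand the sync off to a background executor and answer from local state immediately. Roughly like this (illustrative names, not Bucket4j's real Optimization SPI; the remote call itself is elided):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch: tryConsume is answered from local state, while the remote sync
// runs in a background executor so the caller never waits for it.
public class BackgroundSyncBucket {
    private final ExecutorService syncExecutor = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r);
        t.setDaemon(true); // don't keep the JVM alive for background syncs
        return t;
    });
    private final AtomicBoolean syncInFlight = new AtomicBoolean();
    private long localTokens;
    private long unsyncedConsumed; // local changes to merge on the next sync

    public BackgroundSyncBucket(long initialTokens) {
        this.localTokens = initialTokens;
    }

    public synchronized boolean tryConsume(long n) {
        boolean allowed = localTokens >= n;
        if (allowed) {
            localTokens -= n;
            unsyncedConsumed += n;
        }
        // trigger the sync behind the scenes, at most one at a time
        if (unsyncedConsumed > 0 && syncInFlight.compareAndSet(false, true)) {
            syncExecutor.submit(this::syncWithCluster);
        }
        return allowed; // always the instant, local answer
    }

    private void syncWithCluster() {
        long toReport;
        synchronized (this) {
            toReport = unsyncedConsumed;
            unsyncedConsumed = 0;
        }
        try {
            // imagine the remote request here: merge 'toReport' consumed tokens
            // into the cluster-wide state and refresh localTokens from the reply
        } finally {
            syncInFlight.set(false);
        }
    }
}
```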
Optimization is an interface, so feel free to implement your own. I would recommend starting from https://github.com/bucket4j/bucket4j/blob/master/bucket4j-core/src/main/java/io/github/bucket4j/distributed/proxy/optimization/delay/DelayOptimization.java; it already inherits from BatchingOptimization, so you have a guarantee that your code will be executed from one thread. I can create a mock-up for you; all you would need to do is implement the background call in some executor (and test it). It should not take more than an hour on my side.

A2:
Yes, the available tokens will become negative on all client nodes (as well as on the server nodes) after synchronization. A negative amount of available tokens is a normal case for the Bucket4j math model; in fact, I would call it a killer feature. It was initially introduced to support BlockingBucket, in order to protect a parked thread from situations where other threads steal the requested tokens while it is parked waiting for the refill deficit.

A3:
The proxy should not be created on a per-request basis, because the optimizations group requests on a particular bucket instance. In reality an optimized bucket is not that costly; I suppose around 200 bytes, and you can estimate it by investigating a heap dump. But as described above, create->call->forget is a useless strategy for optimized buckets.
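In other words, keep one long-lived proxy per bucket key, along these lines (a sketch; the type parameter and createProxy function stand in for AsyncBucketProxy and your actual builder call):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// One long-lived proxy per bucket key, so the optimization can keep
// grouping requests on the same instance. 'P' stands in for AsyncBucketProxy.
public class ProxyCache<P> {
    private final ConcurrentMap<String, P> proxies = new ConcurrentHashMap<>();
    private final Function<String, P> createProxy; // your builder call goes here

    public ProxyCache(Function<String, P> createProxy) {
        this.createProxy = createProxy;
    }

    public P get(String bucketKey) {
        // computeIfAbsent builds the proxy at most once per key
        return proxies.computeIfAbsent(bucketKey, createProxy);
    }
}
```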
Bucket4j uses a unique architectural solution for configuration conflict resolution. Most libraries store the configuration on the client nodes while the state of the bucket lives in the storage; this can lead to unresolvable problems when the client configuration is not compatible with the persisted state. In contrast to the mainstream approach, Bucket4j stores both the state of the bucket and its configuration in the storage together, so a single call always observes the state together with the configuration it was produced under, and a conflict between them can be detected and resolved right there.
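The idea can be sketched as one atomic entry holding both pieces, so a single compare-and-swap sees the configuration and the state together (a toy model, not the real persisted format; the migration rule here is illustrative):

```java
import java.util.concurrent.atomic.AtomicReference;

// Toy model: configuration version and token state live in one stored entry,
// so every request observes (and can migrate) both atomically.
public class StoredBucket {
    record Entry(long configVersion, long capacity, long tokens) { }

    private final AtomicReference<Entry> storage;

    public StoredBucket(long configVersion, long capacity) {
        this.storage = new AtomicReference<>(new Entry(configVersion, capacity, capacity));
    }

    public boolean tryConsume(long n, long clientConfigVersion, long clientCapacity) {
        while (true) {
            Entry cur = storage.get();
            Entry base = cur;
            if (clientConfigVersion > cur.configVersion()) {
                // the newer client config wins: migrate the persisted state first
                base = new Entry(clientConfigVersion, clientCapacity,
                        Math.min(cur.tokens(), clientCapacity));
            }
            if (base.tokens() < n) return false;
            Entry next = new Entry(base.configVersion(), base.capacity(), base.tokens() - n);
            if (storage.compareAndSet(cur, next)) return true;
        }
    }

    public long tokens() {
        return storage.get().tokens();
    }
}
```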
P.S. About the mock-up based on DelayOptimization: I suppose I will be able to provide it tomorrow.
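To make A2 concrete, the force-consume math can be modeled like this (a plain-Java toy, not the actual implementation): three nodes each locally served 5 requests against capacity 10, and each reports its consumption unconditionally during sync.

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy model of the math that allows negative available tokens:
// tryConsume refuses to go below zero, forceConsume does not.
public class NegativeTokens {
    private final AtomicLong tokens = new AtomicLong(10);

    public boolean tryConsume(long n) {
        long cur;
        do {
            cur = tokens.get();
            if (cur < n) return false;
        } while (!tokens.compareAndSet(cur, cur - n));
        return true;
    }

    public void forceConsume(long n) {
        tokens.addAndGet(-n); // may drive the count negative
    }

    public long available() {
        return tokens.get();
    }

    public static void main(String[] args) {
        NegativeTokens shared = new NegativeTokens();
        for (int node = 0; node < 3; node++) {
            shared.forceConsume(5); // each node syncs its 5 locally served requests
        }
        System.out.println(shared.available()); // 10 - 15 = -5
    }
}
```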