Feature Request - TTL / auto deletions and reloading #172

Open
unibum opened this issue Dec 27, 2020 · 1 comment


unibum commented Dec 27, 2020

Hi,

Thinking of 3 scenarios which could benefit from a time-to-live (TTL) type solution. Apologies in advance if the terminology isn't correct, but hopefully you get the intent.

Scenario 1: Something as simple as needing Noria to delete anything not accessed in the last x (configurable) days.
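
To make Scenario 1 concrete, here is a minimal sketch of a last-access TTL over a plain in-memory map. It is not Noria code: `TtlCache`, `evict_expired`, and the rest are hypothetical names, and the caller passes in the current time so that a periodic sweep can be driven from wherever is convenient.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical cache (not a Noria type) that remembers when each key was last read.
pub struct TtlCache<V> {
    ttl: Duration,
    entries: HashMap<String, (V, Instant)>,
}

impl<V> TtlCache<V> {
    pub fn new(ttl: Duration) -> Self {
        TtlCache { ttl, entries: HashMap::new() }
    }

    /// Store a value and stamp it with the time of insertion.
    pub fn insert(&mut self, key: String, value: V, now: Instant) {
        self.entries.insert(key, (value, now));
    }

    /// Read a value and refresh its last-access stamp.
    pub fn get(&mut self, key: &str, now: Instant) -> Option<&V> {
        let entry = self.entries.get_mut(key)?;
        entry.1 = now;
        Some(&entry.0)
    }

    /// Drop every entry not accessed within `ttl`; returns how many were removed.
    /// This scans all entries, so it is meant to run periodically, not per request.
    pub fn evict_expired(&mut self, now: Instant) -> usize {
        let before = self.entries.len();
        let ttl = self.ttl;
        self.entries
            .retain(|_, (_, last)| now.saturating_duration_since(*last) < ttl);
        before - self.entries.len()
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

A background job would call `evict_expired` on a timer; with the TTL expressed in days, a sweep every few minutes would likely be frequent enough.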

Scenario 2: A bit smarter. When Noria reaches x% of resources, e.g. has 5% or 1 GB of free memory remaining, it starts deleting old un-accessed data, e.g. in chunks of 100 (configurable) records at a time. This is simply trying to ensure Noria doesn't run out of resources, which would result in performance penalties as it creates/swaps between maps etc.
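
Scenario 2 could look roughly like the sketch below: once free memory drops under a configured floor, evict the least recently accessed records in fixed-size batches until there is enough headroom again. All names (`BoundedStore`, `relieve_pressure`, the logical `now` clock) are assumptions made for illustration, and sizes are tracked by hand rather than measured from the allocator.

```rust
use std::collections::HashMap;

/// Hypothetical record metadata: its size and a logical last-access timestamp.
pub struct Entry {
    pub bytes: usize,
    pub last_access: u64,
}

/// Hypothetical store with a fixed memory budget (not a Noria type).
pub struct BoundedStore {
    pub budget_bytes: usize,
    pub used_bytes: usize,
    pub entries: HashMap<String, Entry>,
}

impl BoundedStore {
    pub fn new(budget_bytes: usize) -> Self {
        BoundedStore { budget_bytes, used_bytes: 0, entries: HashMap::new() }
    }

    pub fn free_bytes(&self) -> usize {
        self.budget_bytes.saturating_sub(self.used_bytes)
    }

    /// Insert or overwrite a record, keeping the byte accounting consistent.
    pub fn insert(&mut self, key: &str, bytes: usize, now: u64) {
        let new = Entry { bytes, last_access: now };
        if let Some(old) = self.entries.insert(key.to_string(), new) {
            self.used_bytes -= old.bytes;
        }
        self.used_bytes += bytes;
    }

    /// Record an access so the entry counts as recently used.
    pub fn touch(&mut self, key: &str, now: u64) {
        if let Some(e) = self.entries.get_mut(key) {
            e.last_access = now;
        }
    }

    /// While free memory is below `min_free_bytes`, evict the oldest-accessed
    /// entries `batch` at a time. Returns the evicted keys in eviction order.
    pub fn relieve_pressure(&mut self, min_free_bytes: usize, batch: usize) -> Vec<String> {
        let mut evicted = Vec::new();
        while self.free_bytes() < min_free_bytes && !self.entries.is_empty() {
            let mut by_age: Vec<(u64, String)> = self
                .entries
                .iter()
                .map(|(k, e)| (e.last_access, k.clone()))
                .collect();
            by_age.sort();
            for (_, key) in by_age.into_iter().take(batch.max(1)) {
                if let Some(e) = self.entries.remove(&key) {
                    self.used_bytes -= e.bytes;
                }
                evicted.push(key);
            }
        }
        evicted
    }
}
```

Sorting every entry on each pass keeps the sketch short; a real implementation would more likely maintain an ordered structure (for example an LRU list) so that finding the oldest entries does not require a full scan.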

Scenario 3: A bit harder to explain. A follow-the-sun type load/drop/load solution. E.g. an e-commerce platform operating in multiple countries around the world. At midnight, when Country A is sleeping or under low load, drop its data from Noria and pre-load new data for Country B, which is about to come online. I have seen discussions about loading from the DB directly (Noria's RocksDB), as well as syncing data from another DB to Noria's RocksDB or reading from another DB directly. The trick is how to identify and reload any products newly listed in Country A during the e.g. 12 hours its data was out of Noria, which we would want loaded first, followed by the data dumped from Country A the day before, in most-recently-accessed-first priority. Should Noria run out of resources (memory), the oldest-accessed data gets loaded last, or not at all if memory is all consumed.

But to extend this scenario further, it is more realistic that several countries are online at once. So what I really want is a way to configure, per country or as a single generic time (e.g. midnight for that country), when data should be dropped, rather than a TTL of e.g. 18 hours since last access. For example, a user can add a new product for sale at 9pm, and we wouldn't want that to hang around in Noria for 18 hours when that country goes to sleep at midnight, 3 hours after the product was created. As a guess, that would mean some way of tagging a product as being from country X, Y or Z (or a 'worldwide' tag if it's a globally available or popular product that never gets deleted), plus a backend scheduler job that looks at config for if and when to drop the data, and when to reload it the next day.
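
The tagging-plus-scheduler idea above could be sketched as follows. Everything here is hypothetical (`Tag`, `Window`, `action_for`, and the choice to express drop and reload times as UTC hours); the point is only to show a per-country offline window that may wrap past midnight, with worldwide products exempt.

```rust
use std::collections::HashMap;

/// Hypothetical tag attached to each product.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Tag {
    Country(String),
    Worldwide,
}

/// Hypothetical per-country config: when to drop and when to reload, in UTC hours.
#[derive(Debug, Clone, Copy)]
pub struct Window {
    pub drop_at_utc_hour: u8,
    pub reload_at_utc_hour: u8,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    Keep,
    Drop,
}

/// True if `utc_hour` falls in [drop, reload), allowing the window to wrap midnight.
pub fn is_offline(w: Window, utc_hour: u8) -> bool {
    let d = w.drop_at_utc_hour % 24;
    let r = w.reload_at_utc_hour % 24;
    let h = utc_hour % 24;
    if d <= r {
        d <= h && h < r
    } else {
        h >= d || h < r
    }
}

/// What a scheduler job would decide for one product at the given hour.
/// Worldwide products and countries without a configured window are never dropped.
pub fn action_for(tag: &Tag, schedule: &HashMap<String, Window>, utc_hour: u8) -> Action {
    match tag {
        Tag::Worldwide => Action::Keep,
        Tag::Country(code) => match schedule.get(code) {
            Some(w) if is_offline(*w, utc_hour) => Action::Drop,
            _ => Action::Keep,
        },
    }
}
```

Because the decision depends on the clock rather than on last access, a product listed at 9pm is dropped at the same time as everything else from that country, which is the behaviour asked for above. Reloading in most-recently-accessed order would need access timestamps persisted alongside the dumped data, which this sketch does not cover.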

Contributor

jonhoo commented Dec 27, 2020

I agree that would be neat! Noria's current eviction model is simply to evict randomly from the largest views when memory approaches some threshold, but in a more production-oriented setting you'd want some more smarts like the above.
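
As a rough illustration of the eviction model described above (random eviction from the largest view once memory passes a threshold), here is a toy sketch. It is not Noria's actual implementation: `View`, `evict_random_from_largest`, and the small `XorShift` generator are all assumptions made to keep the example dependency-free.

```rust
use std::collections::HashMap;

/// Tiny xorshift generator so the sketch needs no external crates.
pub struct XorShift(pub u64);

impl XorShift {
    pub fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Toy stand-in for a materialized view: key -> serialized row.
pub type View = HashMap<u64, Vec<u8>>;

fn view_bytes(v: &View) -> usize {
    v.values().map(|row| row.len()).sum()
}

pub fn total_bytes(views: &[View]) -> usize {
    views.iter().map(view_bytes).sum()
}

/// While total size exceeds `limit_bytes`, remove one randomly chosen row
/// from whichever view is currently largest. Returns the number of rows evicted.
pub fn evict_random_from_largest(
    views: &mut [View],
    limit_bytes: usize,
    rng: &mut XorShift,
) -> usize {
    let mut evicted = 0;
    while total_bytes(views) > limit_bytes {
        let largest = match views.iter_mut().max_by_key(|v| view_bytes(v)) {
            Some(v) => v,
            None => break,
        };
        if largest.is_empty() {
            break;
        }
        // Pick a pseudo-random position among the view's keys.
        let idx = (rng.next() % largest.len() as u64) as usize;
        let key = *largest.keys().nth(idx).expect("idx is below len");
        largest.remove(&key);
        evicted += 1;
    }
    evicted
}
```

Random choice needs no per-entry bookkeeping, which is what makes it cheap; the access-aware policies requested in this issue would need last-access metadata on every entry.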

Now, implementing a TTL specifically is a little tricky since you'd need something to regularly check all the TTLs for all the entries in all the views, but in theory it's doable. I'm no longer actively working on Noria now that I've graduated, but maybe someone will pick this up and implement it :)
