Skip to content
This repository has been archived by the owner on Mar 31, 2024. It is now read-only.

Various points of feedback #42

Open
francislavoie opened this issue Jan 14, 2023 · 3 comments
Open

Various points of feedback #42

francislavoie opened this issue Jan 14, 2023 · 3 comments

Comments

@francislavoie
Copy link

So I was just doing some thinking about how the storage is designed here, and I have a few points of feedback.

  • I've been playing with RedisInsight lately and it has an option for rendering a value as JSON if the value is valid JSON. With the default value_prefix currently of caddy-storage-redis, that breaks that functionality because it just straight up adds text to the front of the JSON, making it invalid unless stripped.

    What's the reasoning behind value_prefix? Why would the value need a prefix? Scanning is done by key, so that shouldn't be relevant, I would think.

    Unfortunately, simply changing this default would be a breaking change of course because then everyone's data wouldn't be read correctly anymore. But the lib could be changed to continue stripping the prefix if it exists, but start storing values without a prefix.

  • Currently, SCAN is used to find all keys by pattern. That's quite efficient compared to something like KEYS, but the problem arises when the same Redis DB is used for a lot of other data. That means scanning the entire key-set becomes much more expensive. For example, my DB currently has around 4 million keys in it (99.5% of it is not Caddy TLS data). Scanning the entire thing can take up to a few seconds.

    I realize that choosing a different DB number would be a solution for this. But migrating from one DB to another is not viable right now because we don't have any export/import tooling yet to do this by hand.

    One thing that would improve throughput is increasing the SCAN count from 100 to something like 1000 or 10000 which could crunch through that many much faster.

    Even better though would be to keep a SET, HSET or ZSET of all the keys managed by this plugin, so then you can do an SSCAN, HSCAN or ZSCAN to only look at the relevant entries and avoid looking at any data that isn't managed by the plugin via SCAN.

    Using a SET would just be a listing of all the keys managed by the lib, to be scanned with SSCAN instead. Simple solution to faster scanning.

    Using a ZSET would be the same idea as a SET, but it would also mean that the score could be used to store the modification time of the key. That could be useful for purging data that's too old for example with ZRANGEBYSCORE or something like that. It's useful for implementing a custom expiry mechanism. Not sure if that's actually beneficial right now though.

    Using an HSET would mean that the actual values could be stored in the hash set directly instead of in individual keys. HSCAN would still allow finding keys easily by pattern. This would probably be a decent simplification of the storage as well, because the key prefix would only apply to a single key instead of every key used by the lib, and then the keys inside the HSET would not need to be prefixed.

    Migrating to any of these storage patterns would require a bit of code to migrate it on upgrade of course. This could be pretty simple; at startup, check if the chosen set type's key exists (which is a O(1) lookup). If it doesn't, do a SCAN to find all the keys, move them into the set (and if HSET is chosen, delete all the existing storage keys).

@gamalan
Copy link
Owner

gamalan commented Jan 17, 2023

@francislavoie Thanks for the feedback. I know there are some bad design, because at first I just do it because I need some storage which available for my team to use.

@francislavoie
Copy link
Author

Oh no worries at all.

I'm bringing this up because I've been more heavily working with Redis with an app of mine and I decided to review how this plugin does this with that fresh perspective.

I'm thinking of making the changes I suggest in a PR if you're open to it, wanted to see if you have any opinions on those points before I get started on it.

@gamalan
Copy link
Owner

gamalan commented Jan 21, 2023 via email

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants