New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
plotting|tests: Cache parsed plot memos in PlotManager
#8009
Conversation
45bfcbe
to
358551b
Compare
358551b
to
3b0b971
Compare
PlotManager
PlotManager
PlotManager
PlotManager
3b0b971
to
02e245d
Compare
PlotManager
PlotManager
02e245d
to
3d93a5c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not an expert at what's actually going on here, but the patterns of changes seem reasonable.
Instead of saving cache only on shutdown, how hard would it be to copy the cache to disk every time it changes? That would decrease start-up time after a crash or reboot (which may actually happen more often than orderly shutdowns, since users tend to keep these processes long-running). |
3d93a5c
to
46cf4f5
Compare
@richardkiss Good points, i moved the file into |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not obvious to me that it makes sense to save the cache on a timer, regardless of whether a new plot was added or not. Also, ideally, timer logic like this ought to be isolated in its own class to make it easy to unit test (right now I don't think there are any tests that the regular cache-saving timer works as it's supposed to).
Rather than factoring it out and writing unit tests though, I think it could be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not obvious to me that it makes sense to save the cache on a timer, regardless of whether a new plot was added or not. Also, ideally, timer logic like this ought to be isolated in its own class to make it easy to unit test (right now I don't think there are any tests that the regular cache-saving timer works as it's supposed to).
There is at least a test that the mechanism itself works here, but true, not that it works repeatedly. I don't think we have a way to mock the time yet, at least i didn't find something last time i checked for it and i feel this makes test like this either inefficient or flaky.
Rather than factoring it out and writing unit tests though, I think it could be removed.
I'm not opposed to remove it if you really think its an issue.
Does this time interval check provide any utility? As far as I can tell, as the code looks right now (but I may be misreading it). it saves the cache if both of these conditions are met:
I would expect one problematic case to be when we load more than one batch worth of plot files, we only save the cache after the first batch, all remaining batches may not trigger the cache to be saved if they happen soon enough. You can lower the time interval to guarantee that it always takes longer to load a batch, but then you have defeated the purpose of having the timeout. Or you can hoist the check out to at the end of loading all batches, but then you also defeat the point of the timeout, since the refresh interval will make it always have "expired" by the time we look for more plot files. I would think the ideal behavior is to not save the cache at all until we're done with all batches. |
The idea of saving the cache based on the timer in the inner loop not only after it was to cache between the batches during long running plot loadings of large farms.
This is correct. The first condition is to avoid saving the cache for no reason, the second to at least save it now and then.
Yeah right i had this in mind but felt it will eventually trigger a cache save if at some point a plot flows in or the process gets shutdown and if it ever crashes without a cache save still nothing bad happens. To mitigate this edge case i could just add an:
Below the cache cleanup in the outer loop. Note that we check the |
But if the second condition returns |
Yeah right they would only be saved as soon as the next plot comes in or the shutdown gets triggered. How about the last two commits? |
That's a problem, right? I still don't understand why this feature need to exist. As far as I can tell it only causes issues without providing any utility. The best time to save the cache is at the end of a full refresh of plots, if we found new plot files, right? Even with your new patches this is a problem. |
This comment is misleading. As far as I can tell from inspecting the code, this configuration option is used to prevent saving the cache any more frequent than this. But, again, I don't see the value of having this to begin with. |
780fd3e
to
a85f0e8
Compare
Why? The cache stays changed, the time stays exceeded as long as we don't save the cache and the refresh will happen which will eventually trigger the write. Last push added the same statement below the inner loop to cover a potential edge case where it would be delayed for one refresh interval.
Like if your system takes a very long time to load all plots you can avoid reloading from scratch after a crash with this. I agree it sounds like this will not happen very frequently and it doesn't absolutely need to exist at the end. I just don't yet understand what issues/risk you see with this included. |
If we don't find any more plot files ever, it will still trigger at the end of the refresh because of this, that's what you mean, right? The problem is that you then have a whole refresh interval where parts of the cache stays unsaved. This happens every time you load more plot files than one batch-size, right? So, presumably quite often.
I would think that to be a lot more uncommon than the issue with not saving the last few batches. But most importantly, there's a lot of complexity that doesn't seem to carry its own weight. |
Either you or i misunderstand something here. The condition
Like i said above, i still don't understand where your "not saving the last few batches" understanding comes from and to be honest, two extra checks doesn't feel like "lot of complexity" to me. |
you're referring to this call to
You say "no matter what", but it's only saved if
Meaning that having added something to the cache isn't sufficient for this to return
Exactly, that's what I'm referring to as "the problem". It is a problem because it's suboptimal and caused by a bunch of complexity that I think we should just remove. consider the following scenario:
In this scenario, the cache from batch 2 won't be saved until the next refresh interval. Between step 4 and 5 you could also insert loading more batches, and all of them would end up in this unsaved state until the next refresh.
I'm not primarily referring to the two extra checks when I say "complexity". I'm primarily referring to the additional state of the timestamp, and the scope of that state. But also the emergent behavior of those two extra checks combined with the state. If my reading of the code (as I described above) is right, that's not obvious from reading the code. Even the state used to record that something was added or removed from the cache could be local variables in |
this is the code I think should be removed:
|
chia/plotting/manager.py
Outdated
def load_cache(self): | ||
try: | ||
self.cache.load_from(self.cache_path()) | ||
except Exception as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should catch FileNotFoundError
here, and not print anything to the log. Otherwuse, the first time you start you'll get this in the log:
09:23:15 chia.plotting.manager: ERROR Failed to load cache: [Errno 2] No such file or directory: '/private/var/folders/s_/_vk8b8gj6x5_d1p2vpr5r5gh0000gn/T/tmpq5lc9wup/cache/plot_manager.dat', Traceback (most recent call last):
File "/Users/arvid/Documents/dev/chia-blockchain/chia/plotting/manager.py", line 170, in load_cache
self.cache.load_from(self.cache_path())
File "/Users/arvid/Documents/dev/chia-blockchain/chia/plotting/manager.py", line 87, in load_from
serialized = path.read_bytes()
File "/opt/homebrew/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pathlib.py", line 1249, in read_bytes
with self.open(mode='rb') as f:
File "/opt/homebrew/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pathlib.py", line 1242, in open
return io.open(self, mode, buffering, encoding, errors, newline,
File "/opt/homebrew/Cellar/python@3.9/3.9.5/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pathlib.py", line 1110, in _opener
return self._accessor.open(self, flags, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/private/var/folders/s_/_vk8b8gj6x5_d1p2vpr5r5gh0000gn/T/tmpq5lc9wup/cache/plot_manager.dat'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good one, added!
a85f0e8
to
84693fb
Compare
@arvidn Okay i see your points, i still think its better to at least save here and than between the batches than just only save at the end but i agree thats probably not required in many cases and the logic wasn't super consistent. We actually could have just change the outer check to:
and leave the rest so that we would not have missed out any batches but still saved between if required but i just dropped it now 😄 I squashed everything and moved some more logic related to saving from |
84693fb
to
4dbf7ef
Compare
In a future patch: to address the issue of loading plots taking so long that we may want to save the cache intermittently, I think a better approach would be to have a local counter in that loop that triggers a save every X batch. It could be based on time too, although I would think every X batch would be good enough and simpler. Either case, it could still be a local variable for that loop. |
data: List[Tuple[bytes32, CacheEntry]] | ||
|
||
|
||
class Cache: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you have an opportunity now to unit test this class
Will this change be reflected in 1.2.6 or a later patch? |
4dbf7ef
to
eb8a244
Compare
Rebased on
Most likely 1.2.6. |
This introduces plot memo caching in
PlotManager
which means the harvester saves the parsed plot memo on disk on shutdown and loads that back into memory on startup so that it can skip key parsing/calculations for all already known plots.On my test system with ~1700 plots i get the following times:
With #8275 applied: