New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
python-bindings|prover_disk|tests: Add pickle
support for DiskProver
#303
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the new constructor should validate its inputs, and have tests ensuring it rejects invalid inputs correctly. (I.e. we shouldn't trust the pickled file to be uncorrupted or trustworthy)
fc8b780
to
df9990d
Compare
I think it's a bit weird to lock the mutex in the move constructor, just like in the destructor. Another thread better not have a reference to the object as it's about to be destructed. If it does, locking the murex won't help. Some of the getters seems to copy the vectors a bit gratuitously as well. |
My comment in the G1 pickle support applies here too. Can we rely on pickle being a stable format across python versions? We need to ensure we don't accidentally break backwards compatibility. |
Hmm better not yeah but how to prevent it? If we don't lock it in the move constructor we could end up moving the object while another thread has the mutex locked i.e. its still fetching a proof or so (relies on the data we are about to move), no?
The soon coming change in
which contains the |
My point is that locking the mutex won't help in that case. We have UB either way. So, we could assert that the mutex isn't locked maybe, but fundamentally, I think we just have to assume the users of this class aren't doing shady stuff with it.
If the pickle file format changes, you won't ever get to see the version number. Using our Streamable format may be an alternative, but that would create a dependency on the chia-blockchain repo which would be a hassle. |
I mean, technically there is still a possible case where its no UB if we lock the mutex like if we have two threads and we trigger stop for both, one of them has currently the mutex locked and the other wants to move the object to a shutdown thread or whatever idea one might have, right? But if we don't lock the mutex this case would be an UB. Not that this is super likely but its there and a reason to just lock it unless i miss something.
You can still always use older protocols if there will be a non-backwards compatible protocol bump at some point.
Thats what i started with... but not to add |
df9990d
to
44494f8
Compare
This pull request introduces 1 alert when merging 44494f8 into 97aa105 - view on LGTM.com new alerts:
|
44494f8
to
039e3b6
Compare
be439c7
to
6c3d45e
Compare
I think the Generally, classes should avoid using 2-phase initialization. If a constructor fails to establish the class invariant, it should fail. That way, all instances of a class are known good, no half-initialized objects. It seems the reason why you want the The modern alternative to 2-phase initialization is to use exposing |
8f3bee4
to
f01ba97
Compare
@arvidn I restructured the commit history a bit and put some stuff into separate commits.
Unfortunately it seems like its not possible to use
I refactored it, see b395860. Note that the alternative constructor still throws the same way like it did before but the new |
and a `is_valid` binding. Allows to validate if the prover's file exists and matches expectations.
f01ba97
to
016810c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear to me how or why one would use the Validate()
or is_valid()
functions. It would seem simpler to just attempt to look up a proof and have it fail. It's fine to defer the failure until we actually need to do the lookup, right?
src/prover_disk.hpp
Outdated
@@ -47,74 +47,52 @@ struct plot_header { | |||
// The DiskProver, given a correctly formatted plot file, can efficiently generate valid proofs | |||
// of space, for a given challenge. | |||
class DiskProver { | |||
const size_t nTableBeginPointerCount{11}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's a good idea to add an immutable member like this. This makes DiskProver
objects not movable.
Did you mean to make this static
?
Also, I don't think we should use hungarian notation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes
DiskProver
objects not movable.
What do you mean? Maybe not by default but its still movable 🤷♂️ However, i agree, it makes sense to have it static
, see aa66d3a.
Also, I don't think we should use hungarian notation.
I dropped it in 96f0445 to stay consistent with the codebase, but whats your point here? I'm just curious because i'm used to always use it for primitives or standard container types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe there is widespread agreement that encoding type information in variable names is undesirable, as documented in the cpp core guidelines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean? Maybe not by default but its still movable
Right, I noticed later that you implement the move constructor explicitly, and just don't touch this member.
src/prover_disk.hpp
Outdated
for (uint32_t i = 0; i < c2_entries - 1; i++) { | ||
SafeRead(disk_file, c2_buf, c2_size); | ||
this->C2.push_back(Bits(c2_buf, c2_size, c2_size * 8).Slice(0, k).GetValue()); | ||
uint32_t nExpectedSize = ((this->table_begin_pointers[10] - this->table_begin_pointers[9]) / (Util::ByteAlign(k) / 8)) - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be const
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See 94cad1a
} else { | ||
throw std::invalid_argument("Invalid plot file format"); | ||
if (k < kMinPlotSize || k > kMaxPlotSize) { | ||
throw std::invalid_argument("Invalid k: " + std::to_string(k)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the comment above this constructor suggests the inputs are not validated, but the seem to be, at least to some extent. Are there some aspects left that aren't validated, or is the comment outdated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does not say inputs are not validated. It says the inputs are not validated against the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
src/prover_disk.hpp
Outdated
std::vector<uint64_t> table_begin_pointers; | ||
std::vector<uint64_t> C2; | ||
|
||
void Read(std::vector<uint8_t>& id_in, std::vector<uint8_t>& memo_in, uint8_t& k_in, std::vector<uint64_t>& table_begin_pointers_in, std::vector<uint64_t>& C2_in) const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the names of these parameters confuse me. The _in
suffix suggests they are inparameters, but they all appear to be output parameters. Did you consider returning them in a tuple, as multiple return values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like a tuple would make this more confusing than better. See a9ca083.
src/prover_disk.hpp
Outdated
SafeRead(disk_file, memo_in.data(), memo_in.size()); | ||
|
||
table_begin_pointers_in = std::vector<uint64_t>(nTableBeginPointerCount, 0); | ||
C2_in = std::vector<uint64_t>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only benefit I can think of, of passing these vectors in as out-parameters is that they can reuse memory they may already have allocated. However, when you assign them like this, rather than .resize(0)
, you defeat that on benefit.
I would think, though, that these allocations are cheap anyway, and that it would make sense to return them instead of taking them as out-parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't use the resize
approach here because i wanted to save one line 😄 (setting the contents or at least the first element of table_begin_pointers_out
to zero) and it obviously doesn't really matter given the size of this vectors.
std::vector<uint64_t> table_begin_pointers; | ||
std::vector<uint64_t> C2; | ||
|
||
void Read(std::vector<uint8_t>& id_in, std::vector<uint8_t>& memo_in, uint8_t& k_in, std::vector<uint64_t>& table_begin_pointers_in, std::vector<uint64_t>& C2_in) const | ||
{ | ||
std::lock_guard<std::mutex> lock(_mtx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the mutex didn't used to be locked when reading this, as far as I can tell. This is only called from the filename constructor, right? I see it's also called from Validate()
. It looks like this mutex is unneccessary though, but we can leave that for a future patch.
C2_in.push_back(Bits(c2_buf, c2_size, c2_size * 8).Slice(0, k).GetValue()); | ||
} | ||
|
||
delete[] c2_buf; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this leaks if SafeRead()
throws an exception. This was an issue before your patch, I just want to make a note to remember to fix it later
I'm using it to drop invalid provers after cache loading, see Chia-Network/chia-blockchain@3b0b971#diff-1c950ad1eb2a79b51cb4f61cf39b8fe8d09d28ad963ce7dd24107053ff316d14R116 |
It would seem to me that doing that would defeat the purpose of caching these entries in the first place. If we're reading it back from the files unconditionally, to validate it, couldn't we just do that to begin with, and not also read the pickled cache? Perhaps I'm missing something |
is this PR abandoned now? |
Yes! |
This is needed for plots caching in
chia-blockchain
.Based on #302