-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-35049][state] Implement Map Async State API for ForStStateBackend #24812
base: master
Are you sure you want to change the base?
Conversation
480b425
to
02c4d45
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this PR! I'm still reviewing and will put comments incrementally.
@@ -31,13 +32,19 @@ | |||
*/ | |||
public class ForStStateRequestClassifier implements StateRequestContainer { | |||
|
|||
private final StateRequestHandler stateRequestHandler; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems unused?
* | ||
* @param <K> The type of key in original state access request. | ||
*/ | ||
public class ForStDBBunchPutRequest<K> extends ForStDBPutRequest<ContextKey<K>, Map<?, ?>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems this extension lacks practical meaning......
I'd suggest we re-organize the interfaces. ForStWriteBatchOperation
manipulating some new introduced ForStSingleWriteOperation
, which is wrapped and iterative given by ForStWriteBatchOperation
and ForStDBPutRequest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the logic of interacting with the db
to the ForStDBPutRequest
and ForStDBBunchPutRequest
, to avoid the classifications in ForStWriteBatchOperation
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @fredia, thanks for the PR. I left several comments, but I am still beginner of rockdb, please correct me if I am wrong :-)
public ForStDBBunchPutRequest( | ||
ContextKey<K> key, Map value, ForStMapState table, InternalStateFuture<Void> future) { | ||
super(key, value, table, future); | ||
Preconditions.checkArgument(table instanceof ForStMapState); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this check seems redundant.
if (!find) { | ||
byte[] newBytes = new byte[len + 1]; | ||
System.arraycopy(bytes, 0, newBytes, 0, len); | ||
newBytes[len] = 1; | ||
return newBytes; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my understanding, if the byteArr
is maximum value. The next should like this:
if (!find) { | |
byte[] newBytes = new byte[len + 1]; | |
System.arraycopy(bytes, 0, newBytes, 0, len); | |
newBytes[len] = 1; | |
return newBytes; | |
} | |
if (!find) { | |
byte[] newBytes = new byte[len + 1]; | |
System.arraycopy(bytes, 0, newBytes, 0, len); | |
newBytes[len] = Byte.MIN_VALUE + 1; | |
return newBytes; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion, after reading the code in rocksdb, I think newBytes[len] = Byte.MAX_VALUE
is enough.
inline int Slice::compare(const Slice& b) const {
assert(data_ != nullptr && b.data_ != nullptr);
const size_t min_len = (size_ < b.size_) ? size_ : b.size_;
int r = memcmp(data_, b.data_, min_len);
if (r == 0) {
if (size_ < b.size_)
r = -1;
else if (size_ > b.size_)
r = +1;
}
return r;
}
this.key = key; | ||
this.table = table; | ||
this.future = future; | ||
this.toBoolean = toBoolean; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
toBoolean
seems not very clear. I would suggest to add ForStDBExistRequest
for this kind of operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍,I added ForStDBMapCheckRequest
for this kind of operation.
if (request instanceof ForStDBBunchPutRequest) { | ||
ForStDBBunchPutRequest<?> bunchPutRequest = | ||
(ForStDBBunchPutRequest<?>) request; | ||
byte[] primaryKey = bunchPutRequest.buildSerializedKey(null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I cannot understand this line, why serialize null
here? Why the Bunch remove
and Single remove
are using different serialization method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For map state, there are two types of key: primary key and user key.
Logically it is organized like this:
<primary key, <user key, user value>
We concatenate primary key and user key together as the key of rocksdb, and physically organized like this:
< <primary key - user key> , user value>
For Bunch remove
, we will remove all user keys under one primary key, so we use the prefix to match all keys.
For Single remove
, we use the bytes of <primary key
-user key
> to match one specific key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR.
PTAL my comments.
ForStDBBunchPutRequest<?> bunchPutRequest = | ||
(ForStDBBunchPutRequest<?>) request; | ||
byte[] primaryKey = bunchPutRequest.buildSerializedKey(null); | ||
byte[] endKey = ForStDBBunchPutRequest.nextBytes(primaryKey); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may missed something.
What's the logic of calculating and using deleteRange
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We concatenate primary key and user key together as the key of rocksdb, and physically organized like this:
< <primary key1 - user key1> , user value1>
< <primary key1 - user key2> , user value2>
< <primary key1 - user key3> , user value3>
< <primary key1 - user key4> , user value4>
Here, we use primary key as a prefix to calculate an interval to leverage deleteRange
.
But too much deleteRange
may affect the read performance, and we will consider changing it iterator.delete later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found that deleteRange
may be ambiguous in some cases, such as when the primary key is an array of Byte. MAX_VALUE
, so I changed it to iterator and delete one by one.
|
||
private ForStDBGetRequest(K key, ForStInnerTable<K, V> table, InternalStateFuture<V> future) { | ||
private final boolean toBoolean; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this have a subclass to make the structure more clear ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍I added ForStDBMapCheckRequest
for this kind of operation.
executor.execute( | ||
() -> { | ||
// todo: config read options | ||
try (RocksIterator iter = db.newIterator(request.getColumnFamilyHandle())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic seems a bit complex.
Could you add some descriptions in some key steps or split them into methods ?
} | ||
|
||
public boolean valueIsNull() { | ||
return value == null; | ||
} | ||
|
||
public boolean valueIsMap() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as GetRequest
.
Maybe we could have a more clear structure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the logic of interacting with the db to the ForStDBPutRequest
and ForStDBBunchPutRequest
, valueIsMap
is deleted now.
request.completeStateFuture(stateIterator); | ||
} catch (Exception e) { | ||
LOG.warn("Error when process iterate operation for forStDB", e); | ||
future.completeExceptionally(e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remember to completeExceptionally
for InternalStateFuture
.
dbPutRequests.add(forStMapState.buildDBBunchPutRequest(stateRequest)); | ||
return; | ||
} | ||
case MAP_IS_EMPTY: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Same as MAP_GET
?
What is the purpose of the change
This PR implements Map Async State API for ForStStateBackend.
Brief change log
ForStDBIterRequest
andForStIterateOperation
forMap#iterator
ForStDBBunchPutRequest
forMap#putAll
andMap#clear
ForStDBPutRequest
forMap#put
, andForStDBGetRequest
forMap#value
Verifying this change
This change added tests and can be verified as follows:
ForStDBIterateOperationTest
ForStDBOperationTestBase
.Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: ( no)Documentation