New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Resolve #1398: Online Indexer: support multi target indexing #1399
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, went through these changes. The changes requested are non-trivial, but the basic theme is basically that I think that there's an opportunity here to collapse some special cases (and remove some code dupe) if we do some refactoring. One of the more important places where I'd like to see that is that I think there's a fair amount of overlap between the MultiTypeIndexingByRecords
and the IndexingByRecords
classes, and it would probably be good to clean that up, given that the code being duplicated is non-trivial. But that might be a bit of an involved change, so I'm not sure how we want to go about doing that
...yer-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/IndexingBase.java
Outdated
Show resolved
Hide resolved
...yer-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/IndexingBase.java
Outdated
Show resolved
Hide resolved
...yer-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/IndexingBase.java
Outdated
Show resolved
Hide resolved
...yer-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/IndexingBase.java
Outdated
Show resolved
Hide resolved
...er-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/OnlineIndexer.java
Show resolved
Hide resolved
...er-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/OnlineIndexer.java
Show resolved
Hide resolved
...er-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/OnlineIndexer.java
Outdated
Show resolved
Hide resolved
...t/java/com/apple/foundationdb/record/provider/foundationdb/OnlineIndexerMultiTargetTest.java
Outdated
Show resolved
Hide resolved
...t/java/com/apple/foundationdb/record/provider/foundationdb/OnlineIndexerMultiTargetTest.java
Show resolved
Hide resolved
I think that at some point, after sufficient testing, we should refactor the |
I think my inclination is that we should do that sooner rather than waiting for a later refactoring, given the complexity of the classes involved and the level of code duplication in the code currently. I suppose there's an argument against that if it gets multi-targeting in faster (and we really want the performance gains), but I think we'd rather do the refactoring now if possible. |
b05d481
to
3c59460
Compare
...yer-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/IndexingBase.java
Outdated
Show resolved
Hide resolved
...yer-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/IndexingBase.java
Outdated
Show resolved
Hide resolved
@@ -192,7 +187,7 @@ private Index getSourceIndex(RecordMetaData metaData) { | |||
|
|||
return iterateRangeOnly(store, cursor, | |||
this::getRecordIfTypeMatch, | |||
lastResult, hasMore, recordsScanned) | |||
lastResult, hasMore, recordsScanned, true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is using snapshot isolation essential to scanning by an index?
This is sort of a theme here (in multiple comments on this PR), but it seems like this change is introducing code in a few places (here included) where assumptions about what features are and are not supported are being baked in in sort of subtle ways, and I'm concerned that they will be hard to disentangle in the future, which is why I'm bringing them up.
...-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/IndexingByIndex.java
Show resolved
Hide resolved
...r-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/IndexingCommon.java
Outdated
Show resolved
Hide resolved
...n/java/com/apple/foundationdb/record/provider/foundationdb/IndexingMultiTargetByRecords.java
Outdated
Show resolved
Hide resolved
...er-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/OnlineIndexer.java
Outdated
Show resolved
Hide resolved
...er-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/OnlineIndexer.java
Show resolved
Hide resolved
...n/java/com/apple/foundationdb/record/provider/foundationdb/IndexingMultiTargetByRecords.java
Show resolved
Hide resolved
} | ||
|
||
@Nonnull | ||
private CompletableFuture<Boolean> buildRangeOnly(@Nonnull FDBRecordStore store, byte[] startBytes, byte[] endBytes, @Nonnull AtomicLong recordsScanned) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit of reiterating but: I think I'm skeptical that we want to take on the debt of having two diverging implementations here for the benefit of not disturbing the existing implementation. If we're not confident in our testing, then it seems like we should think about adding additional tests.
...t/java/com/apple/foundationdb/record/provider/foundationdb/OnlineIndexerMultiTargetTest.java
Show resolved
Hide resolved
...t/java/com/apple/foundationdb/record/provider/foundationdb/OnlineIndexerMultiTargetTest.java
Show resolved
Hide resolved
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for Linux CentOS 7
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that synthetic type support was added, and I think the record type selection now looks right to me (i.e., records only go to indexes of the right type now). Most of the things I've left comments on are more stylistic, though it also sounded like you might have another set of tweaks you wanted to make here
...yer-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/IndexingBase.java
Outdated
Show resolved
Hide resolved
...yer-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/IndexingBase.java
Outdated
Show resolved
Hide resolved
...yer-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/IndexingBase.java
Outdated
Show resolved
Hide resolved
...r-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/IndexingCommon.java
Outdated
Show resolved
Hide resolved
...r-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/IndexingCommon.java
Show resolved
Hide resolved
...t/java/com/apple/foundationdb/record/provider/foundationdb/OnlineIndexerMultiTargetTest.java
Outdated
Show resolved
Hide resolved
...yer-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/IndexingBase.java
Outdated
Show resolved
Hide resolved
...yer-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/IndexingBase.java
Outdated
Show resolved
Hide resolved
} | ||
// Index is built because there is no missing range. | ||
System.out.println(("marking readable:" + index.getName())); | ||
return store.markIndexReadable(index) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this method has always been like this, but it's a bit sketch. The markIndexReadable
method already checks the range set before it marks something as readable. Given that the old method also did this, this is probably fine, but it's worth being aware of.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++
...yer-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/IndexingBase.java
Outdated
Show resolved
Hide resolved
c362f7b
to
e93c021
Compare
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for Linux CentOS 7
|
6159c8a
to
ccb98c3
Compare
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for Linux CentOS 7
|
Interesting. Not sure I've seen this test failure before:
It seems unlikely that this is caused by this change, though, and it's probably a flaky test somehow. Strange that we haven't seen it before |
AWS CodeBuild CI Report for Linux CentOS 7
|
|
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for Linux CentOS 7
|
4ea072e
to
95abfc3
Compare
Rebased. |
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for Linux CentOS 7
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few minor things remaining, but things are largely shaping up.
I think the "fundamental disagreement" is still there over whether this should go in like this (as a separate indexer) versus changing the existing "ByRecords" indexer. My preference, I think, would still be to just change the existing indexer and rely on test coverage to find any potential problems (or make our test coverage better if we think it's lacking somewhere). It's possible that we should take this any way, if you feel strongly, but I don't like the amount of duplicated code that this introduces, if I'm honest.
...yer-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/IndexingBase.java
Outdated
Show resolved
Hide resolved
...yer-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/IndexingBase.java
Outdated
Show resolved
Hide resolved
...yer-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/IndexingBase.java
Outdated
Show resolved
Hide resolved
@@ -192,7 +187,7 @@ private Index getSourceIndex(RecordMetaData metaData) { | |||
|
|||
return iterateRangeOnly(store, cursor, | |||
this::getRecordIfTypeMatch, | |||
lastResult, hasMore, recordsScanned) | |||
lastResult, hasMore, recordsScanned, true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I see; the idempotency requirement is a bit more intertwined with how the indexer updates than I guess I thought. I still think that this check here should base its interpretation of whether the index update is idempotent based on the actual index rather than hard-coding true
, though. The main reason is that it seems like it would be something that we'd easily forget to update if we came up with a way of building a non-idempotent index from another index. (For example, by checking if a record's corresponding index entry is in a built range.)
...r-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/IndexingCommon.java
Show resolved
Hide resolved
@@ -174,7 +174,7 @@ public IndexingScrubMissing(@Nonnull final IndexingCommon common, | |||
final long scanLimit = scrubbingPolicy.getEntriesScanLimit(); | |||
|
|||
return iterateRangeOnly(store, cursor, this::getRecordIfMissingIndex, | |||
lastResult, hasMore, recordsScanned) | |||
lastResult, hasMore, recordsScanned, true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm still of the opinion (as expressed in a comment elsewhere) that hard-coding this to true
could be easily missed if we ever decide we want to not restrict scrubbers to idempotent indexes
...er-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/OnlineIndexer.java
Show resolved
Hide resolved
...ava/com/apple/foundationdb/record/provider/foundationdb/OnlineIndexerIndexFromIndexTest.java
Show resolved
Hide resolved
…split some functions. This seems to be a complicated way to reduce complexity. Most methods are, and should be, a "single task" leaf, but some are skeletons/trunks - and implementing a certain "storyline" agorithm. Spreading this algorithm among multiple functions makes it anything but more readable.
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for Linux CentOS 7
|
Yes. I strongly feel that it's important that for the time being, while multi indexing is new, users can use the good ol' |
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for Linux CentOS 7
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, a few minor cleanup things, mainly around making sure we have issues filed for any follow up work we want to do post-merge
@@ -257,9 +247,10 @@ private Index getSourceIndex(RecordMetaData metaData) { | |||
final AtomicReference<RecordCursorResult<FDBIndexedRecord<Message>>> lastResult = new AtomicReference<>(RecordCursorResult.exhausted()); | |||
final AtomicBoolean hasMore = new AtomicBoolean(true); | |||
|
|||
final boolean isIdempotent = true ; // Note that currently indexing by index is online implemented for idempotent indexes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an issue about supporting non-idempotent indexes being built from another index? If so, could you link it here so that there's something for a future implementor of that to search for. If not, I think one should probably be filed (and linked here). We probably want something similar for the scrubbers as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also - #1433 (Support multi target indexes while indexing by a source index)
...er-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/OnlineIndexer.java
Outdated
Show resolved
Hide resolved
* This indexer scans records to build multiple indexes. | ||
*/ | ||
@API(API.Status.INTERNAL) | ||
public class IndexingMultiTargetByRecords extends IndexingBase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should probably be an issue filed about combining this with IndexingByRecords
(or maybe more generally about combining multi-target indexing with single-target indexing), even if this goes in as a separate indexer just for tracking purposes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kudos, SonarCloud Quality Gate passed! |
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for Linux CentOS 7
|
No description provided.