Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Force table / view location #8516

Merged

Conversation

adutra
Copy link
Contributor

@adutra adutra commented May 14, 2024

No description provided.

"Invalid create requirements: %s",
invalidRequirements);
String location = defaultTableLocation(decoded.warehouse(), key);
builder.addUpdate(setLocation(location));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if AssertCreate is missing? Aren't we losing the SetLocation update in that case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we are. That's the idea: ignore any locations sent by the client, and use our own.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean should we not do builder.addUpdate(setLocation(defaultLocation)) even if/when AssertCreate is not present?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think not, because in this case it's an update, so the table already exists at its default location. Am I missing something?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify: AssertCreate is a requirement that the client chooses to submit. Can we be sure that a table cannot be created without submitting an AssertCreate?

Copy link
Member

@dimas-b dimas-b May 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it seems that it is not possible to create a table without an AssertCreate now, it seems to me that it's a co-incidence rather than a deliberate check. From the REST API perspective an AssertCreate does not have to be present.

Should we set the catalog-managed location in CatalogServiceImpl perhaps rather than at the REST level?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I was leaning towards that already, and you convinced me. I will revisit this PR entirely.

@snazy snazy force-pushed the feature/nessie-catalog-server branch from 939f4c8 to a710182 Compare May 15, 2024 10:23
@dimas-b dimas-b self-requested a review May 15, 2024 15:26
@adutra
Copy link
Contributor Author

adutra commented May 15, 2024

I tried to push the location computation down to CatalogServiceImpl, however many tests are now failing, and I still have no clue why. Will investigate more tomorrow.

String baseLocation = warehouseLocation;
Namespace ns = key.getNamespace();
if (!ns.isEmpty()) {
baseLocation = concatLocation(warehouseLocation, ns.toString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using the key elements are probably easier to handle - would also prevent the namespace check/handling.

@@ -755,7 +757,33 @@ public static long safeUnbox(Long value, long defaultValue) {
return value != null ? value : defaultValue;
}

public static NessieTableSnapshot newIcebergTableSnapshot(List<IcebergMetadataUpdate> updates) {
public static String defaultIcebergLocation(String warehouseLocation, ContentKey key) {
String baseLocation = warehouseLocation;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, we should respect the "object store naming strategy" here - or maybe apply it unconditionally. @dimas-b wdyt?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense. Also, @adutra and I agreed (other comment thread) to tackle actual location construction in a follow-up PR :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you prefer to do this here, fine with me. You tell me :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe flat structure (no namespace) in this PR, plus more advanced strategies as a follow-up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by "object store naming strategy", is it whether to use path-style or virtual-host style?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah - with "naming strategy" I meant the "randomized prefix" from o.a.iLocationProviders.ObjectStoreLocationProvider - but it seems my brain was somewhere else, because this change is about the base location - the the actual object location. Please disregard my comment and ignore everything I said or recommended above.

// TODO: support TableProperties.WRITE_METADATA_LOCATION
String location =
String.format(
"%s_%s/metadata/00000-%s.metadata.json", baseLocation, randomUUID(), randomUUID());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the 1st UUID the table's ("fixed") UUID?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I was mixing base location and metadata json location. It's fixed now.

*
* <p>If not set, the default warehouse will be used.
*/
String warehouse();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be Optional<String>?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my first impl, but in fact there is no need of that because of how CatalogConfig.getWarehouse is implemented: it already handles nulls gracefully. I should have added @Nullable here though, I will fix that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WFM

@@ -505,16 +494,8 @@ private CompletionStage<MultiTableUpdate> applyIcebergTableCommitOperation(
})
.thenApply(
nessieSnapshot -> {
// TODO: support GZIP
// TODO: support TableProperties.WRITE_METADATA_LOCATION
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: We want ignore this property, yes.

@@ -138,14 +145,33 @@ public Uni<Void> commitTransaction(
commitTransactionRequest.tableChanges().stream()
.map(
tableChange -> {
List<IcebergMetadataUpdate> updates =
tableChange.updates().stream()
.filter(update -> !(update instanceof SetLocation))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wonder if we should make the SetLocation update-operation a noop?

tableChange.requirements().stream()
.filter(req -> !(req instanceof AssertCreate))
.collect(Collectors.toList());
checkArgument(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we already do this implicitly via the tableExists/viewExists parameter in IcebergUpdateRequirement?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal here is to make sure that, if assert-create is present, then no other requirement is present. I don't see this being done in IcebergUpdateRequirement.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. But wouldn't that not work with if (requirements.size()!=1) ?

dimas-b
dimas-b previously approved these changes May 16, 2024
// To avoid sharing same table path between two tables with same name, use uuid in the table
// path.
// Also: we deliberately ignore the TableProperties.WRITE_METADATA_LOCATION property here.
// TODO: support GZIP ??
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: How is GZIP related to base locations?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just moved the TODO form its old location to here. I can remove it.

if (!ns.isEmpty()) {
baseLocation = concatLocation(warehouseLocation, ns.toString());
}
baseLocation = concatLocation(baseLocation, key.getName());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we go with a flat namespace until we resolve #8524?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry... what is a flat namespace?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant no ns in the path, but I guess key.getName() can also contain special chars, so I'm ok leaving this method "as is" for now and addressing later, under #8524

@snazy snazy dismissed dimas-b’s stale review May 17, 2024 07:57

The merge-base changed after approval.

@snazy snazy force-pushed the feature/nessie-catalog-server branch from b8c6df1 to b1b16ff Compare May 17, 2024 07:57
dimas-b
dimas-b previously approved these changes May 17, 2024
@adutra adutra enabled auto-merge (squash) May 17, 2024 15:05
@adutra adutra merged commit 35af148 into projectnessie:feature/nessie-catalog-server May 17, 2024
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants