Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve error propagation #812

Open
wjones127 opened this issue Sep 15, 2022 · 4 comments
Open

Improve error propagation #812

wjones127 opened this issue Sep 15, 2022 · 4 comments
Labels
enhancement New feature or request

Comments

@wjones127
Copy link
Collaborator

Description

I'm starting to think we could do a better job of propagating errors, particularly up to Python. For example, we have to detect errors in Python through strings right now:

except PyDeltaTableError as err:
# TODO: There has got to be a better way...
if "Not a Delta table" in str(err):
return None
elif "cannot find" in str(err):
return None
elif "No such file or directory" in str(err):
return None
else:
raise

I'll try to think more about this, but if anyone's seen another library that approaches this well, I'd love a reference.

Use Case

Related Issue(s)

@wjones127 wjones127 added the enhancement New feature or request label Sep 15, 2022
@roeap
Copy link
Collaborator

roeap commented Sep 15, 2022

Personally, I like the approach we took in object_store. So far I am not completely sure it applies here, but the general idea would be to try and isolate all module specific errors and make them private in their modules. Then map these errors to the top-level error DeltaTableError. Everything were we feel a user / consumer would act directly on, would get a top level variant, while everything else gets "swallowed" in the Generic variant with a Box'ed source, so you can get at least a very informative message. This is mostly achieved by having very specific error in the modules, and the generic ones just get printed ... The challenge here might be that we have potentially much more error variants that we would want to expose then in object store.

On the python side we would then of course have to map this again to python native or custom errors we define there ...

Recently I took one step in that direction in #784.

@tustvold
Copy link

tustvold commented Sep 20, 2022

FWIW there is an issue in the arrow repo related to error handling that might be of interest - apache/arrow-rs#2725

@wjones127
Copy link
Collaborator Author

This error in particular seems to obscure a lot of authentication and connectivity issues:

deltalake.PyDeltaTableError: Failed to load checkpoint: Invalid JSON in checkpoint: expected value at line 1 column 1

@houqp
Copy link
Member

houqp commented Oct 5, 2022

@wjones127 perhaps there is a change of semantic in how objectstore-rs handles 403 errors compared to the previous rusoto implementation? The current get_last_checkpoint implementation assumes object storage backend to return an Err on 403, which doesn't seem like what's happening here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants