Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Panic on Key Overflow in Dictionary Builders #3562

Closed
tustvold opened this issue Jan 19, 2023 · 3 comments · Fixed by #3563
Closed

Panic on Key Overflow in Dictionary Builders #3562

tustvold opened this issue Jan 19, 2023 · 3 comments · Fixed by #3563
Labels
arrow Changes to the arrow crate parquet Changes to the parquet crate question Further information is requested

Comments

@tustvold
Copy link
Contributor

tustvold commented Jan 19, 2023

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

I suspect this will be a controversial issue, but there a couple of reasons for proposing this

  • Rust collections and accompanying traits like FromIterator and Extend assume infallibility
  • We currently panic on offset overflow for StringArray, ByteArray, ListArray, etc...
  • We already panic on offset overflow in FromIterator for DictionaryArray
  • The error is just going to be surfaced to the user anyway, a panic is perfectly acceptable for this imo

Describe the solution you'd like

Panic on key overflow instead of returning an error

Describe alternatives you've considered

We could implement Extend and panic on key overflow there, much like we do for FromIterator, but I feel we should be consistent about this.

We could also implement our own TryExtend trait or something similar.

Additional context

#2725 tracks a more holistic rework of how we handle errors

@tustvold tustvold added enhancement Any new improvement worthy of a entry in the changelog question Further information is requested and removed enhancement Any new improvement worthy of a entry in the changelog labels Jan 19, 2023
@alamb
Copy link
Contributor

alamb commented Jan 19, 2023

My personal opinion is that the client code should be able to avoid panic's if the user desires, but that doesn't mean the API needs to be fallible.

For example, as long as it is possible to pre-compute / estimate when a panic would occur (either explained via docs or some sort of calculation with https://docs.rs/arrow/31.0.0/arrow/array/struct.ArrayData.html#method.get_slice_memory_size or something) then 👍

@tustvold
Copy link
Contributor Author

label_issue.py automatically added labels {'arrow'} from #3563

@tustvold tustvold added arrow Changes to the arrow crate parquet Changes to the parquet crate labels Jan 27, 2023
@tustvold
Copy link
Contributor Author

label_issue.py automatically added labels {'parquet'} from #3553

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate parquet Changes to the parquet crate question Further information is requested
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants