Add Run End Encoding DataType #3534

Closed
wants to merge 3 commits into from

viirya
Member

@viirya viirya commented Jan 16, 2023

Which issue does this PR close?

Part of #3520.

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

@github-actions github-actions bot added the arrow Changes to the arrow crate label Jan 16, 2023
@viirya viirya marked this pull request as draft January 16, 2023 01:36
@github-actions github-actions bot added the parquet Changes to the parquet crate label Jan 16, 2023
@viirya viirya marked this pull request as ready for review January 16, 2023 03:44
arrow-data/src/data.rs (outdated, resolved)
///
/// These child arrays are prescribed the standard names of "run_ends" and "values"
/// respectively.
RunEndEncodedType(Box<Field>, Box<Field>),
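
For readers unfamiliar with the layout, here is a minimal sketch of how the two child arrays relate. It uses plain Rust slices rather than arrow-rs arrays, and `decode_run_ends` is a hypothetical helper that is not part of this PR. It assumes, as in the Arrow columnar format, that each entry of `run_ends` is the exclusive logical index at which the corresponding run stops.

```rust
/// Expand a run-end encoded sequence into its logical values.
///
/// `run_ends[i]` is the logical index at which run `i` stops (exclusive),
/// and `values[i]` is the value repeated throughout that run.
/// Hypothetical helper for illustration only.
fn decode_run_ends<T: Clone>(run_ends: &[i32], values: &[T]) -> Vec<T> {
    assert_eq!(run_ends.len(), values.len(), "one value per run");
    let mut out = Vec::new();
    let mut start = 0i32;
    for (&end, value) in run_ends.iter().zip(values) {
        assert!(end > start, "run ends must be strictly increasing");
        // Emit the value once for every logical position in this run.
        for _ in start..end {
            out.push(value.clone());
        }
        start = end;
    }
    out
}
```

With `run_ends = [2, 5, 6]` and `values = ["a", "b", "c"]`, the logical array has six elements: two of `"a"`, three of `"b"`, and one of `"c"`.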
Contributor

I am wondering what the difference is between using Box<Field> and Box<DataType>. Dictionary uses Box<DataType>, whereas Struct uses Box<Field>. I think run_ends should just be a DataType, as it is very similar to a Buffer, only stored in a child array.

Member Author

@viirya viirya Jan 16, 2023

The run-end encoded type has only child arrays and no buffers of its own. Similar to Struct, I treat it as two Fields. I think it makes sense for values to be a Field, since it may be a dictionary; as far as I remember, it needs to be a Field for IPC serialization of dictionaries. run_ends is just a primitive array, so it could be just a DataType, I think.
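
To make the trade-off in this thread concrete, here is a minimal sketch of the two shapes being compared. The `DataType` and `Field` below are simplified, hypothetical stand-ins and not the arrow-rs definitions; the `dict_id` member is an assumption added only to illustrate the kind of per-field information that a bare `DataType` cannot carry.

```rust
/// Simplified, hypothetical stand-in for the arrow-rs `DataType` enum.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Utf8,
    /// Key type and value type, both bare `DataType`s.
    Dictionary(Box<DataType>, Box<DataType>),
    /// Shape proposed in this PR: both children are full `Field`s.
    RunEndEncoded(Box<Field>, Box<Field>),
}

/// Simplified, hypothetical stand-in for `Field`: a `DataType` plus
/// a name, nullability, and (assumed here) a dictionary id.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
    dict_id: i64,
}

impl Field {
    fn new(name: &str, data_type: DataType, nullable: bool) -> Self {
        Field { name: name.to_string(), data_type, nullable, dict_id: 0 }
    }
}

/// Build a run-end encoded type whose `values` child is dictionary-encoded.
fn ree_of_dictionary() -> DataType {
    // Run ends are plain integers and are never null.
    let run_ends = Field::new("run_ends", DataType::Int32, false);
    // The values child may be any type, including a dictionary.
    let dictionary = DataType::Dictionary(Box::new(DataType::Int32), Box::new(DataType::Utf8));
    let values = Field::new("values", dictionary, true);
    DataType::RunEndEncoded(Box::new(run_ends), Box::new(values))
}
```

In this sketch the `run_ends` child gains nothing from being a `Field` beyond its fixed name, which is the reviewer's point, while the `values` child keeps its nullability and any dictionary information alongside its type.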

@tustvold
Contributor

Closing as I believe this has been superseded by #3553

@tustvold tustvold closed this Jan 25, 2023
Labels
arrow (Changes to the arrow crate), parquet (Changes to the parquet crate)
3 participants