Do not check schema for equality in concat_batches #4815

Merged
merged 1 commit into from Sep 16, 2023

Conversation

alamb
Contributor

@alamb alamb commented Sep 13, 2023

Which issue does this PR close?

Closes #4800
Closes #4799

Rationale for this change

Per the discussion on #4801, this PR stops validating that the schemas of the concatenated batches are exactly equal and instead uses the provided schema.

While this may hide some subtle bugs in downstream crates, it avoids having arrow-rs enforce an invariant (e.g. that schema equality also includes field metadata) that downstream crates do not themselves enforce.

Note that mismatched types still generate a useful error message, which should catch any egregious errors.
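To make the intended behavior concrete, here is a minimal, std-only sketch of the semantics described above. The `DataType`, `Field`, `Schema`, `Column`, and `Batch` types, the `field` helper, and this `concat_batches` function are all hypothetical, simplified stand-ins written for illustration; they are not the arrow-rs types or the actual implementation in this PR. The sketch assumes that the output always takes the caller-provided schema, that per-batch schemas (including metadata) are never compared, and that only a column whose data type disagrees with the provided schema produces an error.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins for arrow-rs types (illustration only).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Utf8,
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    metadata: HashMap<String, String>,
}

#[derive(Debug, Clone, PartialEq)]
struct Schema {
    fields: Vec<Field>,
}

#[derive(Debug, Clone, PartialEq)]
enum Column {
    Int32(Vec<i32>),
    Utf8(Vec<String>),
}

impl Column {
    fn data_type(&self) -> DataType {
        match self {
            Column::Int32(_) => DataType::Int32,
            Column::Utf8(_) => DataType::Utf8,
        }
    }
}

#[derive(Debug, Clone)]
struct Batch {
    schema: Schema,
    columns: Vec<Column>,
}

// Hypothetical helper that builds a field with optional metadata.
fn field(name: &str, data_type: DataType, metadata: &[(&str, &str)]) -> Field {
    Field {
        name: name.to_string(),
        data_type,
        metadata: metadata
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect(),
    }
}

// Sketch of the post-PR semantics: no schema equality check, the provided
// schema is used for the output, and mismatched column types still fail.
fn concat_batches(schema: &Schema, batches: &[Batch]) -> Result<Batch, String> {
    // Start with one empty output column per field of the provided schema.
    let mut columns: Vec<Column> = schema
        .fields
        .iter()
        .map(|f| match f.data_type {
            DataType::Int32 => Column::Int32(Vec::new()),
            DataType::Utf8 => Column::Utf8(Vec::new()),
        })
        .collect();

    for batch in batches {
        if batch.columns.len() != columns.len() {
            return Err(format!(
                "expected {} columns, found {}",
                columns.len(),
                batch.columns.len()
            ));
        }
        for (i, (out, col)) in columns.iter_mut().zip(&batch.columns).enumerate() {
            match (out, col) {
                (Column::Int32(dst), Column::Int32(src)) => dst.extend_from_slice(src),
                (Column::Utf8(dst), Column::Utf8(src)) => dst.extend_from_slice(src),
                // Type mismatches are reported; metadata is never inspected.
                (_, other) => {
                    return Err(format!(
                        "column {}: expected {:?}, found {:?}",
                        i,
                        schema.fields[i].data_type,
                        other.data_type()
                    ))
                }
            }
        }
    }

    Ok(Batch {
        schema: schema.clone(),
        columns,
    })
}
```

Under these assumptions, two batches whose schemas differ only in field metadata concatenate successfully and the result carries the provided schema, while a batch that holds a `Utf8` column where the provided schema says `Int32` returns an `Err` naming the offending column.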

What changes are included in this PR?

  1. Remove the check for schema equality
  2. Update existing test
  3. Add new test showing what happens with mismatched types

Are there any user-facing changes?

@alamb
Contributor Author

alamb commented Sep 14, 2023

I plan to leave this open for another day or two in case anyone else has feedback on the approach

Labels
arrow Changes to the arrow crate
Development

Successfully merging this pull request may close these issues.

concat_batches errors with "schema mismatch" error when only metadata differs