Remove JsonEqual trait #2296

Closed
tustvold opened this issue Aug 3, 2022 · 6 comments · Fixed by #2317
Labels
arrow Changes to the arrow crate enhancement Any new improvement worthy of a entry in the changelog question Further information is requested

Comments

@tustvold
Contributor

tustvold commented Aug 3, 2022

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

All Array implementations must implement the JsonEqual trait. However:

Describe the solution you'd like

Rather than maintaining parallel logic to interpret serde_json::Value as arrow arrays, I wonder if we can just remove the JsonEqual functionality and encourage conversion to arrow arrays instead. I'm not aware of a use case that would require highly performant comparison of arrow and JSON data.

Describe alternatives you've considered

Additional context

@tustvold tustvold added question Further information is requested enhancement Any new improvement worthy of a entry in the changelog labels Aug 3, 2022
@tustvold
Contributor Author

tustvold commented Aug 3, 2022

This appears to be part of the integration test plumbing, which makes its inclusion in the public API even more unusual?

FYI @viirya

@viirya
Member

viirya commented Aug 3, 2022

I think in the integration tests, JSON data is read and converted to arrow arrays before comparing? Does it compare JSON with arrow arrays directly?

@tustvold
Contributor Author

tustvold commented Aug 3, 2022

It at least appears to be using the JsonEqual functionality, but I'm not very familiar with this part of the codebase

@viirya
Member

viirya commented Aug 3, 2022

Oh, I will check it. I might have missed it.

@viirya
Member

viirya commented Aug 3, 2022

On second glance, the JsonEqual functionality is used by ArrowJson to compare with another RecordBatchReader. The underlying JSON record batch, ArrowJsonBatch, is compared with a RecordBatch.

Things like ArrowJson live under integration_util and are indeed used by the integration tests, but they seem to be used only as an intermediate format for reading JSON data and converting it to arrow arrays. I don't see JSON being compared directly with arrow arrays there (I may have missed it if it is deep).

Currently, the JsonEqual functionality seems to be used only in some IPC tests, where JSON files are read and compared with arrow arrays without conversion.

For those tests, I think we can follow the integration tests and convert JSON to arrow arrays before comparing.

I can take some time on this.
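
A minimal sketch of the convert-then-compare idea discussed above, written against hypothetical stand-in types rather than the real arrow or serde_json APIs. `Json`, `IntColumn`, `column_from_json`, and `json_matches_column` are all names invented for this illustration; the point is only that JSON gets interpreted in one place (the conversion) and the comparison reuses the column's ordinary equality, so no separate JsonEqual-style logic is needed.

```rust
// Hypothetical stand-ins: none of these types come from the arrow or serde_json crates.

/// A tiny JSON-like value, standing in for something like `serde_json::Value`.
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Null,
    Number(i64),
}

/// A tiny nullable integer column, standing in for an arrow array.
#[derive(Debug, Clone, PartialEq)]
struct IntColumn {
    values: Vec<Option<i64>>,
}

/// The single place where JSON is interpreted as column data.
fn column_from_json(items: &[Json]) -> IntColumn {
    IntColumn {
        values: items
            .iter()
            .map(|item| match item {
                Json::Null => None,
                Json::Number(n) => Some(*n),
            })
            .collect(),
    }
}

/// Compare by converting first, then reusing the column's derived `PartialEq`.
fn json_matches_column(items: &[Json], column: &IntColumn) -> bool {
    column_from_json(items) == *column
}
```

With this shape, a test would build the expected column from the JSON file once and then use plain equality, which is the same comparison path used for any other pair of columns.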

@viirya viirya self-assigned this Aug 4, 2022
@jhorstmann
Contributor

> For those tests, I think we can follow the integration tests and convert JSON to arrow arrays before comparing.

As another data point, in our integration tests we convert the resulting RecordBatches to JSON and compare them against the expected results. So removing the JsonEqual support should be fine.

@alamb alamb added the arrow Changes to the arrow crate label Aug 4, 2022
@alamb alamb changed the title Remove JsonEqual Remove JsonEqual trait Aug 5, 2022