-
Notifications
You must be signed in to change notification settings - Fork 470
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ORC-984: Save the software version that wrote each ORC file. #904
Conversation
The software version in the API and tool shows up as "ORC Java 1.8.0-SNAPSHOT". When run from IntelliJ the unit tests can't find the software version, so it comes in "ORC Java unknown". |
I haven't updated the C++ reader, writer, and tool yet. |
BTW, according to GitHub Action jobs, all Java11+ seems to complain for some reasons. |
This should be a bug in findbugs. Java 11 compiles try-with-resources to generate bytecode differently than java8. Findbugs analyze bytecode matche RCN_REDUNDANT_NULLCHECK_OF_NONNULL_VALUE. |
Ok, the changes for C++ were relatively straightforward. I did update the C++ with the new writer versions, such as the important ORC-14. I updated the orc_split_elim_new.orc so that it was written by the 1.8.0-SNAPSHOT. I also added orc_split_elim_cpp.orc that is the same file written by c++. The Java and C++ tools show the files as having being written by the corresponding versions. |
BTW, to generate the new c++ ORC file, I wrote a little utility that takes an ORC file and rewrites it using c++. It is in https://github.com/omalley/orc/tree/cpp-convert . |
So, everything is ready, right, @omalley ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM. Thank you, @omalley .
Merged to main/1.7/1.6/1.5 according to the PR content.
I add a string to the file footer that records the version that wrote the file. We already had recorded the implementation that wrote the file as Footer.writer. I added a method that combines these two fields in the reader to produce a user facing string that describes the software version. I also add a field to the meta data tool to show the version that wrote the file. Because of that change and the fact that the files change size based on whether the ORC version is a snapshot or not, I had to extend the tests for TestFileDump to allow some slop for the size and to ignore the file version. It passes the unit tests after I updated the tools tests. (cherry picked from commit cf720d7) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
I add a string to the file footer that records the version that wrote the file. We already had recorded the implementation that wrote the file as Footer.writer. I added a method that combines these two fields in the reader to produce a user facing string that describes the software version. I also add a field to the meta data tool to show the version that wrote the file. Because of that change and the fact that the files change size based on whether the ORC version is a snapshot or not, I had to extend the tests for TestFileDump to allow some slop for the size and to ignore the file version. It passes the unit tests after I updated the tools tests. (cherry picked from commit cf720d7) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit 7e3d557) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
I add a string to the file footer that records the version that wrote the file. We already had recorded the implementation that wrote the file as Footer.writer. I added a method that combines these two fields in the reader to produce a user facing string that describes the software version. I also add a field to the meta data tool to show the version that wrote the file. Because of that change and the fact that the files change size based on whether the ORC version is a snapshot or not, I had to extend the tests for TestFileDump to allow some slop for the size and to ignore the file version. It passes the unit tests after I updated the tools tests. (cherry picked from commit cf720d7) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit 7e3d557) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit 2aa9b06) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
What changes were proposed in this pull request?
I add a string to the file footer that records the version that wrote the file. We already had recorded the implementation that wrote the file as Footer.writer. I added a method that combines these two fields in the reader to produce a user facing string that describes the software version.
I also add a field to the meta data tool to show the version that wrote the file.
Because of that change and the fact that the files change size based on whether the ORC version is a snapshot or not, I had to extend the tests for TestFileDump to allow some slop for the size and to ignore the file version.
How was this patch tested?
It passes the unit tests after I updated the tools tests.