Update thrift v0.16 and vendor parquet-format (#2502) #2626
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1 @@ | ||
r/R/RcppExports.R linguist-generated=true | ||
r/R/arrowExports.R linguist-generated=true | ||
r/src/RcppExports.cpp linguist-generated=true | ||
r/src/arrowExports.cpp linguist-generated=true | ||
r/man/*.Rd linguist-generated=true | ||
|
||
parquet/src/format.rs linguist-generated | ||
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -60,7 +60,18 @@ Run `cargo bench` for benchmarks. | |
To build documentation, run `cargo doc --no-deps`. | ||
To compile and view in the browser, run `cargo doc --no-deps --open`. | ||
|
||
## Update Supported Parquet Version | ||
## Update Parquet Format | ||
|
||
To update Parquet format to a newer version, check if [parquet-format](https://github.com/sunchao/parquet-format-rs) | ||
version is available. Then simply update version of `parquet-format` crate in Cargo.toml. | ||
To generate the parquet format code run | ||
|
||
|
||
``` | ||
$ git clone https://github.com/apache/thrift | ||
$ cd thrift | ||
$ git checkout v0.16.0 | ||
# docker build just builds a docker image with thrift dependencies | ||
$ docker build -t thrift build/docker/ubuntu-bionic | ||
|
||
# build/docker/scripts/cmake.sh actually compiles thrift | ||
$ docker run -v $(pwd):/thrift/src -it thrift build/docker/scripts/cmake.sh && wget https://raw.githubusercontent.com/apache/parquet-format/apache-parquet-format-2.9.0/src/main/thrift/parquet.thrift && ./cmake_build/compiler/cpp/bin/thrift --gen rs parquet.thrift | ||
This command did not complete successfully for me. ...
441: ----------------------------------------------------------------------
441: Ran 2 tests in 0.012s
441:
441: OK
441: Traceback (most recent call last):
441: File "/thrift/src/test/py/TestServer.py", line 403, in <module>
441: from thrift.TMultiplexedProcessor import TMultiplexedProcessor
441: File "/thrift/src/lib/py/build/lib.linux-x86_64-2.7/thrift/TMultiplexedProcessor.py", line 20, in <module>
441: from thrift.Thrift import TProcessor, TMessageType
441: ImportError: No module named Thrift
441: t.py
441: ----
441: ----------------
441: Executing Client/Server tests with various generated code directories
441: Servers to be tested: TSimpleServer, TThreadedServer, TThreadPoolServer, TNonblockingServer, THttpServer, TProcessPoolServer, TForkingServer
441: Directories to be tested: gen-py-default, gen-py-slots, gen-py-oldstyle, gen-py-no_utf8strings, gen-py-dynamic, gen-py-dynamicslots
441: Protocols to be tested: accel, accelc, binary, compact, json, header
441: Options to be tested: ZLIB(yes/no), SSL(yes/no)
441: ----------------
441:
441: Test run #0: (includes gen-py-default) Server=TSimpleServer, Proto=accel, zlib=False, SSL=False
441: Testing server TSimpleServer: /usr/bin/python /thrift/src/test/py/TestServer.py --protocol=accel --port=9090 TSimpleServer
441: FAIL: Server process (/usr/bin/python /thrift/src/test/py/TestServer.py --protocol=accel --port=9090 TSimpleServer) failed with retcode 1
441: Traceback (most recent call last):
441: File "/thrift/src/test/py/RunClientServer.py", line 323, in <module>
441: sys.exit(main())
441: File "/thrift/src/test/py/RunClientServer.py", line 315, in main
441: tests.test_feature('gendir', generated_dirs)
441: File "/thrift/src/test/py/RunClientServer.py", line 230, in test_feature
441: if self.run(conf, test_count):
441: File "/thrift/src/test/py/RunClientServer.py", line 219, in run
441: runServiceTest(self.libdir, self.genbase, genpydir, try_server, try_proto, self.port, with_zlib, with_ssl, self.verbose)
441: File "/thrift/src/test/py/RunClientServer.py", line 157, in runServiceTest
441: ensureServerAlive()
441: File "/thrift/src/test/py/RunClientServer.py", line 140, in ensureServerAlive
441: % (server_class, ' '.join(server_args)))
441: Exception: Server subprocess TSimpleServer died, args: /usr/bin/python /thrift/src/test/py/TestServer.py --protocol=accel --port=9090 TSimpleServer
441/441 Test #441: python_test .......................***Failed 28.49 sec
99% tests passed, 6 tests failed out of 441
Total Test time (real) = 250.09 sec
The following tests FAILED:
382 - TInterruptTest (Failed)
401 - TNonblockingSSLServerTest (Failed)
403 - SecurityTest (Failed)
404 - SecurityFromBufferTest (Failed)
433 - PythonTestSSLSocket (Failed)
441 - python_test (Failed)
Errors while running CTest
Yeah, I ended up just using the version vended by Arch Linux... Perhaps I will just update the instructions to do that 🤔
Yeah -- maybe docker run a
Done, although of course the formatting is now different... But at least we now have a one-liner
||
``` | ||
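Because the Docker build above also runs the thrift test suite, one option raised in review is to use a thrift compiler that is already installed. The sketch below checks that such a compiler reports the pinned release before any code is generated. `check_thrift_version` is a hypothetical helper name, and the sketch assumes the version number is the last word printed by `thrift --version`.

```shell
# Sketch only: confirm a thrift compiler matches the pinned release.
# check_thrift_version is a hypothetical helper, not part of this repository.
check_thrift_version() {
    local thrift_bin="$1"   # path or name of the thrift compiler to check
    local expected="$2"     # pinned release, for example 0.16.0

    local reported
    # Ask the compiler for its version string; fail if it cannot be run.
    if ! reported="$("$thrift_bin" --version 2>/dev/null)"; then
        echo "error: could not run $thrift_bin" >&2
        return 1
    fi

    # Assumption: the version number is the last whitespace-separated word.
    local reported_version="${reported##* }"
    if [ "$reported_version" != "$expected" ]; then
        echo "error: expected thrift $expected, found $reported_version" >&2
        return 1
    fi

    echo "thrift $reported_version OK"
}
```

For example, `check_thrift_version thrift 0.16.0 && thrift --gen rs parquet.thrift` would only generate code when the installed compiler matches.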
|
||
Then copy the generated `parquet.rs` into `src/format.rs` and commit changes. |
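
The copy step could be scripted roughly as follows. This is a minimal sketch, not something the PR adds: `install_generated` is a hypothetical helper name, and the paths passed to it are whatever the generation step produced. It refuses to overwrite the target when the generated file is missing or empty, which is what a failed thrift run would leave behind.

```shell
# Sketch only: copy thrift-generated Rust code into the crate.
# install_generated is a hypothetical helper, not part of this repository.
install_generated() {
    local generated="$1"   # for example parquet.rs from `thrift --gen rs parquet.thrift`
    local target="$2"      # for example src/format.rs

    # A missing or empty file means generation did not succeed.
    if [ ! -s "$generated" ]; then
        echo "error: $generated is missing or empty" >&2
        return 1
    fi

    # Create the destination directory if needed, then copy.
    mkdir -p "$(dirname "$target")"
    cp "$generated" "$target"
    echo "copied $generated -> $target"
}
```

Running `install_generated parquet.rs src/format.rs` from the crate directory would then leave the change ready to review and commit.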
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -31,9 +31,8 @@ rust-version = "1.62" | |
|
||
[dependencies] | ||
ahash = "0.8" | ||
parquet-format = { version = "4.0.0", default-features = false } | ||
bytes = { version = "1.1", default-features = false, features = ["std"] } | ||
thrift = { version = "0.13", default-features = false } | ||
thrift = { version = "0.16", default-features = false } | ||
👍
||
snap = { version = "1.0", default-features = false, optional = true } | ||
brotli = { version = "3.3", default-features = false, features = ["std"], optional = true } | ||
flate2 = { version = "1.0", default-features = false, features = ["rust_backend"], optional = true } | ||
|
👍