[C++][Java] Can I transfer Schema objects between languages without Arrow Flight? #39792

Closed
MicroGery opened this issue Jan 25, 2024 · 8 comments

Comments

@MicroGery

Describe the usage question you have. Please include as many useful details as possible.

I'm using Arrow 9.0. I have passed the byte array returned by Schema#toByteArray in Java to the C++ side, but I can't find a way to build an arrow::Schema object from that byte array. According to the earlier issue 37704, toByteArray has been deprecated in favor of the Schema#serializeAsMessage method.
So my question remains: can I construct an arrow::Schema object on the C++ side from a byte[] array, for subsequent RecordBatch construction?
Is there documentation that explains how to do this? Thanks~
In addition, I'm using GraalVM's native-image capabilities to transfer data between C++ and Java.

Component(s)

C++, FlightRPC, Java

@MicroGery
Author

cc @lidavidm

@MicroGery
Author

The byte array I got through the serializeAsMessage interface is
[-1, -1, -1, -1, 0, 3, 0, 0, 16, 0, 0, 0, 0, 0, 10, 0, 14, 0, 6, 0, 13, 0, 8, 0, 10, 0, 0, 0, 0, 0, 4, 0, 16, 0, 0, 0, 0, 1, 10, 0, 12, 0, 0, 0, 8, 0, 4, 0, 10, 0, 0, 0, 8, 0, 0, 0, 8, 0, 0, 0, 0, 0, 0, 0, 10, 0, 0, 0, -124, 2, 0, 0, 48, 2, 0, 0, -16, 1, 0, 0, -84, 1, 0, 0, 108, 1, 0, 0, 36, 1, 0, 0, -36, 0, 0, 0, -108, 0, 0, 0, 72, 0, 0, 0, 4, 0, 0, 0, -74, -3, -1, -1, 20, 0, 0, 0, 20, 0, 0, 0, 20, 0, 0, 0, 0, 0, 7, 1, 24, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -16, -2, -1, -1, 10, 0, 0, 0, 38, 0, 0, 0, 14, 0, 0, 0, 121, 116, 100, 95, 105, 109, 112, 97, 105, 114, 109, 101, 110, 116, 0, 0, -10, -3, -1, -1, 20, 0, 0, 0, 20, 0, 0, 0, 20, 0, 0, 0, 0, 0, 7, 1, 24, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 48, -1, -1, -1, 10, 0, 0, 0, 38, 0, 0, 0, 20, 0, 0, 0, 100, 101, 112, 114, 101, 99, 105, 97, 116, 105, 111, 110, 95, 114, 101, 115, 101, 114, 118, 101, 0, 0, 0, 0, 62, -2, -1, -1, 20, 0, 0, 0, 20, 0, 0, 0, 20, 0, 0, 0, 0, 0, 7, 1, 24, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 120, -1, -1, -1, 10, 0, 0, 0, 38, 0, 0, 0, 16, 0, 0, 0, 121, 116, 100, 95, 100, 101, 112, 114, 101, 99, 105, 97, 116, 105, 111, 110, 0, 0, 0, 0, -126, -2, -1, -1, 20, 0, 0, 0, 20, 0, 0, 0, 20, 0, 0, 0, 0, 0, 7, 1, 24, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -68, -1, -1, -1, 10, 0, 0, 0, 38, 0, 0, 0, 19, 0, 0, 0, 100, 101, 112, 114, 101, 99, 105, 97, 116, 105, 111, 110, 95, 97, 109, 111, 117, 110, 116, 0, -58, -2, -1, -1, 20, 0, 0, 0, 20, 0, 0, 0, 28, 0, 0, 0, 0, 0, 7, 1, 32, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 8, 0, 12, 0, 8, 0, 4, 0, 8, 0, 0, 0, 10, 0, 0, 0, 38, 0, 0, 0, 10, 0, 0, 0, 100, 101, 112, 114, 110, 95, 99, 111, 115, 116, 0, 0, 10, -1, -1, -1, 20, 0, 0, 0, 20, 0, 0, 0, 20, 0, 0, 0, 0, 0, 5, 1, 16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -8, -2, -1, -1, 17, 0, 0, 0, 95, 104, 111, 111, 100, 105, 101, 95, 102, 105, 108, 101, 95, 110, 97, 109, 101, 0, 0, 0, 70, -1, -1, -1, 20, 0, 0, 0, 20, 0, 0, 0, 20, 0, 0, 0, 0, 0, 5, 1, 16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 52, -1, -1, -1, 22, 0, 0, 0, 95, 104, 111, 111, 100, 105, 101, 95, 112, 
97, 114, 116, 105, 116, 105, 111, 110, 95, 112, 97, 116, 104, 0, 0, -122, -1, -1, -1, 20, 0, 0, 0, 20, 0, 0, 0, 20, 0, 0, 0, 0, 0, 5, 1, 16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 116, -1, -1, -1, 18, 0, 0, 0, 95, 104, 111, 111, 100, 105, 101, 95, 114, 101, 99, 111, 114, 100, 95, 107, 101, 121, 0, 0, -62, -1, -1, -1, 20, 0, 0, 0, 20, 0, 0, 0, 20, 0, 0, 0, 0, 0, 5, 1, 16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -80, -1, -1, -1, 20, 0, 0, 0, 95, 104, 111, 111, 100, 105, 101, 95, 99, 111, 109, 109, 105, 116, 95, 115, 101, 113, 110, 111, 0, 0, 18, 0, 24, 0, 20, 0, 19, 0, 18, 0, 12, 0, 0, 0, 8, 0, 4, 0, 18, 0, 0, 0, 20, 0, 0, 0, 20, 0, 0, 0, 24, 0, 0, 0, 0, 0, 5, 1, 20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 4, 0, 4, 0, 0, 0, 19, 0, 0, 0, 95, 104, 111, 111, 100, 105, 101, 95, 99, 111, 109, 109, 105, 116, 95, 116, 105, 109, 101, 0]
According to #13853, does this mean that the second 32-bit integer (0, 3, 0, 0) gives the length of the subsequent data as 768 bytes?
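
The 8-byte prefix asked about here can be checked by hand. The following is a minimal sketch, assuming the Arrow IPC encapsulated message layout: a 0xFFFFFFFF continuation marker followed by a little-endian 32-bit metadata length. `ReadIpcMetadataLength` is a hypothetical helper name, not an Arrow API, and the bytes are taken as signed because that is how a Java byte[] arrives.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of Arrow): reads the metadata length declared
// in the 8-byte prefix of an encapsulated IPC message.
// Returns false if the buffer is too short or the continuation marker is absent.
bool ReadIpcMetadataLength(const int8_t* bytes, std::size_t size,
                           int32_t* out_length) {
  if (bytes == nullptr || out_length == nullptr || size < 8) {
    return false;
  }
  // Bytes 0..3: continuation marker 0xFFFFFFFF, printed by Java as -1, -1, -1, -1.
  for (std::size_t i = 0; i < 4; ++i) {
    if (static_cast<uint8_t>(bytes[i]) != 0xFF) {
      return false;
    }
  }
  // Bytes 4..7: little-endian 32-bit length of the metadata that follows.
  uint32_t length = 0;
  for (std::size_t i = 0; i < 4; ++i) {
    length |= static_cast<uint32_t>(static_cast<uint8_t>(bytes[4 + i])) << (8 * i);
  }
  *out_length = static_cast<int32_t>(length);
  return true;
}
```

For the array above, whose prefix is -1, -1, -1, -1, 0, 3, 0, 0, this yields 0x0300 = 768.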

@lidavidm
Member

> Schema#toByteArray method in Java to the C++ side, but I didn't find a way to build an arrow::schema object from the byte array in arrow::Schema

This method will not work. Use MessageSerializer.

/// \brief Read Schema from stream serialized as a single IPC message
/// and populate any dictionary-encoded fields into a DictionaryMemo
///
/// \param[in] stream an InputStream
/// \param[in] dictionary_memo for recording dictionary-encoded fields
/// \return the output Schema
///
/// If record batches follow the schema, it is better to use
/// RecordBatchStreamReader
ARROW_EXPORT
Result<std::shared_ptr<Schema>> ReadSchema(io::InputStream* stream,
                                           DictionaryMemo* dictionary_memo);

@MicroGery
Author

jduo:37704-java-schema-tobytearray

    /**
     * Returns the serialized flatbuffer bytes of the schema wrapped in a message table.
     * Use {@link #deserializeMessage() to rebuild the Schema.}
     */
    public byte[] serializeAsMessage() {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      try (WriteChannel channel = new WriteChannel(Channels.newChannel(out))) {
        long size = MessageSerializer.serialize(
            new WriteChannel(Channels.newChannel(out)), this);
        return out.toByteArray();
      } catch (IOException ex) {
        throw new RuntimeException(ex);
      }
    }

Yes, I used the serializeAsMessage method, which uses MessageSerializer, and I got an array starting with 0xFFFFFFFF. According to @zeroshade, the valid length is 768 bytes, as indicated by the second 32-bit integer. Now I'm trying to construct an arrow::Schema object in C++ using the ReadSchema function you mentioned. How do I create an io::InputStream? Is it through Buffer or BufferReader?

@MicroGery
Author

MicroGery commented Jan 27, 2024

The byte array I got through MessageSerializer is
-1, -1, -1, -1, -24, 2, 0, 0, 16, 0, 0, 0, 0, 0, 10, 0, 14, 0, 6, 0, 13, 0, 8, 0, 10, 0, 0, 0, 0, 0, 4, 0, 16, 0, 0, 0, 0, 1, 10, 0, 12, 0, 0, 0, 8, 0, 4, 0, 10, 0, 0, 0, 8, 0, 0, 0, 8, 0, 0, 0, 0, 0, 0, 0, 10, 0, 0, 0, 104, 2, 0, 0, 20, 2, 0, 0, -44, 1, 0, 0, -112, 1, 0, 0, 80, 1, 0, 0, 24, 1, 0, 0, -44, 0, 0, 0, -116, 0, 0, 0, 64, 0, 0, 0, 4, 0, 0, 0, -46, -3, -1, -1, 20, 0, 0, 0, 20, 0, 0, 0, 20, 0, 0, 0, 0, 0, 5, 1, 16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -64, -3, -1, -1, 15, 0, 0, 0, 97, 115, 115, 101, 116, 95, 98, 111, 111, 107, 95, 99, 111, 100, 101, 0, 10, -2, -1, -1, 20, 0, 0, 0, 20, 0, 0, 0, 20, 0, 0, 0, 0, 0, 2, 1, 24, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -72, -1, -1, -1, 0, 0, 0, 1, 64, 0, 0, 0, 21, 0, 0, 0, 100, 101, 112, 114, 110, 95, 100, 105, 115, 116, 114, 105, 98, 117, 116, 105, 111, 110, 95, 105, 100, 0, 0, 0, 82, -2, -1, -1, 20, 0, 0, 0, 20, 0, 0, 0, 28, 0, 0, 0, 0, 0, 2, 1, 32, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 8, 0, 12, 0, 8, 0, 7, 0, 8, 0, 0, 0, 0, 0, 0, 1, 64, 0, 0, 0, 8, 0, 0, 0, 97, 115, 115, 101, 116, 95, 105, 100, 0, 0, 0, 0, -106, -2, -1, -1, 20, 0, 0, 0, 20, 0, 0, 0, 20, 0, 0, 0, 0, 0, 5, 1, 16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -124, -2, -1, -1, 20, 0, 0, 0, 116, 97, 120, 95, 100, 101, 99, 108, 97, 114, 97, 116, 105, 111, 110, 95, 105, 116, 101, 109, 0, 0, 0, 0, -42, -2, -1, -1, 20, 0, 0, 0, 20, 0, 0, 0, 20, 0, 0, 0, 0, 0, 5, 1, 16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -60, -2, -1, -1, 11, 0, 0, 0, 102, 97, 95, 116, 97, 120, 95, 116, 121, 112, 101, 0, 10, -1, -1, -1, 20, 0, 0, 0, 20, 0, 0, 0, 20, 0, 0, 0, 0, 0, 5, 1, 16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -8, -2, -1, -1, 17, 0, 0, 0, 95, 104, 111, 111, 100, 105, 101, 95, 102, 105, 108, 101, 95, 110, 97, 109, 101, 0, 0, 0, 70, -1, -1, -1, 20, 0, 0, 0, 20, 0, 0, 0, 20, 0, 0, 0, 0, 0, 5, 1, 16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 52, -1, -1, -1, 22, 0, 0, 0, 95, 104, 111, 111, 100, 105, 101, 95, 112, 97, 114, 116, 105, 116, 105, 111, 110, 95, 112, 97, 116, 104, 0, 0, -122, -1, -1, -1, 20, 0, 0, 0, 20, 0, 
0, 0, 20, 0, 0, 0, 0, 0, 5, 1, 16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 116, -1, -1, -1, 18, 0, 0, 0, 95, 104, 111, 111, 100, 105, 101, 95, 114, 101, 99, 111, 114, 100, 95, 107, 101, 121, 0, 0, -62, -1, -1, -1, 20, 0, 0, 0, 20, 0, 0, 0, 20, 0, 0, 0, 0, 0, 5, 1, 16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -80, -1, -1, -1, 20, 0, 0, 0, 95, 104, 111, 111, 100, 105, 101, 95, 99, 111, 109, 109, 105, 116, 95, 115, 101, 113, 110, 111, 0, 0, 18, 0, 24, 0, 20, 0, 19, 0, 18, 0, 12, 0, 0, 0, 8, 0, 4, 0, 18, 0, 0, 0, 20, 0, 0, 0, 20, 0, 0, 0, 24, 0, 0, 0, 0, 0, 5, 1, 20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 4, 0, 4, 0, 0, 0, 19, 0, 0, 0, 95, 104, 111, 111, 100, 105, 101, 95, 99, 111, 109, 109, 105, 116, 95, 116, 105, 109, 101, 0, 0, 0, 0, 0
Is it correct?

@lidavidm
Member

Use BufferReader.

@vibhatha
Collaborator

Has this issue been resolved, or do you need further information?

@MicroGery
Author

Sorry, it has been resolved. I'll close this issue.
