Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interfaces for supporting zero copy into Protobuf (HasByteBuffer & Detachable) #7387

Closed
voidzcy opened this issue Sep 1, 2020 · 6 comments · Fixed by #10079
Closed

Interfaces for supporting zero copy into Protobuf (HasByteBuffer & Detachable) #7387

voidzcy opened this issue Sep 1, 2020 · 6 comments · Fixed by #10079
Labels
experimental API Issue tracks stabilizing an experimental API
Milestone

Comments

@voidzcy
Copy link
Contributor

voidzcy commented Sep 1, 2020

APIs for supporting zero-copy protobuf deserialization from ByteBuffers to protobuf messages. The marshaller can wrap ByteBuffer as ByteSting with UnsafeByteOperations, concatenate ByteStrings into a single RopeByteString and then create a CodedInputStream from it.

New interfaces for the inbound InputStream to allow the marshaller:

  • HasByteBuffer: access the underlying ByteBuffers directly without copying bytes
  • Detachable: keep the ByteBuffers around until the application code is done with using the protobuf messages.

More details and an example marshaller implementation can be seen in #8102 (comment).

@voidzcy voidzcy added the experimental API Issue tracks stabilizing an experimental API label Sep 1, 2020
@voidzcy voidzcy added this to the Next milestone Sep 3, 2020
@voidzcy voidzcy closed this as completed Sep 25, 2020
@ejona86 ejona86 removed this from the Next milestone Sep 28, 2020
@voidzcy voidzcy changed the title Interfaces for supporting zero copy into Protobuf Interfaces for supporting zero copy into Protobuf (HasByteBuffer & Detachable) May 14, 2021
@voidzcy voidzcy reopened this May 14, 2021
@voidzcy voidzcy added this to the Unscheduled milestone May 18, 2021
@bruto0
Copy link

bruto0 commented Jul 29, 2021

This is a great step towards memory efficiency, @voidzcy, thank you

What seems to be missing is a way to transfer ownership of detached buffers to the parsed message itself so they can be released for that message only when application code is done with it (without having to modify protoc-generated classes). This would be particularly handy with reactive-grpc

With Marshaller holding indistinguishable references to all of them, from all invocations of RPC methods you can only release all of them at once it seems which is not particularly useful

@bruto0
Copy link

bruto0 commented Jul 31, 2021

is zero-copy serialization being discussed anywhere, btw?

@voidzcy
Copy link
Contributor Author

voidzcy commented Aug 6, 2021

With Marshaller holding indistinguishable references to all of them, from all invocations of RPC methods you can only release all of them at once it seems which is not particularly useful

You don't have to hold buffers until finishing all RPC invocations, you can release each time you finished using the current message. This is an advanced feature originally introduced to support some Google Cloud client usages, and we have not discussed or documented it publicly. If you are interested, you could look at the example usage in GoogleCloudPlatform/grpc-gcp-java#77.

@bruto1
Copy link

bruto1 commented Aug 6, 2021

that's a simple but effective trick I didn't think of (map and popStream()), thank you

@anuraaga
Copy link
Contributor

anuraaga commented Aug 19, 2021

is zero-copy serialization being discussed anywhere, btw?

I am interested in this too, especially in the context of using a CodedOutputStream which has some unnecessary buffering when writing into a OutputStream. I looked into what it might take, and unlike reading, writing is more complicated because the underlying's buffer indexes must be updated. This makes it not so simple as just returning a ByteBuffer to be written to. A natural API could be to return WritableBuffer itself but WritableBuffer isn't part of grpc-api, and doesn't solve the issue for proto which would have no way of writing to it without wrapping in a ByteOutput, which ends up being buffered just like an OutputStream. An interface that would work is something like write(Consumer<ByteBuffer>) - in that case the implementation can create a ByteBuffer, apply the consumer, then update the writer index based on the state of the bytebuffer after the write. I guess the API would be quite awkward in a Java 7 project though but other ideas don't come to mind.

Wondering if anyone else has thought of approaches for the serialization side. For reference, Armeria's able to use the fast path since it doesn't have buffer abstractions

https://github.com/line/armeria/blob/master/grpc/src/main/java/com/linecorp/armeria/internal/common/grpc/GrpcMessageMarshaller.java#L202

but for Netty, the pattern through gRPC's buffer abstractions would need to look the same, serialize into a ByteBuffer and update the writerIndex.

@ejona86
Copy link
Member

ejona86 commented Apr 19, 2023

@anuraaga, the writing path is discussed in the (long) #1054. Without reading through it too closely, I think this point might be a play to start reading: #1054 (comment)

API review:

  • Ready to stabilize
  • We've developed quite a random collection of interfaces that aren't all that discoverable. We could put some documentation in marshaller that references these interfaces

ejona86 added a commit to ejona86/grpc-java that referenced this issue Apr 19, 2023
ejona86 added a commit that referenced this issue Apr 19, 2023
@ejona86 ejona86 modified the milestones: Unscheduled, 1.55 Apr 20, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 20, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
experimental API Issue tracks stabilizing an experimental API
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants