
grpc-js: Allow per-channel request compression from the client and decompression from the server #1952

Merged
merged 26 commits into grpc:master on Nov 15, 2021

Conversation

@b0b3rt (Contributor) commented Oct 25, 2021

This pull request implements deflate and gzip compression on a per-channel basis (configured when instantiating a new client) and the corresponding decompression on the server side.

For the client, this is done through the addition of the grpc.default_compression_algorithm option to ChannelOptions, which is translated to the appropriate grpc-encoding header.
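
For illustration, opting a client into gzip might look like the sketch below (TestServiceClient is a hypothetical generated client class, and CompressionAlgorithms is assumed to be exported from the package, as discussed later in this thread):

import * as grpc from '@grpc/grpc-js';

// TestServiceClient is a hypothetical generated client class.
const client = new TestServiceClient(
  'localhost:50051',
  grpc.credentials.createInsecure(),
  {
    // Translated into the grpc-encoding header on outgoing requests.
    'grpc.default_compression_algorithm': grpc.CompressionAlgorithms.gzip,
  }
);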

The server reads the grpc-encoding header and attempts the corresponding decompression, and fails with the status INTERNAL if the decompression fails, per the spec.

Support for this functionality has been requested in this issue.

I also recently found myself in a position where it'd be nice to have, so decided to give the whole open source thing a go :)

Other notes:

  • changes to package.json: when I first pulled down the repo and started running the setup scripts, they failed because a recent version of @types/node changed the type of Socket.localAddress from string to string?, causing compilation failures. If this change is actually a fix reflecting a pre-existing truth (that localAddress could be missing/undefined at runtime), that could indicate a fix is also required in Server.getChannelzSessionInfoGetter and Subchannel.getChannelzSocketInfo, but that would be better addressed by a separate PR. The build similarly complained about @types/semver not being installed.
  • newly generated TestService files - these are used as fixtures for the tests I added to cover the new functionality, similar to how the files generated for channelz.proto are used in test-channels.ts.
  • usage of message flags - I'm not totally clear on the original motivation for using flags passed down through the message context to decide whether compression occurs (in compression-filter.ts). As far as I could determine, there isn't any way for a user to pass down flags when sending a message, so trying to use them to enable/disable compression would require strictly more work than just checking the ChannelOptions. Happy to modify this if I'm missing something.

linux-foundation-easycla bot commented Oct 25, 2021

CLA Signed

The committers are authorized under a signed CLA.

@murgatroid99 (Member) left a comment

Thank you for working on this. It would be great to have this implemented.

However, some things need to be changed:

First, please do not put test code in src/ or proto/. test_service.proto already exists in test/fixtures/, and it shouldn't move from there. The corresponding generated code can go in test/generated/. This will require a new script line in package.json to generate it. channelz.proto and the corresponding generated code are used in src/channelz.ts, which is why they belong in proto/ and src/ respectively.

Second, now that the client can send using different compression methods, it should be handling the server's grpc-accept-encoding header. If the server does not accept the encoding the client said it was using, the client should disable compression on all messages it sends after receiving that header.

Third, the purpose of the NoCompress write flag is to disable compression on a per-message basis, overriding what the channel does overall. So, it still needs to be taken into account. Write flags can currently be set, at least for streaming calls. The user passes it here, in the encoding parameter of call.write, where it becomes context.flags. Then context.flags becomes writeObj.flags here.

Fourth, it would probably be helpful to export CompressionAlgorithms, either as currently written or as an enum, so that users can use it to more easily determine the value to pass to the new channel option.

@@ -32,6 +33,7 @@ import { StreamDecoder } from './stream-decoder';
 import { ObjectReadable, ObjectWritable } from './object-stream';
 import { ChannelOptions } from './channel-options';
 import * as logging from './logging';
+import { MetadataValue } from '.';
@murgatroid99 (Member):

There's no real reason to represent the encoding as a MetadataValue through the code here. The header key doesn't end with -bin, so the value is just a string.

@b0b3rt (Author):

MetadataValue is just what's returned from calling .get when we initially pull out the encoding:
const encoding: MetadataValue | undefined = metadata.get('grpc-encoding')[0];

I can coerce it to a string if that'll reduce potential confusion, though.

@murgatroid99 (Member):

We could narrow the actual return type in the Metadata definition using string literal types, but that would require a big jump in the minimum compiler version needed to use this library, so I'm not yet sure it's a good idea. In the meantime, I think this is a valid use of type coercion: we know that certain properties of the string guarantee certain narrower types, and the compiler just isn't aware of that.

So, in short, yes, please coerce the encoding to a string.
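
A minimal sketch of that coercion, based on the snippet quoted above:

// 'grpc-encoding' is not a -bin key, so its value (when present) is a
// plain string; the assertion just narrows the MetadataValue union.
const encoding = metadata.get('grpc-encoding')[0] as string | undefined;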

@b0b3rt (Author):

Sounds good, changed - I suppose it's also part of the spec that the grpc-encoding header needs to be a string value.


@@ -36,6 +36,7 @@ export interface ChannelOptions {
 'grpc.enable_http_proxy'?: number;
 'grpc.http_connect_target'?: string;
 'grpc.http_connect_creds'?: string;
+'grpc.default_compression_algorithm'?: number;
@b0b3rt (Author):

@murgatroid99 what do you think of changing this from number to keyof typeof CompressionAlgorithms (resulting in 0 | 1 | 2)? That'll probably be more legible to consumers of the library than just exporting CompressionAlgorithms.

@murgatroid99 (Member):

That seems reasonable to me.

@b0b3rt (Author):

Ok, done. I also moved CompressionAlgorithms into its own file to avoid a circular dependency.
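
For reference, a sketch of the object-literal shape under discussion (entries assumed from the 0/1/2 values used elsewhere in this thread):

export const CompressionAlgorithms = {
  0: 'identity',
  1: 'deflate',
  2: 'gzip',
} as const;

// keyof typeof CompressionAlgorithms is the numeric union 0 | 1 | 2
type CompressionAlgorithm = keyof typeof CompressionAlgorithms;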

@b0b3rt (Author) commented Oct 26, 2021

> Thank you for working on this. It would be great to have this implemented.
>
> However, some things need to be changed:
>
> First, please do not put test code in src/ or proto/. test_service.proto already exists in test/fixtures/, and it shouldn't move from there. The corresponding generated code can go in test/generated/. This will require a new script line in package.json to generate it. channelz.proto and the corresponding generated code are used in src/channelz.ts, which is why they belong in proto/ and src/ respectively.

Fixed :) - let me know if the new command is what you were thinking of, and whether it also needs to be called in prepare and not just pretest.

> Second, now that the client can send using different compression methods, it should be handling the server's grpc-accept-encoding header. If the server does not accept the encoding the client said it was using, the client should disable compression on all messages it sends after receiving that header.

Implemented here. I'm not sure I see a way to do this only when the server returns UNIMPLEMENTED along with the grpc-accept-encoding header, but if there's a relatively clean way to avoid performing this check on every response received by the client let me know.

> Third, the purpose of the NoCompress write flag is to disable compression on a per-message basis, overriding what the channel does overall. So, it still needs to be taken into account. Write flags can currently be set, at least for streaming calls. The user passes it here, in the encoding parameter of call.write, where it becomes context.flags. Then context.flags becomes writeObj.flags here.

Fixed, although it seems to be passed down through _write, not write (which afaict calls the base method from Http2CallStream, where it expects an encoding of type "ascii" | "utf8" | "utf-8" | ..., rather than being handled by gRPC-specific logic which understands WriteFlags). Let me know if I've missed something here.

> Fourth, it would probably be helpful to export CompressionAlgorithms, either as currently written or as an enum, so that users can use it to more easily determine the value to pass to the new channel option.

Done, but also see my suggestion here.

@murgatroid99 (Member):

> > Thank you for working on this. It would be great to have this implemented.
> > However, some things need to be changed:
> > First, please do not put test code in src/ or proto/. test_service.proto already exists in test/fixtures/, and it shouldn't move from there. The corresponding generated code can go in test/generated/. This will require a new script line in package.json to generate it. channelz.proto and the corresponding generated code are used in src/channelz.ts, which is why they belong in proto/ and src/ respectively.
>
> Fixed :) - let me know if the new command is what you were thinking of, and whether it also needs to be called in prepare and not just pretest.

It looks fine. It doesn't need to be in prepare because we don't want the test code packed up or published.

> > Second, now that the client can send using different compression methods, it should be handling the server's grpc-accept-encoding header. If the server does not accept the encoding the client said it was using, the client should disable compression on all messages it sends after receiving that header.
>
> Implemented here. I'm not sure I see a way to do this only when the server returns UNIMPLEMENTED along with the grpc-accept-encoding header, but if there's a relatively clean way to avoid performing this check on every response received by the client let me know.

Probably the best way to track this information across requests would be for the CompressionFilterFactory to own an object shared among all of the CompressionFilter instances that tracks the latest-received grpc-accept-encoding header. I don't think handling errors is necessary here. The filter can just do its best to only send messages that the server will accept.
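
A sketch of that shared-state arrangement (interface and parameter names assumed for illustration):

interface SharedCompressionFilterConfig {
  // Most recently received grpc-accept-encoding header, if any.
  serverSupportedEncodingHeader?: string;
}

class CompressionFilterFactory {
  private sharedFilterConfig: SharedCompressionFilterConfig = {};

  constructor(private readonly options: ChannelOptions) {}

  createFilter(): CompressionFilter {
    // Every filter created by this factory reads and writes the same
    // config object, so the header is remembered across calls.
    return new CompressionFilter(this.options, this.sharedFilterConfig);
  }
}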

> > Third, the purpose of the NoCompress write flag is to disable compression on a per-message basis, overriding what the channel does overall. So, it still needs to be taken into account. Write flags can currently be set, at least for streaming calls. The user passes it here, in the encoding parameter of call.write, where it becomes context.flags. Then context.flags becomes writeObj.flags here.
>
> Fixed, although it seems to be passed down through _write, not write (which afaict calls the base method from Http2CallStream, where it expects an encoding of type "ascii" | "utf8" | "utf-8" | ..., rather than being handled by gRPC-specific logic which understands WriteFlags). Let me know if I've missed something here.

This is kind of a tangent, but these flags are getting passed in a couple of layers up from Http2CallStream, and _write is the internal override for write, so it will be called automatically when you call write. The types don't match, but if you forcibly pass in a number in the encoding argument of write, _write will see it and use it as the write flags.
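
Concretely, something like the following sketch (WriteFlags.NoCompress has the value 2 in grpc-js, mirroring GRPC_WRITE_NO_COMPRESS in gRPC core):

const NO_COMPRESS = 2; // WriteFlags.NoCompress

// write()'s second parameter is typed as a string encoding, so the
// number has to be forced through; _write() then reads it as flags.
call.write(message, NO_COMPRESS as unknown as BufferEncoding);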

@b0b3rt (Author) commented Oct 26, 2021

> This is kind of a tangent, but these flags are getting passed in a couple of layers up from Http2CallStream, and _write is the internal override for write, so it will be called automatically when you call write. The types don't match, but if you forcibly pass in a number in the encoding argument of write, _write will see it and use it as the write flags.

Oh, I see. Since that's the case, I changed the test to call write with that flag, instead of _write.

I think this addresses all the outstanding feedback - my only open questions are around @types/node (is pinning the version ok, potentially pending a follow-up PR to address the type incompatibility introduced in more recent versions?) and adding @types/semver (not clear why it was refusing to build on my machine without it - I can try removing it for the PR's workflow tests, if you want).

@murgatroid99 (Member):

I actually fixed the @types/node issue in #1953, so you can remove that change. The issue with @types/semver is the result of this repository's weird layout. You might as well just keep your change for that in.

@b0b3rt (Author) commented Oct 26, 2021

> I actually fixed the @types/node issue in #1953, so you can remove that change. The issue with @types/semver is the result of this repository's weird layout. You might as well just keep your change for that in.

Cool, reverted.

@b0b3rt (Author) commented Oct 28, 2021

@murgatroid99 I think I accidentally cancelled the test run by pushing a commit reverting the @types/node change. Are there any other changes needed before we can kick it off again?

@murgatroid99 (Member) left a comment

The overall structure and logic of this change look good now, but there are a few specific things that I think need to be changed.

@@ -167,10 +183,44 @@ function getCompressionHandler(compressionName: string): CompressionHandler {
 export class CompressionFilter extends BaseFilter implements Filter {
   private sendCompression: CompressionHandler = new IdentityHandler();
   private receiveCompression: CompressionHandler = new IdentityHandler();
+  private defaultCompressionAlgorithm: CompressionAlgorithm | undefined;
@murgatroid99 (Member):

At this point this field would more accurately be called currentCompressionAlgorithm, since it's always set to the currently used algorithm, not the default.

Also, please remove | undefined from the type, and use "identity" instead of undefined as the default.

@b0b3rt (Author):

Makes sense.

async sendMetadata(metadata: Promise<Metadata>): Promise<Metadata> {
const headers: Metadata = await metadata;
headers.set('grpc-accept-encoding', 'identity,deflate,gzip');
headers.set('accept-encoding', 'identity');

if (this.defaultCompressionAlgorithm && ['deflate', 'gzip'].includes(this.defaultCompressionAlgorithm)) {
@murgatroid99 (Member):

I don't like the explicit check for "deflate" and "gzip" here. It's effectively duplicating information in the CompressionAlgorithm type and CompressionAlgorithms object. It's perfectly valid to have no check at all here and to send "identity" when that is the value; particularly with the other change, that would allow this code to drop the check entirely. Alternatively, I would be OK with explicitly excluding "identity", because that is the actual special case.

@b0b3rt (Author):

Yeah, good call, there's no reason to have this check anymore. I think explicitly excluding identity does make sense, if only to save on the bandwidth.
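
i.e., roughly (a sketch, using the renamed field from the earlier comment):

if (this.currentCompressionAlgorithm !== 'identity') {
  headers.set('grpc-encoding', this.currentCompressionAlgorithm);
}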

@@ -194,7 +257,7 @@ export class CompressionFilter extends BaseFilter implements Filter {
const resolvedMessage: WriteObject = await message;
const compress =
@murgatroid99 (Member):

I think this assignment needs to be expanded out at this point. It's three conditionals and too many negatives, and I think the ordering is a little confusing. I think it would be better for the handler check to be the primary thing, and separate from the flags check. So, something like this:

if (this.sendCompression instanceof IdentityHandler) {
  compress = false;
} else {
  compress = ((resolvedMessage.flags ?? 0) & WriteFlags.NoCompress) === 0;
}

@b0b3rt (Author):

Yeah, it's a bit too much for a ternary expression. Split it out as suggested.

@@ -136,12 +137,13 @@ export class ServerReadableStreamImpl<RequestType, ResponseType>
 constructor(
   private call: Http2ServerCallStream<RequestType, ResponseType>,
   public metadata: Metadata,
-  public deserialize: Deserialize<RequestType>
+  public deserialize: Deserialize<RequestType>,
+  encoding?: string
@murgatroid99 (Member):

As in the client-side filter code, throughout this file, encoding should be just a string, with "identity" used as the value indicating that there is no compression.

@b0b3rt (Author):

Yep, went ahead and made that consistent everywhere.

});
}

default:
@murgatroid99 (Member):

There should be a separate "identity" case here that does not modify the message, and then the default case will be for unrecognized compression algorithms, and that should end the call with an UNIMPLEMENTED error.

@b0b3rt (Author):

👍

*
*/

export const CompressionAlgorithms = {
@murgatroid99 (Member):

What is the benefit of using const here instead of enum? I think an enum would be better for helping the user populate the channel option, because they could use CompressionAlgorithms.gzip, for example.

@b0b3rt (Author):

Generally I disprefer enums in TypeScript because they're one of the few language features that aren't purely type-level (they emit runtime code), and because they interact somewhat awkwardly with the rest of the type system; the language designers seem to be pushing people toward object literals (with as const) instead.

Those probably aren't huge concerns in this case; I don't see a way that making it an enum would frustrate a typical user of the library. For user convenience and consistency with the way other such options are handled, I'll change it to an enum.
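
The enum form, keeping the same 0/1/2 values the channel option expects:

export enum CompressionAlgorithms {
  identity = 0,
  deflate = 1,
  gzip = 2,
}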

const serverSupportedEncodings = serverSupportedEncodingsHeader.split(',');

if ((this.sendCompression instanceof DeflateHandler && !serverSupportedEncodings.includes('deflate'))
|| (this.sendCompression instanceof GzipHandler && !serverSupportedEncodings.includes('gzip'))) {
@murgatroid99 (Member):

I think this whole if statement can be reduced to if (!serverSupportedEncodings.includes(this.currentCompressionAlgorithm)), and it should also set this.currentCompressionAlgorithm.

@b0b3rt (Author):

Yep, simplified (and added).
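
The simplified check then reads roughly as:

if (!serverSupportedEncodings.includes(this.currentCompressionAlgorithm)) {
  // The server won't accept our current encoding; fall back to identity.
  this.currentCompressionAlgorithm = 'identity';
  this.sendCompression = new IdentityHandler();
}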

`localhost:${assignedPort}`,
grpc.credentials.createInsecure(),
{
'grpc.default_compression_algorithm': 1
@murgatroid99 (Member):

The test code should use the CompressionAlgorithms enum when setting this option, both for clarity and to demonstrate that the enum works as intended.

@b0b3rt (Author):

Good catch, thanks.
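
e.g. in the test setup (a sketch; the client class name is illustrative):

const client = new TestClient(
  `localhost:${assignedPort}`,
  grpc.credentials.createInsecure(),
  {
    'grpc.default_compression_algorithm': CompressionAlgorithms.deflate,
  }
);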

@b0b3rt (Author) commented Nov 5, 2021

Hm, looks like the tests are failing:

Error: ENOENT: no such file or directory, open 'test/fixtures/test_service.proto'
...
    at Object.loadSync (/tmpfs/src/github/grpc-node/packages/grpc-js/node_modules/@grpc/proto-loader/build/src/index.js:194:31)
    at Object.<anonymous> (/tmpfs/src/github/grpc-node/packages/grpc-js/test/test-server.ts:38:44)

which is
const loadedTestServiceProto = protoLoader.loadSync('test/fixtures/test_service.proto', {

Since this passes locally but not in the test runner, does the test runner have something other than grpc-node/packages/grpc-js as the root dir when running the test?

@murgatroid99 (Member):

It looks like the problem is that you didn't update the includeDirs list when you moved the proto file. The path that you load needs to be an absolute path or relative to one in that list. The fixtures directory gets copied over to the build directory, so you can just load ${__dirname}/fixtures/test_service.proto without any include path.

@b0b3rt (Author) commented Nov 6, 2021

Got it, changed the path accordingly.
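
The corrected load, per the suggestion (the loader options shown are typical @grpc/proto-loader options, not necessarily the exact ones used):

const loadedTestServiceProto = protoLoader.loadSync(
  `${__dirname}/fixtures/test_service.proto`,
  { keepCase: true, longs: String, enums: String, defaults: true, oneofs: true }
);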

Comment on lines 445 to 493
private getDecompressedMessage(message: Buffer, encoding: string) {
switch (encoding) {
case 'deflate': {
return new Promise<Buffer | undefined>((resolve, reject) => {
zlib.inflate(message.slice(5), (err, output) => {
if (err) {
this.sendError({
code: Status.INTERNAL,
details: `Received "grpc-encoding" header "${encoding}" but ${encoding} decompression failed`,
});
resolve();
} else {
const joined = Buffer.concat([message.slice(0, 5), output]);
resolve(joined);
}
});
});
}

case 'gzip': {
return new Promise<Buffer | undefined>((resolve, reject) => {
zlib.unzip(message.slice(5), (err, output) => {
if (err) {
this.sendError({
code: Status.INTERNAL,
details: `Received "grpc-encoding" header "${encoding}" but ${encoding} decompression failed`,
});
resolve();
} else {
const joined = Buffer.concat([message.slice(0, 5), output]);
resolve(joined);
}
});
});
}

case 'identity': {
return Promise.resolve(message);
}

default: {
this.sendError({
code: Status.UNIMPLEMENTED,
details: `Received message compressed with unsupported encoding "${encoding}"`,
});
return Promise.resolve();
}
}
}
@murgatroid99 (Member):

I didn't notice this before, but this function has to actually remove the 5-byte chunk at the beginning of the message. That chunk contains the compressed flag and the length of the encoded message, which are both invalid and useless after the message has been decompressed. Then deserializeMessage also has to be modified to not slice off the first 5 bytes.

@b0b3rt (Author):

@murgatroid99 It does remove the 5-byte chunk before decompressing it; do you mean to avoid re-attaching it before returning it?

              const joined = Buffer.concat([message.slice(0, 5), output]);
              resolve(joined);

Which would obviate the need to remove it again later:

  deserializeMessage(bytes: Buffer) {
    // TODO(cjihrig): Call compression aware deserializeMessage().
    const receivedMessage = bytes.slice(5);

    return this.handler.deserialize(receivedMessage);
  }

If that's what you meant, I'll make that change.

On a separate note, I'm not able to reproduce the test failures in Linux Tests. When I run the tests locally (on Ubuntu 20.04.1 via WSL2), the following two tests fail:

  1) Name Resolver
       DNS Names
         Should resolve localhost properly:
     Error: Timeout of 4000ms exceeded. For async tests and hooks, ensure "done()" is called; if returning a Promise, ensure it resolves.

  2) Name Resolver
       DNS Names
         Should default to port 443:
     Error: Timeout of 4000ms exceeded. For async tests and hooks, ensure "done()" is called; if returning a Promise, ensure it resolves.

However, those tests were failing from the moment I pulled down the repo; I assumed it was a quirk of the local environment (or some minor dependency mismatch somewhere).

@murgatroid99 (Member):

Yes, I meant to remove the 5 byte header from the output of that function, and the other code you referenced that removes it again.

Don't worry about those test failures. I see the same failures on WSL; it seems to be a quirk of how Windows or WSL resolves localhost.

@b0b3rt (Author):

Sounds good, pushed a commit that simplifies the removal of the "prefix".
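
In sketch form, each decompression branch now resolves with the payload alone, and deserializeMessage stops slicing:

zlib.unzip(message.slice(5), (err, output) => {
  if (err) {
    this.sendError({
      code: Status.INTERNAL,
      details: `Received "grpc-encoding" header "${encoding}" but ${encoding} decompression failed`,
    });
    resolve();
  } else {
    resolve(output); // the 5-byte prefix is no longer re-attached
  }
});

deserializeMessage(bytes: Buffer) {
  // The length prefix was already stripped during decompression.
  return this.handler.deserialize(bytes);
}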

@murgatroid99 (Member) left a comment

Thank you for your contribution, and for working with me to improve this PR.

@murgatroid99 murgatroid99 merged commit fe58061 into grpc:master Nov 15, 2021
@desytech:

@murgatroid99 is there any approximate release date for the compression support? Thanks

@murgatroid99 (Member):

I will probably be able to publish this in early January after the holidays.

@murgatroid99 (Member):

This has been published in version 1.5.0.
