Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allocation optimization #33

Merged
merged 4 commits into from
Jul 13, 2021
Merged

Conversation

jkanywhere
Copy link
Contributor

@jkanywhere jkanywhere commented Jul 6, 2021

Add benchmark coverage of ECDSA signing methods.

Reduce memory allocations for ECDSA signing by using big.Int.FillBytes instead of allocating and copying bytes directly.
FillBytes was added in go 1.15.

FillBytes sets buf to the absolute value of x, storing it as a zero-extended big-endian byte slice, and returns buf.

Use base64.RawURLEncoding to avoid having to add and remove = padding.

RawURLEncoding ... is the same as URLEncoding but omits padding characters.

Benchmark results

$ go test --bench=Bench --run=NONE .
goos: darwin
goarch: amd64
pkg: github.com/golang-jwt/jwt

Before optimizations, as of ff7295f:

BenchmarkECDSASigning/Basic_ES256-8                     178389      7499 ns/op    4249 B/op      65 allocs/op
BenchmarkECDSASigning/Basic_ES256/sign-only-8            47665     26629 ns/op    3329 B/op      43 allocs/op
BenchmarkECDSASigning/basic_ES256_invalid:_foo_=>_bar-8 135355      7830 ns/op    4249 B/op      65 allocs/op
BenchmarkHS256Signing-8                                1000000      1369 ns/op    1584 B/op      32 allocs/op
BenchmarkHS384Signing-8                                 847502      1628 ns/op    1969 B/op      32 allocs/op
BenchmarkHS512Signing-8                                 772740      1599 ns/op    2065 B/op      32 allocs/op
BenchmarkRS256Signing-8                                   2834    414766 ns/op   32575 B/op     136 allocs/op
BenchmarkRS384Signing-8                                   2918    388103 ns/op   32688 B/op     136 allocs/op
BenchmarkRS512Signing-8                                   3238    371378 ns/op   32707 B/op     136 allocs/op

After optimizations:

BenchmarkECDSASigning/Basic_ES256-8                     153951      8351 ns/op    4021 B/op      58 allocs/op
BenchmarkECDSASigning/Basic_ES256/sign-only-8            48044     25095 ns/op    3169 B/op      38 allocs/op
BenchmarkECDSASigning/basic_ES256_invalid:_foo_=>_bar-8 148740      9228 ns/op    4021 B/op      58 allocs/op
BenchmarkHS256Signing-8                                1000000      1167 ns/op    1488 B/op      29 allocs/op
BenchmarkHS384Signing-8                                 959800      1424 ns/op    1873 B/op      29 allocs/op
BenchmarkHS512Signing-8                                 902449      1603 ns/op    1969 B/op      29 allocs/op
BenchmarkRS256Signing-8                                   2731    443249 ns/op   32476 B/op     133 allocs/op
BenchmarkRS384Signing-8                                   2472    420263 ns/op   32587 B/op     133 allocs/op
BenchmarkRS512Signing-8                                   2840    409993 ns/op   32614 B/op     133 allocs/op

Output of BenchmarkECDSASigning with ES384 and ES512 are omitted because the
allocations change for each execution.

Add assertions to ensure ECDSA signing methods return valid signatures.

This is probably covered elsewhere as well, but putting it in
ecdsa_test.go makes it more obvious and easier to find.
@jkanywhere
Copy link
Contributor Author

Port of dgrijalva/jwt-go#455.

Copy link
Contributor

@amnonbc amnonbc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice optimisation.

Could you possibly add benchmark output to the PR comment, to demonstrate
the savings made by this change?

Copy link
Collaborator

@oxisto oxisto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for porting this! Overall it looks good, the padding line needs another look I think.

ecdsa.go Outdated Show resolved Hide resolved
amnonbc
amnonbc previously approved these changes Jul 6, 2021
@jkanywhere
Copy link
Contributor Author

This is a nice optimisation.

Thanks.

Could you possibly add benchmark output to the PR comment, to demonstrate
the savings made by this change?

Done.

Add benchmark coverage of ECDSA signing methods.

Benchmarks are run using the existing helper for comparability with
existing benchmarks.

Sign method is also tested directly, to avoid the overhead of *Token.

Report allocations for all benchmarks.

Allocation count for ES384 and ES512 fluctuate across test runs,
other signing methods consistently report the same number of allocations.

Sample output:
```
$ go test -bench=Bench -run=NONE .
2021/02/26 18:18:30 Listening...
goos: darwin
goarch: amd64
pkg: github.com/dgrijalva/jwt-go
BenchmarkECDSASigning/Basic_ES256-8                     190572      6702 ns/op    4249 B/op      65 allocs/op
BenchmarkECDSASigning/Basic_ES256/sign-only-8            47383     24650 ns/op    3329 B/op      43 allocs/op
BenchmarkECDSASigning/Basic_ES384-8                       1113   1252975 ns/op 1750744 B/op   14474 allocs/op
BenchmarkECDSASigning/Basic_ES384/sign-only-8              286   3937773 ns/op 1746175 B/op   14423 allocs/op
BenchmarkECDSASigning/Basic_ES512-8                        662   1949937 ns/op 3028386 B/op   19608 allocs/op
BenchmarkECDSASigning/Basic_ES512/sign-only-8              170   6856189 ns/op 3025471 B/op   19571 allocs/op
BenchmarkECDSASigning/basic_ES256_invalid:_foo_=>_bar-8 190638      6665 ns/op    4249 B/op      65 allocs/op
BenchmarkHS256Signing-8                                1000000      1024 ns/op    1584 B/op      32 allocs/op
BenchmarkHS384Signing-8                                 917286      1447 ns/op    1969 B/op      32 allocs/op
BenchmarkHS512Signing-8                                 827744      1470 ns/op    2065 B/op      32 allocs/op
BenchmarkRS256Signing-8                                   3037    390077 ns/op   32576 B/op     136 allocs/op
BenchmarkRS384Signing-8                                   2976    379155 ns/op   32684 B/op     136 allocs/op
BenchmarkRS512Signing-8                                   3205    388628 ns/op   32704 B/op     136 allocs/op
```
Reduce the number of byte arrays allocated by using big.Int.FillBytes
when calculating ECDSA signature.

After this change, Benchmarks of ES256 signing method consistently
report 4 fewer allocations.

Before:
```
BenchmarkECDSASigning/Basic_ES256-8              190572         6702 ns/op       4249 B/op         65 allocs/op
BenchmarkECDSASigning/Basic_ES256/sign-only-8     47383        24650 ns/op       3329 B/op         43 allocs/op
```

After:
```
BenchmarkECDSASigning/Basic_ES256-8              187682         6725 ns/op       4121 B/op         61 allocs/op
BenchmarkECDSASigning/Basic_ES256/sign-only-8     48656        24446 ns/op       3201 B/op         39 allocs/op
```
JWT uses a non-padded base64 encoding.

Current code uses base64.URLEncoding to generate a padded string and
then removes the padding.
Likewise, current code adds padding before decoding.

Instead, use base64.RawURLEncoding which does not add or require the
padding in the first place.

In addition to making the code cleaner, this reduces memory allocations
as reported by benchmarks.

Before:
```
BenchmarkECDSASigning/Basic_ES256-8                     191396         6917 ns/op       4121 B/op         61 allocs/op
BenchmarkECDSASigning/Basic_ES256/sign-only-8            49347        25039 ns/op       3201 B/op         39 allocs/op
BenchmarkECDSASigning/basic_ES256_invalid:_foo_=>_bar-8 190668         6586 ns/op       4121 B/op         61 allocs/op
BenchmarkHS256Signing-8                                1260060         1131 ns/op       1585 B/op         32 allocs/op
BenchmarkHS384Signing-8                                 861378         1387 ns/op       1969 B/op         32 allocs/op
BenchmarkHS512Signing-8                                 896745         1463 ns/op       2065 B/op         32 allocs/op
BenchmarkRS256Signing-8                                   3086       355769 ns/op      32576 B/op        136 allocs/op
BenchmarkRS384Signing-8                                   3414       353570 ns/op      32694 B/op        136 allocs/op
BenchmarkRS512Signing-8                                   3235       349394 ns/op      32706 B/op        136 allocs/op
```

After:
```
BenchmarkECDSASigning/Basic_ES256-8                     176617         6827 ns/op       4021 B/op         58 allocs/op
BenchmarkECDSASigning/Basic_ES256/sign-only-8            48038        24213 ns/op       3169 B/op         38 allocs/op
BenchmarkECDSASigning/basic_ES256_invalid:_foo_=>_bar-8 194352         6928 ns/op       4021 B/op         58 allocs/op
BenchmarkHS256Signing-8                                1000000         1127 ns/op       1488 B/op         29 allocs/op
BenchmarkHS384Signing-8                                 972552         1369 ns/op       1873 B/op         29 allocs/op
BenchmarkHS512Signing-8                                 780751         1368 ns/op       1969 B/op         29 allocs/op
BenchmarkRS256Signing-8                                   3014       387326 ns/op      32475 B/op        133 allocs/op
BenchmarkRS384Signing-8                                   3044       361411 ns/op      32591 B/op        133 allocs/op
BenchmarkRS512Signing-8                                   3273       355504 ns/op      32607 B/op        133 allocs/op
```

Benchmarks of signing methods ES384 and ES512 are omitted because their
allocations are not consistent.
@amnonbc
Copy link
Contributor

amnonbc commented Jul 10, 2021

Could you possibly add benchmark output to the PR comment, to demonstrate
the savings made by this change?

Done.

Cool...
Less allocations everywhere...
Some benchmarks actually get a bit slower....
Is this measurement noise?
I'll look at this when I get time.
But the change is good. I say let's land this!

Copy link
Collaborator

@oxisto oxisto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants