Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Supranational BLS implementation #2453

Merged
merged 101 commits into from Jul 31, 2020
Merged

Conversation

Nashatyrev
Copy link
Contributor

@Nashatyrev Nashatyrev commented Jul 27, 2020

PR Description

Add Supranational BLS implementation based on highly optimized native BLS-12381 library

For now I forked the Blst repo and made some tweaks for the Java SWIG binding works correctly on all platforms. Will be working with Blst team to resolve them and push upstream.

The Java SWIG wrapper generated classes and manually built native libraries (Windows, Linux, MacOS) are now in the separate repository: https://github.com/Nashatyrev/jblst. Gradle bintrayUpload task may be used to build and publish the tech.pegasys:jblst artifact

By default the blst BLS implementation is loaded. If for some reason it fails to load native lib BLS falls back to Mikuli implementation (with a WARN log)

The PR was tested on: Windows, Linux and MacOS

Future tasks to do:

TODOs

  • Make common BLS tests cover both implementations
  • Remove pubkey/signature bytes store/lazy calc as it it now handled in common classes
  • Move artifact to official Pegasys Bintray org

Import Benchmarks

Before:

Benchmark                              (validatorsCount)  Mode  Cnt  Score   Error  Units
TransitionBenchmark.Block.importBlock              32768    ss  250  0,076 ± 0,002   s/op
TransitionBenchmark.Epoch.importBlock              32768    ss  100  0,458 ± 0,009   s/op

After:

Benchmark                              (validatorsCount)  Mode  Cnt  Score   Error  Units
TransitionBenchmark.Block.importBlock              32768    ss  250  0,021 ± 0,003   s/op
TransitionBenchmark.Epoch.importBlock              32768    ss  100  0,415 ± 0,012   s/op

BLS Benchmarks

Blst is x7-x9 faster than Mikuli:

Test SigCnt Mikuli Blst Speedup
verifySignatureBatchedNonParallelDoublePairing 1 36,177 318,327 8.8
verifySignatureBatchedNonParallelDoublePairing 2 26,333 206,738 7.9
verifySignatureBatchedNonParallelDoublePairing 4 16,908 117,974 7.0
verifySignatureBatchedNonParallelDoublePairing 8 9,324 66,379 7.1
verifySignatureBatchedNonParallelDoublePairing 16 5,379 35,010 6.5
verifySignatureBatchedNonParallelDoublePairing 32 2,768 18,101 6.5
verifySignatureBatchedNonParallelDoublePairing 64 1,442 8,892 6.2
verifySignatureBatchedNonParallelDoublePairing 128 726 4,654 6.4
verifySignatureBatchedNonParallelSinglePairing 1 36,713 324,845 8.8
verifySignatureBatchedNonParallelSinglePairing 2 25,561 203,667 8.0
verifySignatureBatchedNonParallelSinglePairing 4 16,004 123,699 7.7
verifySignatureBatchedNonParallelSinglePairing 8 8,663 67,762 7.8
verifySignatureBatchedNonParallelSinglePairing 16 4,644 35,188 7.6
verifySignatureBatchedNonParallelSinglePairing 32 2,607 18,015 6.9
verifySignatureBatchedNonParallelSinglePairing 64 1,337 8,908 6.7
verifySignatureBatchedNonParallelSinglePairing 128 677 4,607 6.8
verifySignatureBatchedParallelDoublePairing 1 36,081 311,249 8.6
verifySignatureBatchedParallelDoublePairing 2 26,631 207,796 7.8
verifySignatureBatchedParallelDoublePairing 4 23,819 199,297 8.4
verifySignatureBatchedParallelDoublePairing 8 20,292 151,721 7.5
verifySignatureBatchedParallelDoublePairing 16 14,747 106,536 7.2
verifySignatureBatchedParallelDoublePairing 32 8,419 60,879 7.2
verifySignatureBatchedParallelDoublePairing 64 4,466 33,660 7.5
verifySignatureBatchedParallelDoublePairing 128 2,477 16,651 6.7
verifySignatureBatchedParallelSinglePairing 1 35,797 320,506 9.0
verifySignatureBatchedParallelSinglePairing 2 31,836 300,686 9.4
verifySignatureBatchedParallelSinglePairing 4 27,452 243,981 8.9
verifySignatureBatchedParallelSinglePairing 8 22,340 187,294 8.4
verifySignatureBatchedParallelSinglePairing 16 14,204 106,880 7.5
verifySignatureBatchedParallelSinglePairing 32 8,375 64,067 7.6
verifySignatureBatchedParallelSinglePairing 64 4,551 32,898 7.2
verifySignatureBatchedParallelSinglePairing 128 2,342 16,482 7.0
verifySignatureSimple 1 51,861 357,217 6.9
verifySignatureSimple 2 25,967 174,044 6.7
verifySignatureSimple 4 12,868 87,326 6.8
verifySignatureSimple 8 6,423 44,326 6.9
verifySignatureSimple 16 3,259 22,209 6.8
verifySignatureSimple 32 1,649 10,938 6.6
verifySignatureSimple 64 816 5,581 6.8
verifySignatureSimple 128 400 2,775 6.9

Fixes

Fix #2300

Documentation

  • I thought about documentation and added the documentation label to this PR if updates are required.

…aces

# Conflicts:
#	bls/src/main/java/tech/pegasys/teku/bls/BLS.java
#	bls/src/main/java/tech/pegasys/teku/bls/BLSPublicKey.java
#	bls/src/test/java/tech/pegasys/teku/bls/BLSPublicKeyTest.java
#	data/provider/src/main/java/tech/pegasys/teku/api/schema/BLSPubKey.java
# Conflicts:
#	bls/build.gradle
#	bls/src/test/java/tech/pegasys/teku/bls/impl/mikuli/hash2g2/ReferenceTests.java
Copy link
Contributor

@ajsutton ajsutton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be keen for @benjaminion to review the BLS specifics as I'm not too familiar with that. Generally looks good otherwise.

The only concern I have is managing the native memory resources to ensure we aren't leaking them. I wonder if we should create a factory class that is the only place we create instances of the SWIG generated classes and automatically adds them to a Cleaner to ensure they are always cleaned up. Unfortunately you just can't depend on the finalized method being called even when the system is heavily under memory pressure.

bls/src/main/java/tech/pegasys/teku/bls/BLS.java Outdated Show resolved Hide resolved
Comment on lines 66 to 67
p2Signature.delete();
hash.delete();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's at all possible to have the SWIG bindings make these classes Autocloseable that would be awesome. In any case we should ensure they are deleted regardless of any potential exceptions or errors that occur so would have to use try/finally blocks. It will get really ugly without Autocloseable.

We're also winding up with a native memory leak because the wrapped BlstSignature returned has a reference to p2SignatureAffine but no way to close it and delete that native object. The finalize method isn't guaranteed to be called in any kind of timely manner so we can't really depend on it.

If we have to depend on automatic cleanup (which may well be the case for things like public keys and signatures that are used outside of the Blst specific code and currently expect to be fully on-heap), we should use java.lang.ref.Cleaner to do it which is the recommended instead of finalize.

Maybe use AutoCloseable if possible for the local variables we need to release and then use Cleaner for the things that are actually returned from this class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't find Autocloseable feature in SWIG unfortunately. Indeed that would look better.
Revised places with temporary native wrapper instances and add try/catch/finally where appropriate: 913e484
I'll add a general comment later on my thoughts regarding finalize and cleaning native structs

@@ -169,6 +169,7 @@ downloadLicenses {
//JMH-Core is licensed under GPLv2 with the Classpath Exception, which allows us to link it and license the derived work under our license.
'org.openjdk.jmh:jmh-core:1.21': apache,
(group('io.libp2p')): apache,
(group('tech.pegasys')): apache,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably log a ticket to get the license information included in the blst native jar. I can't remember the magic incantation and its not a priority but something we should do when we setup a full CI for it and then we shouldn't need this.

@Nashatyrev
Copy link
Contributor Author

Nashatyrev commented Jul 30, 2020

@ajsutton I don't think that finalize in this concrete case is absolute evil.

Assuming that underlying native structs are of comparable size as their java wrapper objects they are most likely will be GCed at some point. If they are not GCed then it means we are fine with free RAM at the moment and the underlying allocated structures would make huge difference. So this not our case when large native memory chunks are referenced by small Java wrappers and GC is not aware of the real object allocations and don't hurry up collecting them. (as I remember we had the latter issue in Java2D library with direct byte buffers)

Here we have pretty simple logic behind finalize. It doesn't involve any interaction with other components as well it doesn't do any blocking operations as well it doesn't throw exceptions. So hitting any complex issue with blocking finalizer thread or leaving hanging object is highly unlikely.

As far as I understand the phrase 'finalize() is not guaranteed to be invoked' relates to JVM shutdown only. So e.g. correctly closing files is not recommended there. As I know there are also OSes which doesn't release all resources automatically when a process terminates, so releasing OS resources in finalize() is not recommended as well in such environment.

I see the following options at the moment:

  • finalize()
  • PhantomReference queue: I'm not sure if this would be somehow different from finalize approach in our simple case.
  • Manually release resources
  • Store raw native data in say byte[] and pass it to the native function. Andy from Blst was suggesting something like this. This is an interesting option though this would require additional efforts from both sides

BTW found this issue with similar discussion: google/or-tools#1040 So this Google lib also uses SWIG and finalize() to release native resources.

The above is just my vision and I'm not sure of course that these are all correct statements.

Copy link
Contributor

@benjaminion benjaminion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I flagged a few trivia, but on the basis of mine and Adrian's reviews I am happy to approve this. It's really good to have this done!

I built locally, and all was well. Running locally, I was happy to see

2020-07-30 18:09:48.850+01:00 | beaconchain-async-0 | INFO  | BlstBLS12381 | Successfully loaded native BLS library

and my sync speed was about 3x better than with the JVM BLS lib. 🙌

(Caveat: I'm leaving it to @Nashatyrev and @ajsutton to decide whether the finalize() things is settled. It looks like it has been thought-through in any case.)

Copy link
Contributor

@ajsutton ajsutton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still nervous about finalisers but agree that the implementations aren't going to cause deadlocks or other things - just likely that the native memory won't be released as fast as it probably should. As you say they're quite small amounts of memory so shouldn't be an issue.

So I'm happy to merge this.

@Nashatyrev Nashatyrev merged commit b3d2e07 into Consensys:master Jul 31, 2020
@Nashatyrev Nashatyrev deleted the feature-blst branch September 8, 2020 11:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Infinity pubkey and signature Integrate third-party BLS library
3 participants