Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added jenkins.bouncycastle.prioritizeJceProvider system property #29

Merged
merged 3 commits into from Mar 5, 2021

Conversation

thomasgl-orange
Copy link
Contributor

@thomasgl-orange thomasgl-orange commented Jun 26, 2020

Note: this is an experiment, hence the draft PR. The big TODO item is more testing in real-world conditions. So far, I only have one Jenkins server with this change enabled (where it was required to solve some SSH connection issues with a SunOS server), so we'll see how it goes there, and then at some point I will canary-test with more instances. Then, if no regression is detected, I will make it a real PR. For now, I'm only sharing in order to, maybe, get early feedback (if anyone has an opinion on the opportunity of such a change).

Update on 2020/08/27: the experiment phase I was planning is over. I have actually deployed a patched bouncycastle-api plugin, and enabled -Djenkins.bouncycastle.prioritizeJceProvider=true, on a few hundreds of Jenkins installations, and got no negative feedback. Of course, this change is absolutely useless in most cases, but at least it was not harmful, which is what I was trying to verify. So I think this PR is safe and can now be considered for merging (keep in mind that it doesn't change anything anyway, as long as one doesn't set jenkins.bouncycastle.prioritizeJceProvider=true).


Added a new (optional) boolean system property (jenkins.bouncycastle.prioritizeJceProvider) which affects how BouncyCastle is registered as a JCE provider. When true, the provider is inserted at the beginning of the list (in 2nd position), instead of being added at the end.

It could help in situations where BC does a better job than the JDK. In particular, it should solve issues with SSH connections to some SunOS servers, which end up with this exception (with JSCH from the publish-over-ssh plugin):

java.security.InvalidAlgorithmParameterException: DH key size must be multiple of 64, and can only range from 512 to 4096 (inclusive). The specific key size 2047 is not supported

See https://stackoverflow.com/a/43121515, where this idea comes from. In this case, simply having BouncyCastle registered, but as the last provider, was not good enough to solve the issue.

Probably it could also have been a solution for JENKINS-35231.

The system property added here is expected to be set on Jenkins controller process. Registration on agents will follow the same preference, and not a property of the agent process.

Added a new (optional) boolean system property which affects how BouncyCastle
is registered as a JCE provider.  When `true`, the provider is inserted at the
beginning of the list (in 2nd position), instead of being added at the end.

It could help in situations where BC does a better job than the JDK. In
particular, it should solve issues with SSH connections to some SunOS servers,
which end up with this exception (with JSCH from the publish-over-ssh plugin):

```
java.security.InvalidAlgorithmParameterException: DH key size must be multiple
of 64, and can only range from 512 to 4096 (inclusive). The specific key size
2047 is not supported
```

See https://stackoverflow.com/a/43121515, where this idea comes from.
In this case, simply having BouncyCastle registered but as the last provider is
not good enough to solve the issue.

Probably it could have been a solution for
[JENKINS-35231](https://issues.jenkins-ci.org/browse/JENKINS-35231) too.

The system property added here is expected to be set on Jenkins main process.
Registration on agents will follow the same preference, and not a property of
the agent process.
@thomasgl-orange thomasgl-orange marked this pull request as ready for review August 27, 2020 13:35
@thomasgl-orange
Copy link
Contributor Author

Note: For the (unrelated) tests failure on Windows, see #30.

@thomasgl-orange
Copy link
Contributor Author

FTR, I've finally found a significant regression with jenkins.bouncycastle.prioritizeJceProvider=true. It currently breaks TCP inbound agents connections:

2020-11-04 09:29:53.135+0000 [id=83688]	INFO	h.TcpSlaveAgentListener$ConnectionHandler#run: Connection #629 failed: java.io.EOFException
2020-11-04 09:29:53.220+0000 [id=83689]	INFO	h.TcpSlaveAgentListener$ConnectionHandler#run: Accepted JNLP4-connect connection #630 from /10.x.x.x:63374
2020-11-04 09:29:53.254+0000 [id=83647]	WARNING	o.j.r.p.i.SSLEngineFilterLayer#onRecv: [JNLP4-connect connection from 10.x.x.x/10.x.x.x:63374] 
java.lang.NullPointerException
	at org.bouncycastle.crypto.signers.PSSSigner.generateSignature(Unknown Source)
	at org.bouncycastle.jcajce.provider.asymmetric.rsa.PSSSignatureSpi.engineSign(Unknown Source)
	at java.security.Signature$Delegate.engineSign(Signature.java:1382)
	at java.security.Signature.sign(Signature.java:698)
	at sun.security.ssl.ECDHServerKeyExchange$ECDHServerKeyExchangeMessage.<init>(ECDHServerKeyExchange.java:181)
	at sun.security.ssl.ECDHServerKeyExchange$ECDHServerKeyExchangeProducer.produce(ECDHServerKeyExchange.java:499)
	at sun.security.ssl.ClientHello$T12ClientHelloConsumer.consume(ClientHello.java:1020)
	at sun.security.ssl.ClientHello$ClientHelloConsumer.onClientHello(ClientHello.java:727)
	at sun.security.ssl.ClientHello$ClientHelloConsumer.consume(ClientHello.java:693)
	at sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:377)
	at sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:444)
	at sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:968)
	at sun.security.ssl.SSLEngineImpl$DelegatedTask$DelegatedAction.run(SSLEngineImpl.java:955)
	at java.security.AccessController.doPrivileged(Native Method)
	at sun.security.ssl.SSLEngineImpl$DelegatedTask.run(SSLEngineImpl.java:902)
	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.processRead(SSLEngineFilterLayer.java:382)
	at org.jenkinsci.remoting.protocol.impl.SSLEngineFilterLayer.onRecv(SSLEngineFilterLayer.java:117)
	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecv(ProtocolStack.java:668)
	at org.jenkinsci.remoting.protocol.impl.AckFilterLayer.onRecv(AckFilterLayer.java:258)
	at org.jenkinsci.remoting.protocol.ProtocolStack$Ptr.onRecv(ProtocolStack.java:668)
	at org.jenkinsci.remoting.protocol.NetworkLayer.onRead(NetworkLayer.java:136)
	at org.jenkinsci.remoting.protocol.impl.NIONetworkLayer.ready(NIONetworkLayer.java:160)
	at org.jenkinsci.remoting.protocol.IOHub$OnReady.run(IOHub.java:795)
	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
	at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:59)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

It's something which seems to be fixed already in BC 1.65 (current bouncycastle-api-plugin version embeds 1.64): bcgit/bc-java#633

AFAIK, it happened only after upgrading OpenJDK from 8u265 to 8u272 (which is partly why I'm only seeing that now; the other part of the explanation being that we only have a handful of TCP Inbound agents on the Jenkins servers we manage). The BC issue mentions OpenJDK 11, but I guess the "offending" OpenJDK changes has been backported in 8u272 (see realease notes).

Anyway, this issue does not invalidate this PR I think. It mainly shows that:

  • it's a good thing that the proposed change was optional and disabled by default
  • it could be time to upgrade the Bouncycastle version in this plugin

Copy link
Contributor

@jeffret-b jeffret-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. I've been bitten by the SunOS issue before on a previous project, so it's nice to see something that manage it. The implementation looks good. I'm glad to see the tests.

I consider this an experimental feature for now. That's pretty well indicated by the system property to enable it. Retaining the original behavior as default is good.

/**
* Utility class for registration of the BouncyCastle security provider.
*/
@Restricted(NoExternalUse.class)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the class isn't public this restriction shouldn't be necessary, but it doesn't hurt to be extra explicit about it.

@jeffret-b
Copy link
Contributor

Thanks for the submission! I didn't see any notification earlier so didn't realize this was here.

@jeffret-b
Copy link
Contributor

Let's try a rebuild with #30 merged in.

@jeffret-b jeffret-b closed this Mar 5, 2021
@jeffret-b jeffret-b reopened this Mar 5, 2021
@jeffret-b
Copy link
Contributor

Try again to see if it's a flaky infra issue. I'm not seeing the failure locally.

@jeffret-b jeffret-b closed this Mar 5, 2021
@jeffret-b jeffret-b reopened this Mar 5, 2021
@jeffret-b
Copy link
Contributor

Try building on top of other plugin update PR.

@jeffret-b jeffret-b closed this Mar 5, 2021
@jeffret-b jeffret-b reopened this Mar 5, 2021
@jeffret-b
Copy link
Contributor

Finally

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants