Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Premature end of chunk coded message body after upgrading to 3.1.3 #3409

Closed
rmaclean-ee opened this issue Aug 12, 2021 · 40 comments
Closed

Premature end of chunk coded message body after upgrading to 3.1.3 #3409

rmaclean-ee opened this issue Aug 12, 2021 · 40 comments

Comments

@rmaclean-ee
Copy link

rmaclean-ee commented Aug 12, 2021

Environment:

  • Jib version: 3.1.3
  • Build tool: Gradle 7.1 (using jib plugin 3.1.3)
  • OS: Linux via Gitlab pipelines

Description of the issue:
Upgraded from 3.1.2 to 3.1.3 and now getting an error when pulling an image from GCR

I/O error for image [eu.gcr.io/***]:
    org.apache.http.ConnectionClosedException
    Premature end of chunk coded message body: closing chunk expected
I/O error for image [eu.gcr.io/***]:
    org.apache.http.ConnectionClosedException
    Premature end of chunk coded message body: closing chunk expected

This causes it to completely fail

Expected behavior:

It would still work the same from 3.1.2

Steps to reproduce:

  1. Upgrade to 3.1.3 and it breaks

jib-gradle-plugin Configuration:

jib {
    from {
        image = "eu.gcr.io/***"
    }
    to {
        image = System.getenv("CONTAINER_IMAGE") ?: "***"
        tags = ["latest", System.getenv("CI_PIPELINE_ID") ?: "dev", System.getenv("CI_COMMIT_SHA") ?: "local"]
    }
    container {
        mainClass = "com.***.Application"
        args = ["server", System.getenv("DROPWIZARD_CONFIG") ?: "development"]
        jvmFlags = [
            "-XX:+UseG1GC",
            "-XX:+UseStringDeduplication",
            "-Dcontrast.dir=/tmp/contrast",
            "-javaagent:contrast.jar",
            "-Dcontrast.assess.sampling.enable=true",
            "-Dcontrast.assess.sampling.baseline=1",
            "-Dcontrast.assess.sampling.request_frequency=1200",
            "-Dcontrast.assess.sampling.window_ms=30000",
        ]
        ports = ["8080", "8081"]
    }
}

Log output:

Containerizing application to eu.gcr.io/***, eu.gcr.io/***, eu.gcr.io/***, eu.gcr.io/***...
Base image 'eu.gcr.io/***' does not use a specific image digest - build may not be reproducible
Getting manifest for base image eu.gcr.io/***...
Building dependencies layer...
Building project dependencies layer...
Building classes layer...
Building jvm arg files layer...
Using Google Application Default Credentials for eu.gcr.io/***
The base image requires auth. Trying again for eu.gcr.io/***...
Using Google Application Default Credentials for eu.gcr.io/***
I/O error for image [eu.gcr.io/***]:
    org.apache.http.ConnectionClosedException
    Premature end of chunk coded message body: closing chunk expected
I/O error for image [eu.gcr.io/***]:
    org.apache.http.ConnectionClosedException
    Premature end of chunk coded message body: closing chunk expected
Using base image with digest: ***
Container entrypoint set to [java, -Xms4g, -Xmx8g, -XX:+UseG1GC, -cp, @/app/jib-classpath-file, ***]
I/O error for image [eu.gcr.io/***]:
    org.apache.http.ConnectionClosedException
    Premature end of chunk coded message body: closing chunk expected
FAILURE: Build failed with an exception.
* What went wrong:
Execution failed for task ':maintenance:jib'.
> com.google.cloud.tools.jib.plugins.common.BuildStepsExecutionException: Premature end of chunk coded message body: closing chunk expected
@chanseokoh
Copy link
Member

Maybe this is because we upgraded Google HTTP Client library. #3387 😫

@chanseokoh
Copy link
Member

chanseokoh commented Aug 12, 2021

@rmaclean-ee @abithgs @creekpld I can't reproduce it. Can anyone give us detailed HTTP logs by following these instructions also with --debug --stacktrace?

And is this consistently reproducible, remotely and locally?

@rmaclean-ee
Copy link
Author

is this consistently reproducible
yes

remotely and locally
can you expand what you mean by these?

@chanseokoh
Copy link
Member

You said this is on a GitLab pipeline, so I was referring to that as remote. "Locally" means running Jib on your machine to connect to GCR.

@ChristianCiach
Copy link
Contributor

@chanseokoh I can provide you with the logs, but these are huge and contain tons of sensitive information. Even if I try to censor the most sensitive data, I am afraid it may still contain too much to share publicly.

@chanseokoh
Copy link
Member

chanseokoh commented Aug 12, 2021

@ChristianCiach it should be enough to give us the portion around the failure point. (If needed, I can ask more.) Then it'd be easy to scrub sensitive information?

Oh, but the stacktrace you get with --stacktrace is important.

@ChristianCiach
Copy link
Contributor

ChristianCiach commented Aug 12, 2021

jib.log

I did my best :)

We are using a local docker-hub mirror. The hostname of the mirror has been scrubbed to dockerhub-mirror.local in the logs.

Jib has been configured to use this mirror by using the file ~/.config/google-cloud-tools-java/jib/config.json with the content:

{
  "registryMirrors": [
    {
      "registry": "registry-1.docker.io",
      "mirrors": ["dockerhub-mirror.local:5000", "mirror.gcr.io"]
    }
  ]
}

The issue only happens with the current Jib versions, locally (on the workstations) and on privately hosted Jenkins/GitLab.

Edit: We use a Harbor registry as a docker-hub mirror. Maybe the issue lies in Harbor, but earlier versions of Google HTTPClient were more tolerant?

Edit: On closer inspection, the error seems to happen when connecting to the target registry, which is the registry Jib is supposed to push the built image to. This registry is called target-registry.local in the log. The target registry is a completely vanilla registry:2 registry, with a nginx:mainline container as a reverse proxy for access control.

@chanseokoh
Copy link
Member

chanseokoh commented Aug 12, 2021

@ChristianCiach did you supply -Djib.serialize=true? Things still look concurrent, and that confuses me a bit.

@ChristianCiach
Copy link
Contributor

ChristianCiach commented Aug 12, 2021

@chanseokoh Executed command was:

./gradlew --no-daemon --console=plain -Djava.util.logging.config.file=$PWD/logging.properties -Djib.serialize=true :application:jib --image target-registry.local/project/repo:latest --debug --stacktrace > jib.log  2>&1

With the logging.properties as described in https://github.com/GoogleContainerTools/jib/blob/master/docs/faq.md#how-can-i-examine-network-traffic.

The log looks reasonable to me. I can reproduce it anytime if you need me to change anything.

@ChristianCiach
Copy link
Contributor

ChristianCiach commented Aug 12, 2021

I just notice that my issue doesn't seem to be exactly the same as described in this github issue. Nowhere in my logs does it say Premature end of chunk coded message body: closing chunk expected. Instead I get Premature end of Content-Length delimited message body (expected: 2; received: 0).

But I guess it's probably caused by the same underlying issue.

@chanseokoh
Copy link
Member

chanseokoh commented Aug 12, 2021

Okay, it helped to confirm the command. I can explain why it failed. It's just that I'm trying to track down why this started to happen in the newer HTTP library version.

Jib contacts https://target-registry.local/v2/, and the server (nginx, the reverse proxy) responds with 200.

2021-08-12T17:52:31.228+0200 [INFO] [com.google.api.client.http.HttpTransport] -------------- REQUEST  --------------
GET https://target-registry.local/v2/
Accept:
Accept-Encoding: gzip
User-Agent: jib 3.1.3 jib-gradle-plugin Google-HTTP-Java-Client/1.39.2 (gzip)

...
2021-08-12T17:52:31.377+0200 [INFO] [com.google.api.client.http.HttpTransport] -------------- RESPONSE --------------
HTTP/1.1 200 OK
Date: Thu, 12 Aug 2021 15:52:31 GMT
Server: nginx
Content-Type: application/json; charset=utf-8
Content-Length: 2
Docker-Distribution-Api-Version: registry/2.0
X-Content-Type-Options: nosniff
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive

What's unusual is that, it says that the length of the body the server will return is 2 (Content-Length: 2). However, there is no content returned by the server. The Apache library tries to consume 2 bytes of the input stream when closing (to discard the bytes), but since the other end of the stream is already closed, it's reporting the error (expected: 2; received: 0).

So basically it's the fault of the server, whether it's the reverse proxy or the target registry behind it. Content-Length: 2 doesn't really make sense. I wonder what's the difference when you use 3.1.2. (I assume this issue started to happen with the latest 3.1.3?)

@ChristianCiach
Copy link
Contributor

ChristianCiach commented Aug 12, 2021

That's right, the issue only happens with 3.1.3. Everything works fine with 3.1.2.

I pretty much have come to the same conclusions as you have. Which got me thinking: Does the target-registry-server also reports Content-Length: 2 when using the previous version of Jib? Turns out, it does! But Jib 3.1.2 doesn't seem to care.

...

After I've written the previous paragraghs, I've tried the curl commands directly on the shell. Turns out, the server is right to report a Content-Length of 2, because the response body is {}.

See:

curl -v --compressed -H 'Accept: ' -H 'Accept-Encoding: gzip' -H 'User-Agent: jib 3.1.2 jib-gradle-plugin Google-HTTP-Java-Client/1.38.1 (gzip)' -- 'https://target-registry.local/v2/'
* Uses proxy env variable no_proxy == 'localhost,127.0.0.0/8,::1'
*   Trying 172.28.4.68:443...
* Connected to target-registry.local (172.28.4.68) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: O=[COMPANY INFO]
*  start date: Apr 30 07:39:00 2021 GMT
*  expire date: Apr 30 07:39:00 2022 GMT
*  subjectAltName: host "target-registry.local" matched cert's "target-registry.local"
*  issuer: CN=[COMPANY INFO]
*  SSL certificate verify ok.
> GET /v2/ HTTP/1.1
> Host: target-registry.local
> Accept-Encoding: gzip
> User-Agent: jib 3.1.2 jib-gradle-plugin Google-HTTP-Java-Client/1.38.1 (gzip)
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Date: Thu, 12 Aug 2021 18:53:34 GMT
< Server: nginx
< Content-Type: application/json; charset=utf-8
< Content-Length: 2
< Docker-Distribution-Api-Version: registry/2.0
< X-Content-Type-Options: nosniff
< 
* Connection #0 to host target-registry.local left intact
{}⏎ 

@chanseokoh
Copy link
Member

So do you see {} logged when using Jib 3.1.2?

@ChristianCiach
Copy link
Contributor

ChristianCiach commented Aug 12, 2021

I do not:

2021-08-12T20:48:31.405+0200 [INFO] [com.google.api.client.http.HttpTransport] -------------- REQUEST  --------------
GET https://target-registry.local/v2/
Accept: 
Accept-Encoding: gzip
User-Agent: jib 3.1.2 jib-gradle-plugin Google-HTTP-Java-Client/1.38.1 (gzip)

2021-08-12T20:48:31.405+0200 [INFO] [com.google.api.client.http.HttpTransport] curl -v --compressed -H 'Accept: ' -H 'Accept-Encoding: gzip' -H 'User-Agent: jib 3.1.2 jib-gradle-plugin Google-HTTP-Java-Client/1.38.1 (gzip)' -- 'https://target-registry.local/v2/'
2021-08-12T20:48:31.494+0200 [DEBUG] [jdk.event.security] X509Certificate: Alg:SHA384withRSA, Serial:5709f6b57d287739a0f3fd4b71a820ab68fc0de5, Subject:CN=target-registry.local, O=company (s-company-ems003), Issuer:CN=company (CAI s-company-ems003), Key type:RSA, Length:4096, Cert Id:-881398848, Valid from:30.04.21, 09:39, Valid until:30.04.22, 09:39
2021-08-12T20:48:31.514+0200 [DEBUG] [jdk.event.security] ValidationChain: 1932008468, -881398848
2021-08-12T20:48:31.543+0200 [DEBUG] [jdk.event.security]  TLSHandshake: target-registry.local:443, TLSv1.2, TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384, -881398848
2021-08-12T20:48:31.556+0200 [INFO] [com.google.api.client.http.HttpTransport] -------------- RESPONSE --------------
HTTP/1.1 200 OK
Date: Thu, 12 Aug 2021 18:48:31 GMT
Server: nginx
Content-Type: application/json; charset=utf-8
Content-Length: 2
Docker-Distribution-Api-Version: registry/2.0
X-Content-Type-Options: nosniff
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive

2021-08-12T20:48:31.558+0200 [DEBUG] [org.gradle.api.Task] server requires basic auth for target-registry.local/project/repo
2021-08-12T20:48:31.558+0200 [DEBUG] [org.gradle.api.Task] TIMED	Authenticating push to target-registry.local : 349.739 ms
2021-08-12T20:48:31.560+0200 [DEBUG] [org.gradle.api.Task] TIMING	Pulling base image manifest
2021-08-12T20:48:31.560+0200 [LIFECYCLE] [org.gradle.api.Task] Getting manifest for base image amazoncorretto:16...
2021-08-12T20:48:31.561+0200 [DEBUG] [org.gradle.api.Task] TIMING	trying mirrors
2021-08-12T20:48:31.561+0200 [DEBUG] [org.gradle.api.Task] mirror config: registry-1.docker.io --> dockerhub-mirror.local:5000
2021-08-12T20:48:31.561+0200 [INFO] [org.gradle.api.Task] trying mirror dockerhub-mirror.local:5000 for the base image
2021-08-12T20:48:31.563+0200 [INFO] [com.google.api.client.http.HttpTransport] -------------- REQUEST  --------------
GET https://dockerhub-mirror.local:5000/v2/library/amazoncorretto/manifests/16

@chanseokoh
Copy link
Member

chanseokoh commented Aug 12, 2021

Thanks a lot for debugging. I don't see any change in the newer Apache HTTP libraries, so I wonder if this is due to some changes in Google HTTP libraries. Using the method here, you'll be able to force the HTTP libraries versions in 3.1.2 (described as 2.8.0 dependencies in the link). Can you force the old versions and see if it works?

@chanseokoh
Copy link
Member

chanseokoh commented Aug 12, 2021

I do not:

In any case, not seeing {} is the root cause, and I think that should be fixed eventually. Maybe the reverse proxy is not working correctly?

@ChristianCiach
Copy link
Contributor

ChristianCiach commented Aug 12, 2021

Using Jib 3.1.3 and the "2.8.0" dependencies does work without issues.

In any case, not seeing {} is the root cause, and I think that should be fixed eventually. Maybe the reverse proxy is not working correctly?

I don't think this is the case. As I said: When I execute the exact same curl command that Jib logs as the REQUEST, I can see the {} just fine. So why can't jib?

Edit: The {} are also invisible when using Jib 3.1.2 and it still does work.

@chanseokoh
Copy link
Member

So why can't jib?

Ah, sorry. I mistakenly thought the curl was against the target registry and not through the proxy. What you said makes sense. Something definitely happened in the upgraded HTTP libraries. I also wonder why {} isn't logged in both 3.1.2 and 3.1.3.

@ChristianCiach
Copy link
Contributor

Works with

        classpath('com.google.http-client:google-http-client:1.39.0') { force = true }
        classpath('com.google.http-client:google-http-client-apache-v2:1.39.0') { force = true }

Does not work with

        classpath('com.google.http-client:google-http-client:1.39.1') { force = true }
        classpath('com.google.http-client:google-http-client-apache-v2:1.39.1') { force = true }

That should narrow it down...

@ChristianCiach
Copy link
Contributor

ChristianCiach commented Aug 12, 2021

I kinda suspect this one to be the culprit: googleapis/google-http-java-client@f84ed59

I have no hard proof for that, but the exception does happen when closing the stream:

 Caused by: org.apache.http.ConnectionClosedException: Premature end of Content-Length delimited message body (expected: 2; received: 0)
 	at org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:178)
 	at org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:198)
 	at org.apache.http.impl.io.ContentLengthInputStream.close(ContentLengthInputStream.java:101)
 	at org.apache.http.impl.execchain.ResponseEntityProxy.streamClosed(ResponseEntityProxy.java:142)
 	at org.apache.http.conn.EofSensorInputStream.checkClose(EofSensorInputStream.java:228)
 	at org.apache.http.conn.EofSensorInputStream.close(EofSensorInputStream.java:172)
 	at com.google.api.client.http.HttpResponse.ignore(HttpResponse.java:427)
 	at com.google.api.client.http.HttpResponse.disconnect(HttpResponse.java:441)
 	at com.google.cloud.tools.jib.http.Response.close(Response.java:96)

Sorry for the comment spam.

By the way, I really appreciate your help @chanseokoh

@chanseokoh
Copy link
Member

That looks like it! It's from ignore(), and the comment does say that it now closes the connection before closing the input stream.

I think at this point perhaps we should just revert the HTTP libraries to old versions. Maybe the versions used in Jib 2.7.0 to also resolve the perplexing #3058.

Really appreciate your help with debugging this.

@ChristianCiach
Copy link
Contributor

ChristianCiach commented Aug 12, 2021

Just to make absolutely sure, I've just compiled my own version of google-http-client 1.39.2 with googleapis/google-http-java-client@f84ed59 reverted. I can confirm that this commit is the cause of the issue.

@breun
Copy link

breun commented Aug 13, 2021

I also have this issue with the Jib Maven plugin after upgrading from 3.1.2 to 3.1.3, but it seems like you have already found what the issue is.

@chanseokoh
Copy link
Member

@ChristianCiach what's the target registry? And just asking, is there an easy way for me to set it up?

@ChristianCiach
Copy link
Contributor

ChristianCiach commented Aug 13, 2021

@chanseokoh These are some tough questions, because I am not the one who set up the registry. Fortunately I do have SSH access to the server, so let's see what I can found out.

Surprisingly (and sorry for unintentionally lying about this yesterday) the registry itself and nginx are not deployed as containers, but as native packages:

$ /usr/bin/registry --version
/usr/bin/registry github.com/docker/distribution v2.6.2+unknown

$ /usr/sbin/nginx -version
nginx version: nginx/1.20.1

$ httpd -version
Server version: Apache/2.4.6 (CentOS)
Server built:   Nov 16 2020 16:18:20

There are actually two reverse proxies (apache and nginx), because apparently you can never have enough of those. The setup is httpd -> nginx -> registry.

httpd.conf:

Listen 443 https

SSLPassPhraseDialog     exec:/usr/libexec/httpd-ssl-pass-dialog
SSLSessionCache         shmcb:/run/httpd/sslcache(512000)
SSLSessionCacheTimeout  300
SSLRandomSeed           startup file:/dev/urandom  256
SSLRandomSeed           connect builtin
SSLCryptoDevice         builtin

<VirtualHost _default_:443>
    ErrorLog logs/ssl_error_log
    TransferLog logs/ssl_access_log
    LogLevel warn

    SSLEngine on
    # SSLProtocol TLSv1.2 TLSv1.3
    SSLProtocol TLSv1.2
    SSLCipherSuite ECDH+AESGCM:ECDH+CHACHA20:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:!aNULL:!MD5:!DSS

    SSLCertificateFile      /etc/crypt/ansible/default/domain.pem
    SSLCertificateKeyFile   /etc/crypt/ansible/default/domain.key
    SSLCertificateChainFile /etc/crypt/ansible/default/domain-intermediate.pem

    CustomLog logs/ssl_request_log \
            "%t %h %{SSL_PROTOCOL}x %{SSL_CIPHER}x \"%r\" %b"

    # Forward everything to nginx
    AllowEncodedSlashes NoDecode
    ProxyPass        / http://localhost:4887/ nocanon
    ProxyPassReverse / http://localhost:4887/
    ProxyTimeout     900
</VirtualHost>

nginx.conf:

# Set a variable to help us decide if we need to add the
# 'Docker-Distribution-Api-Version' header.
# The registry always sets this header.
# In the case of nginx performing auth, the header is unset
# since nginx is auth-ing before proxying.
map $upstream_http_docker_distribution_api_version $docker_distribution_api_version {
    '' 'registry/2.0';
}

server {
    listen          4887;  # Dial "http" ...!
    server_name     target-registry.local;

    client_max_body_size 1G;

    ##
    # Docker registry
    ##
    # https://docs.docker.com/registry/recipes/nginx/
    location /v2 {
        if ($http_user_agent ~ "^(docker\/1\.(3|4|5(?!\.[0-9]-dev))|Go ).*$" ) {
            return 404;
        }

        client_max_body_size 4G;  # 4096M
        # required to avoid HTTP 411: see Issue #1486 (https://github.com/moby/moby/issues/1486)
        chunked_transfer_encoding on;

        add_header 'Docker-Distribution-Api-Version' $docker_distribution_api_version;

        proxy_pass http://127.0.0.1:5000;
        # proxy_set_header    Host                $http_host;   # required for docker client's sake
        proxy_set_header    Host                $host;   # required for docker client's sake
        proxy_set_header    X-Real-IP           $remote_addr; # pass on real client's IP
        proxy_set_header    X-Forwarded-For     $proxy_add_x_forwarded_for;
        # Enforce https (see https://github.com/docker/distribution/issues/2225):
        proxy_set_header    X-Forwarded-Proto   https;
        proxy_read_timeout                      900;

        limit_except GET {
            allow [SOME INTERNAL IPS HERE]
            deny all;
        }
    }
}

/etc/docker-distribution/registry/config.yml:

version: 0.1
log:
  fields:
    service: registry
storage:
    cache:
        layerinfo: inmemory
    filesystem:
        rootdirectory: /var/lib/registry
http:
    addr: :5000

That being said, I can't see anything too exotic or surprising here. When viewed from the outside (using curl) everything looks as expected.

@ChristianCiach
Copy link
Contributor

Maybe this is the crucial part to be able to reproduce it?:

        # required to avoid HTTP 411: see Issue #1486 (https://github.com/moby/moby/issues/1486)
        chunked_transfer_encoding on;

@chanseokoh
Copy link
Member

chanseokoh commented Aug 13, 2021

@ChristianCiach It seems infeasible for me to reconstruct your environment :( Instead, can you try these on your machine?

git clone https://github.com/chanseokoh/docker-hub-upload-speed-test
cd docker-hub-upload-speed-test
git checkout jib-i3409-debug  # make sure you checkout this branch
mvn compile
# The first program argument to the Main is the URL to test.
mvn -U exec:java -Dexec.mainClass=GoogleHttpClientMain -Dexec.args="https://target-registry.local/v2/"

All it does is to make a GET request using Google HTTP 1.39.2 like how Jib does.

    HttpResponse res = reqFactory.buildGetRequest(new GenericUrl(args[0])).execute();
    res.disconnect(); // Will this fail as in Jib 3.1.3?
    System.out.println("Didn't fail :(");

@ChristianCiach
Copy link
Contributor

There you go:

> mvn -U exec:java -Dexec.mainClass=GoogleHttpClientMain -Dexec.args="https://target-registry.local/v2/"
[INFO] Scanning for projects...
[INFO] 
[INFO] -----------------< GoogleHttpClient:GoogleHttpClient >------------------
[INFO] Building GoogleHttpClient 0.0.1-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
[INFO] 
[INFO] --- exec-maven-plugin:1.6.0:java (default-cli) @ GoogleHttpClient ---
[WARNING] 
org.apache.http.ConnectionClosedException: Premature end of Content-Length delimited message body (expected: 2; received: 0)
    at org.apache.http.impl.io.ContentLengthInputStream.read (ContentLengthInputStream.java:178)
    at org.apache.http.impl.io.ContentLengthInputStream.read (ContentLengthInputStream.java:198)
    at org.apache.http.impl.io.ContentLengthInputStream.close (ContentLengthInputStream.java:101)
    at org.apache.http.impl.execchain.ResponseEntityProxy.streamClosed (ResponseEntityProxy.java:142)
    at org.apache.http.conn.EofSensorInputStream.checkClose (EofSensorInputStream.java:228)
    at org.apache.http.conn.EofSensorInputStream.close (EofSensorInputStream.java:172)
    at com.google.api.client.http.HttpResponse.ignore (HttpResponse.java:427)
    at com.google.api.client.http.HttpResponse.disconnect (HttpResponse.java:441)
    at GoogleHttpClientMain.main (GoogleHttpClientMain.java:36)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:566)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:282)
    at java.lang.Thread.run (Thread.java:829)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  0.600 s
[INFO] Finished at: 2021-08-13T21:12:58+02:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.6.0:java (default-cli) on project GoogleHttpClient: An exception occured while executing the Java class. Premature end of Content-Length delimited message body (expected: 2; received: 0) -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

I do have an idea how you should be able to recreate the issue, just by using a simple docker-compose setup with two containers. Give me a few minutes to try that...

@chanseokoh
Copy link
Member

chanseokoh commented Aug 13, 2021

Thanks for trying! Much appreciated. I was also trying to reproduce this with a simple HTTP server locally, but the problem so far is that read() in the Apache library doesn't return -1 (which is a prerequisite for this issue); instead, the read() throws SocketException: socket closed immediately (then it is consumed and there's no error in the end).

At least it's good to confirm this is reproducible with simple reqFactory.buildGetRequest(new GenericUrl(url).execute().disconnect()), so it's not that Jib is doing something wrong. But it's hard for me to tell whether (1) this is really a problem of the recent change in disconnect() of the Google HTTP client; or (2) your server isn't really conforming to some spec (that curl is tolerant about) and it's right for the Apache client to report this error.

@ChristianCiach
Copy link
Contributor

By the way, the new retry mechanism is annoying. I am using allowInsecureRegistries=true for local testing, but it takes multiple minutes for Jib to finally switch from https to plain http:

Base image 'amazoncorretto:16' does not use a specific image digest - build may not be reproducible
GET https://localhost:5000/v2/ failed and will be retried
GET https://localhost:5000/v2/ failed and will be retried
GET https://localhost:5000/v2/ failed and will be retried
GET https://localhost:5000/v2/ failed and will be retried
GET https://localhost:5000/v2/ failed and will be retried
GET https://localhost:5000/v2/ failed and will be retried
GET https://localhost:5000/v2/ failed and will be retried
GET https://localhost:5000/v2/ failed and will be retried
GET https://localhost:5000/v2/ failed and will be retried
GET https://localhost:5000/v2/ failed and will be retried
GET https://localhost:5000/v2/ failed and will NOT be retried
Cannot verify server at https://localhost:5000/v2/. Attempting again with no TLS verification.
GET https://localhost:5000/v2/ failed and will be retried
GET https://localhost:5000/v2/ failed and will be retried
GET https://localhost:5000/v2/ failed and will be retried
GET https://localhost:5000/v2/ failed and will be retried
GET https://localhost:5000/v2/ failed and will be retried
GET https://localhost:5000/v2/ failed and will be retried
GET https://localhost:5000/v2/ failed and will be retried
GET https://localhost:5000/v2/ failed and will be retried
GET https://localhost:5000/v2/ failed and will be retried
GET https://localhost:5000/v2/ failed and will be retried
GET https://localhost:5000/v2/ failed and will NOT be retried
Failed to connect to https://localhost:5000/v2/ over HTTPS. Attempting again with HTTP.

Sorry for being off-topic, but this is the reason why it takes some time to create a reproducable example for you.

or (2) your server isn't really conforming to some spec (that curl is tolerant about) and it's right for the Apache client to report this error.

I am still sceptical about that. Even with verbose output of curl I don't see anything that seems wrong with the server response.

@chanseokoh
Copy link
Member

chanseokoh commented Aug 13, 2021

Ah, I failed to anticipate the problem of retrying when allowInsecureRegistries=true. Need to think about how to resolve it. (#3422)

About retrying, maybe you can use Jib 3.1.2 while forcing Google HTTP 1.39.2?

@chanseokoh
Copy link
Member

@ChristianCiach
Copy link
Contributor

ChristianCiach commented Aug 13, 2021

Incredible... I can reproduce the issue, but only with ancient Apache httpd versions (up to 2.4.18, works starting with 2.4.20 (there doesn't seem to be a 2.4.19)) and only when using SSL (dummy certs work fine to reproduce the issue).

It has nothing to do with the nginx or the registry. Even when just serving files, Google-HttpClient requesting https://localhost:8443 running an ancient Apache httpd will trigger the issue:

> mvn -U exec:java -Dexec.mainClass=GoogleHttpClientMain -Dexec.args="https://localhost:8443/"
[INFO] Scanning for projects...
[INFO] 
[INFO] -----------------< GoogleHttpClient:GoogleHttpClient >------------------
[INFO] Building GoogleHttpClient 0.0.1-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
[INFO] 
[INFO] --- exec-maven-plugin:1.6.0:java (default-cli) @ GoogleHttpClient ---
[WARNING] 
org.apache.http.ConnectionClosedException: Premature end of Content-Length delimited message body (expected: 45; received: 0)
    at org.apache.http.impl.io.ContentLengthInputStream.read (ContentLengthInputStream.java:178)
    at org.apache.http.impl.io.ContentLengthInputStream.read (ContentLengthInputStream.java:198)
    at org.apache.http.impl.io.ContentLengthInputStream.close (ContentLengthInputStream.java:101)
    at org.apache.http.conn.BasicManagedEntity.streamClosed (BasicManagedEntity.java:166)
    at org.apache.http.conn.EofSensorInputStream.checkClose (EofSensorInputStream.java:228)
    at org.apache.http.conn.EofSensorInputStream.close (EofSensorInputStream.java:172)
    at com.google.api.client.http.HttpResponse.ignore (HttpResponse.java:427)
    at com.google.api.client.http.HttpResponse.disconnect (HttpResponse.java:441)
    at GoogleHttpClientMain.main (GoogleHttpClientMain.java:43)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:566)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:282)
    at java.lang.Thread.run (Thread.java:829)

Of course, I had to hack your client from https://github.com/chanseokoh/docker-hub-upload-speed-test to ignore the invalid ssl cert:

  private static HttpRequestFactory reqFactory;
  static {
    try {
      reqFactory = new ApacheHttpTransport.Builder().doNotValidateCertificate().build().createRequestFactory();
    } catch(Exception e) {
      throw new RuntimeException(e);
    }
  }

I don't know how to interpret this. The issue only appears with ancient HTTP servers with SSL enabled, but this doesn't necessarily mean that the server is at fault.

You should be able to easily reproduce this example following these steps:

  1. Extract the default httpd.conf from an ancient httpd container:
docker run --rm httpd:2.4.18 cat /usr/local/apache2/conf/httpd.conf > httpd.conf
  1. Edit the config to include these lines:
Listen 443
LoadModule ssl_module modules/mod_ssl.so

<VirtualHost _default_:443>
    SSLEngine on
    SSLProtocol TLSv1.2
    SSLCipherSuite ECDH+AESGCM:ECDH+CHACHA20:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:!aNULL:!MD5:!DSS

    SSLCertificateFile      /server.crt
    SSLCertificateKeyFile   /server.key
</VirtualHost>
  1. Start the container:
docker run --rm -p 8443:443 -v $PWD/server.crt:/server.crt -v $PWD/server.key:/server.key -v $PWD/httpd.conf:/usr/local/apache2/conf/httpd.conf httpd:2.4.18
  1. Request https://localhost:8443 using the current version Google HttpClient as shown above.

@ChristianCiach
Copy link
Contributor

ChristianCiach commented Aug 13, 2021

I want to mention again that I've never seen the error message Premature end of chunk coded message body from the original description of this Github issue. While I think that these issues are closely related, the other error message probably needs a different server setup to reproduce.

@ChristianCiach
Copy link
Contributor

ChristianCiach commented Aug 13, 2021

One more comment:

The issue can also be seen by just connecting your client from https://github.com/chanseokoh/docker-hub-upload-speed-test (branch jib-i3409-debug) to https://eu.gcr.io/ (as shown in the original issue description):

mvn -U exec:java -Dexec.mainClass=GoogleHttpClientMain -Dexec.args="https://eu.gcr.io/"
[INFO] Scanning for projects...
[INFO] 
[INFO] -----------------< GoogleHttpClient:GoogleHttpClient >------------------
[INFO] Building GoogleHttpClient 0.0.1-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
[INFO] 
[INFO] --- exec-maven-plugin:1.6.0:java (default-cli) @ GoogleHttpClient ---
[WARNING] 
org.apache.http.TruncatedChunkException: Truncated chunk (expected size: 2.735; actual size: 295)
    at org.apache.http.impl.io.ChunkedInputStream.read (ChunkedInputStream.java:197)
    at org.apache.http.impl.io.ChunkedInputStream.read (ChunkedInputStream.java:210)
    at org.apache.http.impl.io.ChunkedInputStream.close (ChunkedInputStream.java:312)
    at org.apache.http.conn.BasicManagedEntity.streamClosed (BasicManagedEntity.java:166)
    at org.apache.http.conn.EofSensorInputStream.checkClose (EofSensorInputStream.java:228)
    at org.apache.http.conn.EofSensorInputStream.close (EofSensorInputStream.java:172)
    at com.google.api.client.http.HttpResponse.ignore (HttpResponse.java:427)
    at com.google.api.client.http.HttpResponse.disconnect (HttpResponse.java:441)
    at GoogleHttpClientMain.main (GoogleHttpClientMain.java:43)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:566)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:282)
    at java.lang.Thread.run (Thread.java:829)

Have a nice weekend!!

Edit: Funny, the stacktrace is different.

@chanseokoh
Copy link
Member

@ChristianCiach that is an incredible detective work!

The issue can also be seen by just connecting... to https://eu.gcr.io/:

I was really surprised to hear this, because I did try gcr.io and didn't hit the original issue. Turns out, running the code inside my IDE doesn't trigger it, but it does when running from the command-line! Now it became apparent that this is a critical issue that the Google HTTP library should fix.

@chanseokoh
Copy link
Member

Downgraded the Google HTTP libraries versions to 1.34.0 in #3415 to avoid this issue. Closing.

@chanseokoh
Copy link
Member

chanseokoh commented Aug 17, 2021

@rmaclean-ee @ChristianCiach @steven-sheehy @danilofnmoreira @abithgs @creekpld @yuri-sergiichuk @nhmarujo
@ikorolev93 @olivierchedru @osigida

We just released Jib plugins 3.1.4 which downgraded Google HTTP libraries to 1.34.0. I expect the new release will avoid the issue, unless Gradle overrides the versions. If Gradle overrides them (which often happens unfortunately), one option is to force versions.

@isptech
Copy link

isptech commented Aug 20, 2021

I experienced a similar issue to this with Gradle -- even after downgrading jib to an earlier version, the output from --debug mode still showed Jib using the newer, 1.39+ version of Google HTTP Client libraries.

One of our other dependencies in our Java app had upgraded to the newer Google HTTP library, and this was overwriting what Jib/Gradle were using. Adjusting our dependency there resolved the issue.

@chanseokoh
Copy link
Member

@isptech yeah, in Gradle we often see dependencies are easily overridden with newer versions. You said you resolved the issue, but I just wanted to let you know that you can always force certain versions as a last resort.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants